How can I best get box plots on logarithmic scales?

The purpose of this FAQ is to point out a potential pitfall with graph box and graph hbox and to explain a way around it. Sometimes users fire up a box plot in Stata, realize that a logarithmic scale would be better for their data, and then ask for that by yscale(log) (with either graph box or graph hbox). (From now on examples will be just in terms of graph box, as the principle is the same for graph hbox.)

Although Stata will let you do this, you should be aware of what option yscale(log) does. yscale(log) takes the graph you would have gotten otherwise and warps the scale logarithmically. What it does not do is recalculate summaries on the log scale, which, with a box plot, is what you might want. Having yscale(log) have a special meaning for box plots would be bad software design, whatever the statistical arguments, so if the pitfall to be discussed here matters to you, then you will need to work your way around it.

In what follows, I assume that each variable to be shown on a box plot is all positive, because logarithmic transformation is not defined otherwise. As you may have noticed, if you ask graph to use yscale(log) when zero or negative values are present, it just gives you a ridiculous graph, rather like the kind of teacher who will not say, “That was a stupid thing to ask,” but will just give you a funny look that clearly implies the same.

Methods for box plots differ by book and by program. Frigge, Hoaglin, and Iglewicz (1989) cataloged several variants, and no doubt a careful search would reveal some they missed and others that have arisen since. Stata follows what Tukey (1977) settled on after trying various possibilities. The most important detail is that a data point is plotted separately if it lies more than 1.5 times the interquartile range away from the nearer quartile.

This calculation depends on the scale being used: if you redo it on a logarithmic scale, you will often get a different decision. Points declared as deserving separate plotting on the original scale may not be so declared on a logarithmic scale. Thus some high values plotted separately may jump back inside the main box-and-whiskers cluster. Conversely, some low values may jump out of that cluster and now be declared as deserving separate plotting. The reclassifications reflect that the interquartile range of logarithms is not, in general, the logarithm of the interquartile range.

The same issue can affect, although usually not as much, the calculated median and quartiles. Each can be based on interpolation between data points, and so it is not always true that, say, the median of the logarithms is exactly the same as the logarithm of the median. Unless your dataset is strange and small, you would not usually be troubled by the difference, but it can arise. Only for the minimum and maximum is there never a small problem.

The way to do it properly is thus to take logarithms first. With the logarithms held in a new variable (here log10price) and axis labels in the original units held in a local macro (here labels), the plot is

    graph box log10price, ylabel(`labels', angle(h))

It may take a few iterations to get it right, but simply reissue the local command to overwrite it, as local macros are expendable.

The same issue with box plots and change of scale arises with any nonlinear transformation. The calculation of median and quartiles and the selection of data points for separate plotting need to be done afresh on any new scale.

For example, psychologists and others work with times taken by test subjects to complete tasks. The distributions are often highly skewed, and subjects who do not complete should be assigned missing values. Because the reciprocal of time is a speed, missing times can be recoded as zero speeds. You would need to do the transformation yourself and possibly fix the axis labels.

For broader discussion of box plots within Stata, including how to create your own variants on the default design, see Cox (2009, 2013).

References

Cox, N. J. 2009. Speaking Stata: Creating and varying box plots.
Cox, N. J. 2013. Speaking Stata: Creating and varying box plots: Correction.
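As a worked illustration of the log-first approach this FAQ recommends, here is a minimal sketch. It assumes the auto dataset shipped with Stata and its price variable. The variable name log10price and the local macro labels follow the command quoted in the FAQ, but the tick values (3000, 5000, 10000, 15000), the helper macro where, the graph names, and the axis title are illustrative choices of mine, not necessarily the author's.

```stata
* Sketch only: box plot whose summaries are calculated on the log scale.
* Assumes the auto dataset shipped with Stata; tick values are illustrative.
sysuse auto, clear

* The pitfall: yscale(log) only warps the axis. Median, quartiles, and the
* choice of points plotted separately still come from price on its raw scale.
graph box price, yscale(log) name(warped, replace)

* Take logarithms first, so that every summary is calculated on the log scale.
generate log10price = log10(price)

* Build axis labels: positions are log10 values, text is in original units.
* (where is a hypothetical helper macro holding each tick position.)
local labels
foreach p of numlist 3000 5000 10000 15000 {
    local where = log10(`p')
    local labels `labels' `where' "`p'"
}

* Box plot of the logarithms, labeled in the original units.
graph box log10price, ylabel(`labels', angle(h)) ytitle("Price") ///
    name(logfirst, replace)
```

Comparing the two named graphs (graph display warped, then graph display logfirst) should show whether any points change status between being plotted separately and falling inside the whiskers. If the labels look wrong, rerun the local labels loop with different values and redraw; the macro is simply overwritten.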