Re: Box & whisker plots with skewed distributions?
- From: Bruce Weaver <bweaver@xxxxxxxxxxxx>
- Date: Thu, 03 Apr 2008 06:55:29 -0400
Jeff Miller wrote:
I'm just starting to use box & whisker plots
and have a (possibly very naive) question
about using them with skewed distributions.
The box represents skew in the middle of the
distribution nicely by showing the quartile
locations q1, q2, and q3. But the whisker
lengths are defined in terms of IQR=q3-q1, so
the whiskers are the same length for both tails.
This means that the skew in the tails (i.e.
below q1 or above q3) is not represented,
as far as I can see.
I wonder why the same value of IQR is used
to calculate the whiskers at both ends of the distribution.
Instead, I'd think the whisker should be longer on the side
with the longer tail. It would be easy enough to
define the whisker lengths for the two tails
separately, for example in terms of 2*(q2-q1)
and 2*(q3-q2). I'm just wondering why that
isn't routinely done. Have I overlooked
something obvious?
Thanks for your comments,
(By the way, with the distributions I am examining,
it would be very unhelpful to transform the scores
to eliminate the skew, since the skew itself is part
of what's interesting about the dataset.)
Hi Jeff. The following (from the Wikipedia page on box-plots) expands a bit on what Rich said.
* Any data observation which lies more than 1.5*(IQR) lower than the first quartile or 1.5*(IQR) higher than the third quartile is considered an outlier. Indicate where the smallest value that is not an outlier is by connecting it to the box with a horizontal line or "whisker". Optionally, also mark the position of this value more clearly using a small vertical line. Likewise, connect the largest value that is not an outlier to the box by a "whisker" (and optionally mark it with another small vertical line).
* Indicate outliers by open and closed dots. "Extreme" outliers, or those which lie more than three times the IQR to the left and right from the first and third quartiles respectively, are indicated by the presence of an open dot. "Mild" outliers - that is, those observations which lie more than 1.5 times the IQR from the first and third quartile but are not also extreme outliers are indicated by the presence of a closed dot. (Sometimes no distinction is made between "mild" and "extreme" outliers.)
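So 1.5*IQR is the maximum length of a whisker, not a fixed length: each whisker stops at the most extreme observation that is not an outlier, so the two whiskers can differ and either can be shorter than 1.5*IQR. In your case, the end with the long tail will likely have a whisker close to the full 1.5*IQR PLUS some outliers and possibly extreme outliers.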
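Here is a minimal Python sketch of the rule quoted above, in case it helps to see it worked through. The sample values are made up, the function name tukey_box_stats is just something I chose for illustration, and the quartiles use the "inclusive" method of the standard statistics module (packages differ in how they compute quartiles, so your software may give slightly different numbers).

```python
import statistics


def tukey_box_stats(data, k=1.5, k_extreme=3.0):
    """Quartiles, whisker ends and outliers under the 1.5*IQR rule.

    Hypothetical helper for illustration; quartile conventions vary
    between packages, so treat the exact numbers as an assumption.
    """
    xs = sorted(data)
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1

    # Inner fences (1.5*IQR) and outer fences (3*IQR) beyond the box.
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    lo_far, hi_far = q1 - k_extreme * iqr, q3 + k_extreme * iqr

    # Each whisker ends at the most extreme observation inside the fences,
    # so it can be shorter than k*IQR and the two sides can differ.
    inside = [x for x in xs if lo_fence <= x <= hi_fence]
    whiskers = (inside[0], inside[-1])

    extreme = [x for x in xs if x < lo_far or x > hi_far]
    mild = [x for x in xs
            if (x < lo_fence or x > hi_fence) and x not in extreme]

    return {"q1": q1, "q2": q2, "q3": q3, "iqr": iqr,
            "whiskers": whiskers, "mild": mild, "extreme": extreme}


# Made-up right-skewed sample.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 9, 12, 20]
stats = tukey_box_stats(sample)

print("quartiles:", stats["q1"], stats["q2"], stats["q3"])
print("lower whisker length:", stats["q1"] - stats["whiskers"][0])
print("upper whisker length:", stats["whiskers"][1] - stats["q3"])
print("mild outliers:", stats["mild"])
print("extreme outliers:", stats["extreme"])
```

With this sample the lower whisker is 2 units long and the upper one is 3, both under the 4.5 maximum, and the long tail shows up as one mild outlier (12) and one extreme outlier (20).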
--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."