Re: std deviation and median





health inc. wrote:
> Good morning,
> when SPSS computes the standard deviation, is it the standard deviation about
> the mean or about the median?
> I have a median = 3000, mean = 6000 and std. dev. = 12000. I want to
> delete some extreme values (+/- 2 std. deviations). Does that mean I should
> delete all values > 24000 (12000 x 2), or by the median > 27000 (3000 + 24000),
> or by the mean > 30000 (6000 + 24000)? Which should I use?
> Thank you very much.

Yikes! Before someone tells you otherwise, first of all: DELETING
any observation without ascertaining that it's an error that should be
deleted is what I have called a "statistical crime". Some have
argued that while it's an improper thing to do, it may or may not
qualify to be called a "crime". :-)

In your case, contemplating deleting values for the sole reason that
they are outside 2 std deviations is definitely a pre-meditated
crime!

***** DON'T DO IT *****

regardless of how a standard deviation or its more robust equivalents
are defined.

The standard deviation s is computed around the mean xbar, so +-
two std deviations is xbar +- 2 s.
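
To make the arithmetic concrete, here is a minimal Python sketch (not SPSS syntax) of the xbar +- 2 s band. The helper name two_sd_band is made up for illustration; the numbers plugged in at the end are the summary figures from the question. It only shows where the band lies and is not a rule for deleting anything.

```python
import statistics

def two_sd_band(values, k=2.0):
    """Return (xbar - k*s, xbar + k*s); hypothetical helper for illustration."""
    xbar = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, computed around the mean
    return xbar - k * s, xbar + k * s

# Summary figures quoted in the question: mean 6000, std. dev. 12000.
xbar, s = 6000, 12000
lower, upper = xbar - 2 * s, xbar + 2 * s
print(lower, upper)  # -18000 30000
```

So of the three candidates in the question, only 30000 corresponds to "mean plus 2 std deviations"; the lower end of the band is -18000.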

Some use the sample median instead of xbar as the center for a measure
of dispersion, but that's an entirely different matter.
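
As one example of such a median-based measure (an assumption on my part about which one is meant; there are several), the median absolute deviation can be sketched in Python as below. The function name mad and the data values are hypothetical and chosen only for illustration.

```python
import statistics

def mad(values):
    """Median absolute deviation about the sample median (unscaled)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Hypothetical, heavily skewed data; not taken from the original question.
data = [1000, 2000, 3000, 4000, 50000]
print(statistics.median(data), mad(data))  # 3000 1000
print(statistics.stdev(data))              # far larger, driven by the one big value
```

The single large value inflates the standard deviation but barely moves the median-based figure, which is why the two answer different questions.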

If you have one or two EXTREME outliers (even a value outside 3 or 4
std dev MAY not qualify as an "extreme" outlier), you may see
what ACTUAL effect it has (or they have) by deleting them
TEMPORARILY and redoing the same analysis -- that's the idea
behind "regression diagnostics" of the ACTUAL effect of outlying
observations. But such outliers must not be casually deleted.

Even if there's sufficient justification (on grounds other
than being far from the sample mean) to delete it, it's always
a good idea to present BOTH sets of results, with and without
the deleted observation.
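
A rough Python sketch of reporting both sets of results side by side follows. The function names, the data, and the choice of which value counts as "suspect" are all hypothetical; in practice the suspects would be picked on substantive grounds, not by distance from the mean.

```python
import statistics

def summarize(values):
    """Basic summary statistics for one version of the data."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }

def with_and_without(values, suspects):
    """Summaries for the full data and for the data minus the suspect values."""
    kept = [v for v in values if v not in suspects]
    return {"all": summarize(values), "excluding_suspects": summarize(kept)}

# Hypothetical data; the original list is left untouched.
data = [1000, 2000, 3000, 4000, 50000]
report = with_and_without(data, suspects={50000})
for label, stats in report.items():
    print(label, stats)
```

Nothing is removed from the data itself, so both versions stay available for the write-up.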

Deleting observations because they don't fit a certain assumed
probability model is one of the MOST FREQUENT malpractices of
those dealing with statistical data.

-- Bob.



