Re: standard deviation and N-1
From: Rob Johnson (rob_at_trash.whim.org)
Date: 07/08/04
- Next message: Oscar Lanzi III: "Re: Right-handedness of a 3d cross-product?"
- Previous message: Michael Jørgensen: "Re: Is there more symmetry to this function?"
- Maybe in reply to: Victoria Florsheim: "standard deviation and N-1"
- Next in thread: Herman Rubin: "Re: standard deviation and N-1"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 8 Jul 2004 11:40:01 +0000 (UTC)
In article <Pine.GSO.4.05.10407072147070.8955-100000@hercules.acsu.buffalo.edu>,
Victoria Florsheim <vf2@buffalo.edu> wrote:
>In high school, I learned that the formula for standard deviation has n in
>the denominator, but in college the book has N-1 in the denominator. What
>is the reason for this?
>
>So far, I found this in my book (By Yates, Moore, McCabe):
>Why do we average by dividing by n-1 rather than n? Because the sum of
>deviations is always zero, the last deviation can be found once we know
>the other n-1. So we are not averaging n unrelated numbers. Only n-1 of
>the squared deviations can vary freely, and we average by dividing the
>total by n-1. The n-1 is called the degrees of freedom of the variance or
>standard deviation.
>
>
>I sort of understand that, but could someone explain in simpler terms and
>expand on that? I'm still a little puzzled as to why n-1.
The point is that the sample mean, m_s, is not the distribution mean,
m_d. Suppose the distribution variance is v_d. n m_s is the sum of n
variates (the sample). Recall that the mean of a sum of variates is the
sum of their means and, when the variates are independent, the variance
of the sum is the sum of their variances. That
is, the mean of the sum of the sample is n m_d and the variance of the
sum of the sample is n v_d. In other words,
    E[ (n m_s - n m_d)^2 ] = n v_d    [1]
or equivalently,
    E[ (m_s - m_d)^2 ] = \frac{1}{n} v_d    [2]
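As a rough numerical check of [2], here is a minimal sketch that draws many samples of size n and averages (m_s - m_d)^2. The function name, the Gaussian distribution, the trial count, and the seed are all assumptions chosen for illustration; they are not part of the argument above.

```python
import random


def mean_squared_error_of_sample_mean(n, m_d=0.0, v_d=1.0, trials=20000, seed=0):
    """Estimate E[(m_s - m_d)^2] by simulation (hypothetical helper).

    Assumes independent Gaussian variates with mean m_d and variance v_d.
    """
    rng = random.Random(seed)
    sigma = v_d ** 0.5
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(m_d, sigma) for _ in range(n)]
        m_s = sum(sample) / n          # sample mean
        total += (m_s - m_d) ** 2      # squared error of the sample mean
    return total / trials


if __name__ == "__main__":
    n = 5
    estimate = mean_squared_error_of_sample_mean(n)
    # Equation [2] predicts v_d / n for this quantity.
    print("simulated:", estimate, "predicted:", 1.0 / n)
```

With enough trials the simulated value should settle near v_d / n, up to sampling noise.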
Write the distribution variance as
    v_d = E[ \frac{1}{n} \sum_{k=1}^{n} (x_k - m_d)^2 ]

        = E[ \frac{1}{n} \sum_{k=1}^{n} ( (x_k - m_s)^2 + 2 (x_k - m_s)(m_s - m_d) + (m_s - m_d)^2 ) ]

        = E[ \frac{1}{n} \sum_{k=1}^{n} (x_k - m_s)^2 ] + \frac{1}{n} v_d

        = E[ v_s ] + \frac{1}{n} v_d    [3]
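where v_s is the sample variance (computed with divisor n). The cross
term drops out because the deviations x_k - m_s sum to zero, and the
last term contributes v_d/n by [2]. Solving [3] for v_d, we get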
    v_d = \frac{n}{n-1} E[ v_s ]    [4]
This is why, to estimate the distribution variance, we multiply the
sample variance (computed with divisor n) by n/(n-1). The factors of n
cancel, so the net effect is to divide the sum of squared deviations by
n-1 instead of n.
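The conclusion can also be checked by simulation. The sketch below averages the divide-by-n and divide-by-(n-1) variances over many samples. The function name, the Gaussian distribution, and the parameter values are assumptions made for illustration only.

```python
import random


def average_sample_variances(n, v_d=4.0, trials=20000, seed=0):
    """Return (average of SS/n, average of SS/(n-1)) over many samples.

    Hypothetical helper; assumes independent Gaussian variates with
    mean 0 and variance v_d. SS is the sum of squared deviations from
    the sample mean m_s.
    """
    rng = random.Random(seed)
    sigma = v_d ** 0.5
    sum_div_n = 0.0
    sum_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        m_s = sum(sample) / n
        ss = sum((x - m_s) ** 2 for x in sample)
        sum_div_n += ss / n                # v_s as used in [3]
        sum_div_n_minus_1 += ss / (n - 1)  # v_s rescaled by n/(n-1), as in [4]
    return sum_div_n / trials, sum_div_n_minus_1 / trials


if __name__ == "__main__":
    n, v_d = 5, 4.0
    div_n, div_n_minus_1 = average_sample_variances(n, v_d)
    print("divide by n:   ", div_n, "expected about", (n - 1) / n * v_d)
    print("divide by n-1: ", div_n_minus_1, "expected about", v_d)
```

Under these assumptions the divide-by-n average should come out low by roughly the factor (n-1)/n, while the divide-by-(n-1) average should sit near v_d.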
Rob Johnson <rob@trash.whim.org>
take out the trash before replying