Re: Question of merging of variance
- From: "Ray Koopman" <koopman@xxxxxx>
- Date: 15 Mar 2007 00:54:54 -0700
Ole Dahl Rasmussen wrote:
On Mar 14, 7:31 pm, "Ray Koopman" <koop...@xxxxxx> wrote:
Ole Dahl Rasmussen wrote:
Dear group
I have a question which I thought was simple, but which still
puzzles me.
In short, I wish to compute the total variance for two separate groups
from the same sample, knowing only their sizes, means and individual
variances. The groups are different in size, and thus I need some
kind of weighting.
I have found the description below on Wikipedia, but I am not sure
whether it holds, and it does not mention a technique for weighting.
Looking forward to hearing your thoughts.
Kind Regards,
Ole Dahl Rasmussen
- - -
Suppose that the observations can be partitioned into subgroups
according to some second variable. Then the variance of the total
group is equal to the mean of the variances of the subgroups plus the
variance of the means of the subgroups. This property is known as
variance decomposition or the law of total variance and plays an
important role in the analysis of variance. For example, suppose that
a group consists of a subgroup of men and an equally large subgroup of
women. Suppose that the men have a mean body length of 180 and that
the variance of their lengths is 100. Suppose that the women have a
mean length of 160 and that the variance of their lengths is 50. Then
the mean of the variances is (100 + 50) / 2 = 75; the variance of the
means is the variance of 180, 160 which is 100. Then, for the total
group of men and women combined, the variance of the body lengths will
be 75 + 100 = 175.
In a more general case, if the subgroups have unequal sizes, then they
must be weighted proportionally to their size in the computations of
the means and variances. The formula is also valid with more than two
groups, and even if the grouping variable is continuous.
The formula has the consequence that the variance in the total group
cannot be smaller than the mean of the variances in the subgroups. In
general, if you combine subgroups with different means, then the
variance will become larger. In the above example, when the subgroups
are analyzed separately, then the variance is influenced only by the
man-man differences and the woman-woman differences. If the two groups
are combined, however, then the man-woman differences enter into the
variance also.
If you have k groups whose sizes, means, and variances are
n_i, m_i, and v_i, i = 1,...,k, then:
1. the total size is N = sum n_i;
2. the total mean is M = sum n_i*m_i / N;
3. the total variance is V = sum n_i*v_i / N + sum n_i*(m_i-M)^2 / N.
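The three steps above can be sketched in Python as follows. This is only a minimal illustration: the function name combine_groups and its argument names are made up for this example, and the group variances are assumed to have been computed with n (not n-1) in the denominator. The sample call reuses the men/women numbers from the Wikipedia passage, with a hypothetical 50 people in each group.

```python
def combine_groups(sizes, means, variances):
    """Return (N, M, V) for the pooled data of k groups.

    sizes, means, variances hold n_i, m_i and v_i for each group.
    Assumption: each v_i uses n_i (not n_i - 1) as its denominator.
    """
    if not (len(sizes) == len(means) == len(variances)):
        raise ValueError("sizes, means and variances must have equal length")
    N = sum(sizes)
    if N <= 0:
        raise ValueError("total size must be positive")
    # Step 2: size-weighted mean of the group means.
    M = sum(n * m for n, m in zip(sizes, means)) / N
    # Step 3: within part (weighted mean of the variances) ...
    within = sum(n * v for n, v in zip(sizes, variances)) / N
    # ... plus between part (weighted variance of the group means).
    between = sum(n * (m - M) ** 2 for n, m in zip(sizes, means)) / N
    return N, M, within + between


# Two equally large groups: means 180 and 160, variances 100 and 50.
N, M, V = combine_groups([50, 50], [180.0, 160.0], [100.0, 50.0])
print(N, M, V)  # 100 170.0 175.0
```

With unequal sizes the same call applies unchanged; only the weights n_i / N differ.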
Thanks for the fast reply!
A follow-up question:
As I understand it, you calculate the total variance by adding the
Within variance, a weighted average of the group variances (sum
n_i*v_i / N), to the Between variance, a weighted average of the
squared deviations of the group means from the total mean
(sum n_i*(m_i-M)^2 / N). The latter is weighted by the group sizes.
Without the weighting, I assume this would look like:
sum (m_i-M)^2 / k
Where does the k go in your formula above? Should it perhaps be
(sum n_i*(m_i-M)^2 / N)/k = sum n_i*(m_i-M)^2 / (k*N) ?
k is hidden in N. If all the n_i were equal then we could write simply
n, and N = k*n; the n's in the formula would cancel, leaving only k.
But k does not play an explicit role in the general unequal-n case.
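The cancellation in the equal-n case can be written out explicitly. This is just a worked version of the remark above, using the same symbols; setting n_i = n for every group is the only assumption.

```latex
% Equal group sizes: n_i = n for all i, so N = k n.
\[
  \frac{\sum_{i=1}^{k} n_i (m_i - M)^2}{N}
  = \frac{n \sum_{i=1}^{k} (m_i - M)^2}{k\,n}
  = \frac{1}{k} \sum_{i=1}^{k} (m_i - M)^2 ,
  \qquad
  M = \frac{\sum_{i=1}^{k} n\, m_i}{k\,n} = \frac{1}{k} \sum_{i=1}^{k} m_i .
\]
```

So dividing by k*N would divide by k twice; with unequal sizes the weights n_i / N already sum to 1.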
For the more general case, intuitively, I (think I) get the logic:
Total variance is within variance plus between variance. But what is
the background for reasoning like this? If we acknowledge that the
Total Sum of Squares = Between Sum of Squares + Within Sum of Squares,
how do we get from there to the adding of the variances? It doesn't
seem trivial.
The Within sum of squares is just the weighted sum of the variances,
sum n_i*v_i. (Note that all the formulas assume that variances are
computed using n, not n-1, in the denominator.)
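The step from sums of squares to variances can be spelled out as follows. The notation x_ij for the j-th observation in group i is introduced here only for the derivation; everything else uses the symbols already defined in the thread.

```latex
% x_{ij}: j-th observation in group i (notation assumed for this sketch).
% Split each deviation from the total mean M at the group mean m_i:
\[
  \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - M)^2
  = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - m_i)^2
  + \sum_{i=1}^{k} n_i (m_i - M)^2
  + 2 \sum_{i=1}^{k} (m_i - M) \sum_{j=1}^{n_i} (x_{ij} - m_i) .
\]
% The last term is zero because deviations from a group's own mean sum to zero.
% With v_i = \frac{1}{n_i} \sum_j (x_{ij} - m_i)^2 the within sum of squares is
% \sum_i n_i v_i, so dividing the identity by N gives
\[
  V = \frac{1}{N} \sum_{i=1}^{k} n_i v_i
    + \frac{1}{N} \sum_{i=1}^{k} n_i (m_i - M)^2 .
\]
% If a group variance s_i^2 was computed with n_i - 1 in the denominator,
% convert it first: v_i = \frac{n_i - 1}{n_i}\, s_i^2 .
```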
Looking forward to hearing opinions on this.
Ole