Re: Hardy-Weinberg law

From: Guy Hoelzer (hoelzer_at_unr.edu)
Date: 07/01/04


Date: Thu, 1 Jul 2004 15:46:22 +0000 (UTC)

Bob (and Bill),

in article cbumkp$1r7u$1@darwin.ediacara.org, Anon. at
bob.ohara@SOD.OFF.Spammers.helsinki.fi wrote on 6/30/04 8:35 AM:

>>>> :-WLH
>>>> The discussion was of a Hardy-Weinberg equilibrium and your original
>>>> statement "in finite populations, you get an excess of homozygotes", I
>>>> took to refer to an excess over that predicted by HW. Apparently you
>>>> were referring to something else that has nothing to do with
>>>> Hardy-Weinberg equilibrium. This threw me off and I suspect it may
>>>> have for some others also. Guy Hoelzer also responds to this in an
>>>> above thread.
>>>> Yes, if the population is small, many more loci will reach fixation
>>>> (homozygosity) for one allele or the other and there is less genetic
>>>> diversity. This has nothing to do with Hardy-Weinberg. Hardy-Weinberg
>>>> only speaks to loci where there are still two alleles present in the
>>>> population at some frequency and it predicts what the distribution
>>>> will be. It originally sounded like you were saying that if there
>>>> is a frequency of p=.5 for allele 'A' and q=.5 for allele 'a', that in
>>>> a large population you would expect the distribution to be a
>>>> Hardy-Weinberg AA=.25, Aa = .50, aa=.25 but for some small population
>>>> size, "you" might expect it to show an "excess of homozygotes", such as
>>>> AA=.30, Aa=.40, aa=.30. I am now not sure what you meant.
>>>
>>> :-BOH
>>> This is what I meant. In a finite population, the expected number of
>>> heterozygotes is less than predicted from HWE. Gale goes through the
>>> calculations in his textbook (my edition is from the early 80s). Most
>>> textbooks use a deterministic calculation, but get the same result.
>>> Either way, it all goes back to Wright.
>>>
>>> Bob
>>>
>>
>> Can you give me a formula that, for a particular locus, given N, q, and
>> p for a randomly mating population, gives the expected frequency of
>> AA, Aa, and aa similar to HWE above but allows one to see what
>> distribution is expected for a particular finite population size (N)? I
>> don't recall seeing such a formula.

I was looking forward to Bob's answer, because his claim seemed so clearly
false and that would be out of character.
 
> I suspect you have seen it, but not expressed in this way!
>
> The expected frequency of heterozygotes is 2pq(1-1/2N), the expected
> frequency of the AA homozygote is p^2 + p(1-p)/2N. It turns out that
> 1/2N is just the inbreeding coefficient, of course.

Indeed it is; and now I see the confusion. This set of equations is used to
predict genotype frequencies in the next generation after a round of random
mating. Writing the equations this way is a little misleading, because the
role of the factors including the "N" term is to account for changes in "p"
and "q" resulting from drift in a finite population. So, in effect, "p" and
"q" drift, eventually leading to fixation of one or the other allele. Under
the neutral model, we expect a particular time to fixation (contingent upon
p, q, and N; and with a huge variance), and the equations above predict the
proportional change in genotype frequencies as if inbreeding in the randomly
mating finite population pushed the population deterministically and without
variance to fixation in exactly the expected number of generations. Of
course, drift does not work that way. Instead, it causes the gene pool to
take a random walk to fixation, and the pure combinatorial predictions of
the infinite population size version of the HW model will be unbiased at
every step along the way.
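
A minimal sketch of the distinction drawn above, assuming a Wright-Fisher
model in which the 2N gene copies of the next generation are drawn binomially
from the current allele frequency p. The function names are illustrative and
not from the thread. Within each replicate the heterozygote frequency is
taken to be the plain HW value 2p'(1-p') at the realized frequency p'; only
the average over many replicates is compared with 2pq(1-1/2N).

```python
import random


def drift_one_generation(p, n, rng):
    """Draw 2N gene copies at frequency p; return the new frequency of A."""
    copies = 2 * n
    count_a = sum(1 for _ in range(copies) if rng.random() < p)
    return count_a / copies


def mean_heterozygosity(p, n, replicates, seed=1):
    """Average the HW heterozygote frequency 2p'(1-p') over replicates.

    Each replicate is one independent population after one generation of
    drift; p' is that population's realized allele frequency.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        p_next = drift_one_generation(p, n, rng)
        total += 2 * p_next * (1 - p_next)
    return total / replicates


def mean_field_heterozygosity(p, n):
    """The quoted expectation 2pq(1 - 1/2N), with q = 1 - p."""
    return 2 * p * (1 - p) * (1 - 1 / (2 * n))


if __name__ == "__main__":
    p, n = 0.5, 10
    print("simulated mean :", mean_heterozygosity(p, n, 20000))
    print("2pq(1 - 1/2N)  :", mean_field_heterozygosity(p, n))
```

Under these assumptions the two numbers should agree closely, even though
every individual replicate was scored with the unmodified HW proportions at
its own p'. The shortfall relative to 2pq arises from averaging over the
spread of p' among replicates.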
 
> My reference is J. S. Gale (1980) Population Genetics (Tertiary Level
> Biology), and I forgot to bring it in to work today. But the proof of
> the heterozygote deficit follows from calculating E[2p(1-p)] = 2E[p] -
> 2E[p^2].
>
> E[p]=p and E[p^2] is calculated from the definition of a variance:
>
> Var[p] = E[p^2] - E^2[p]
> and
> Var[p] = p(1-p)/2N, so E[p^2] = p(1-p)/2N + p^2.
>
> Plugging this into E[2p(1-p)] we get
>
> 2E[p] - 2E[p^2] = 2p - 2( p(1-p)/2N + p^2 )
> = 2(p-p^2) - 2p(1-p)/2N
> = 2p(1-p)(1-1/2N)
>
> which is the result we want.

It is indeed a valid equation, though prone to a great deal of stochastic
error, for the mean field expectation of genotype frequency changes across
generations in a finite population.
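
For anyone who wants to check the algebra numerically, here is a small
sketch that computes the expectation exactly rather than by simulation. It
assumes the allele count in the next generation is Binomial(2N, p), which is
the assumption behind Var[p] = p(1-p)/2N; the function names are my own.

```python
from math import comb


def binomial_pmf(k, copies, p):
    """Probability of drawing k copies of allele A out of `copies` draws."""
    return comb(copies, k) * p**k * (1 - p) ** (copies - k)


def exact_expected_p_squared(p, n):
    """E[p'^2], where p' = k/2N and k ~ Binomial(2N, p)."""
    copies = 2 * n
    return sum(
        binomial_pmf(k, copies, p) * (k / copies) ** 2
        for k in range(copies + 1)
    )


def exact_expected_heterozygosity(p, n):
    """E[2p'(1-p')] summed over every possible allele count k."""
    copies = 2 * n
    return sum(
        binomial_pmf(k, copies, p) * 2 * (k / copies) * (1 - k / copies)
        for k in range(copies + 1)
    )


if __name__ == "__main__":
    p, n = 0.3, 25
    # Var[p'] + E[p']^2, i.e. p(1-p)/2N + p^2
    print(exact_expected_p_squared(p, n), p * (1 - p) / (2 * n) + p**2)
    # 2p(1-p)(1 - 1/2N)
    print(exact_expected_heterozygosity(p, n), 2 * p * (1 - p) * (1 - 1 / (2 * n)))
```

Each printed pair should match to floating-point precision, which confirms
the mean but says nothing about the variance around it.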

Cheers,

Guy