## Why does Hardy-Weinberg work?

• A simple proof of Hardy-Weinberg.

• Hardy-Weinberg with more than two alleles.

The Hardy-Weinberg equation is one of the most basic expectations we have in population genetics. It is very likely that you were already familiar with the Hardy-Weinberg equation before you picked up this book. But where does Hardy-Weinberg actually come from? What is the logic behind it? Let's develop a simple proof that Hardy-Weinberg is actually true. This will also be our first real foray into the type of algebraic argument that much of population genetics is built on. Given that you start out knowing the conclusion of the Hardy-Weinberg tale, this gives you the opportunity to focus on the style in which it is told. Algebraic or quantitative arguments are a central part of the language and vocabulary of population genetics, so part of the task of learning population genetics is becoming accustomed to this mode of discourse.

We would like to prove that $p^2 + 2pq + q^2 = 1$ accurately predicts genotype frequencies given the values of allele frequencies. Let's start off by making some explicit assumptions to bound the problem. The assumptions, in no particular order, are:

1. mating is random (parents meet and mate according to their frequencies);

2. all parents have the same number of offspring (equivalent to no natural selection on fecundity);

3. all progeny are equally fit (equivalent to no natural selection on viability);

4. there is no mutation that could act to change an A to a or an a to A;

5. it is a single population that is very large;

6. there are two and only two mating types.

Now let's define the variables we will need for a case with one locus that has two alleles (A and a).

N = Population size of individuals (N diploid individuals have 2N alleles)

Allele frequencies:

p = frequency(A allele) = (total number of A alleles)/2N

q = frequency(a allele) = (total number of a alleles)/2N

Genotype frequencies:

X = frequency(AA genotype) = (total number of AA genotypes)/N

Y = frequency(Aa genotype) = (total number of Aa genotypes)/N

Z = frequency(aa genotype) = (total number of aa genotypes)/N

We do not distinguish between the heterozygotes Aa and aA and treat them as being equivalent genotypes. Therefore, we can express allele frequencies in terms of genotype frequencies by adding together the frequencies of A-containing and a-containing genotypes:

$$p = X + \tfrac{1}{2}Y$$

$$q = Z + \tfrac{1}{2}Y$$

Each homozygote contains two alleles of the same type, while each heterozygote contains one allele of each type, so the heterozygote frequency is weighted by one half.
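As a minimal sketch of this bookkeeping, the Python snippet below computes p and q from genotype counts by weighting the heterozygote frequency by one half. The function name `allele_frequencies` and the example counts (50 AA, 30 Aa, 20 aa) are illustrative assumptions, not data from the text. Exact fractions are used so that the result can be compared directly with a count of alleles out of 2N.

```python
from fractions import Fraction


def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) from genotype counts at one locus with alleles A and a.

    Hypothetical helper for illustration only.
    """
    N = n_AA + n_Aa + n_aa  # number of diploid individuals
    if N == 0:
        raise ValueError("population must contain at least one individual")

    # Genotype frequencies X, Y, Z as defined in the text.
    X = Fraction(n_AA, N)
    Y = Fraction(n_Aa, N)
    Z = Fraction(n_aa, N)

    # Heterozygotes carry one allele of each type, so Y is weighted by 1/2.
    p = X + Fraction(1, 2) * Y
    q = Z + Fraction(1, 2) * Y
    return p, q


# Made-up example counts: 50 AA, 30 Aa, 20 aa (N = 100, so 2N = 200 alleles).
p, q = allele_frequencies(50, 30, 20)

# Counting A alleles directly out of 2N gives the same p.
p_by_counting = Fraction(2 * 50 + 30, 2 * 100)
```

With these counts, p comes out to 13/20 and q to 7/20, and the two frequencies sum to one, as they must when there are only two alleles.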

With the variables defined, we can then follow allele frequencies across one generation of reproduction. The first step is to calculate the probability that parents of any two particular genotypes will mate. Since mating is assumed to be random, the chance that two genotypes will mate is just the product of their individual frequencies. As shown in Fig. 2.7, random mating can be thought of as being like gas atoms in a balloon. As with gas atoms, each genotype or gamete bumps into others at random, with the probability of a collision (or mating or union) being the product of the frequencies of the two objects.
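The product rule for random mating can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than part of the proof: the helper name `mating_pair_frequencies` and the genotype frequencies X = 1/2, Y = 3/10, Z = 1/5 are made up for the example. Each ordered pair of parental genotypes gets a frequency equal to the product of the two genotype frequencies.

```python
from fractions import Fraction
from itertools import product


def mating_pair_frequencies(genotype_freqs):
    """Return the frequency of every ordered pair of parental genotypes.

    Under random mating, the chance that two genotypes mate is the product
    of their individual frequencies. Hypothetical helper for illustration.
    """
    return {
        (first, second): genotype_freqs[first] * genotype_freqs[second]
        for first, second in product(genotype_freqs, repeat=2)
    }


# Made-up genotype frequencies X, Y, Z; they must sum to one.
freqs = {
    "AA": Fraction(1, 2),
    "Aa": Fraction(3, 10),
    "aa": Fraction(1, 5),
}

pairs = mating_pair_frequencies(freqs)

# The nine ordered pairings cover every possible mating, so they sum to one.
total = sum(pairs.values())
```

Here an AA by Aa mating has frequency 1/2 × 3/10 = 3/20, and the frequencies of all nine ordered pairings sum to one.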