## Models of genetic drift

• An introduction to the binomial distribution and Markov chains.

• The diffusion approximation of genetic drift.

The last section demonstrated the phenomenon of genetic drift caused by sampling error and drew some general conclusions based on the results of computer simulations. Building on this foundation, this section will introduce three probability models that can be used to confirm these and other general properties of the process of genetic drift. The first model, the binomial distribution, will be used to show that the magnitude of genetic drift from one generation to the next depends on allele frequencies in the population. The second model, the Markov chain, will be used to show the rate of change of allele frequencies under genetic drift. The third model, a continuous time approximation to the Markov chain, will be introduced to show how genetic drift can be modeled as the diffusion of particles.

### The binomial probability distribution

To develop the first model, let's return to the microcentrifuge tube populations from the last section. When sampling a tube from the beaker there are only two outcomes, a blue tube or a clear tube, which are used to represent the two alleles at one locus. The tubes are a specific case of a Bernoulli random variable (sometimes called a binomial random variable), a variable representing a trial or sample that can have only two outcomes. Coin flips with either heads or tails outcomes are another example of a Bernoulli random variable. What we often want to know is, what are the chances of obtaining a given set of Bernoulli outcomes? For example, what are the chances of obtaining four heads when flipping a coin four times? In our microcentrifuge tube samples, what are the chances of each of the possible outcomes (20 blue, 19 blue, 18 blue, ..., 0 blue) when sampling 20 tubes from the beaker? Answers to these types of questions require a means to estimate a probability distribution.
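One way to get a feel for such a distribution is to repeat the sampling many times and tally the outcomes. The following is a minimal sketch, not a prescribed method: the function name `simulate_blue_counts`, the blue-tube frequency of 0.5, the number of trials, and the random seed are all illustrative assumptions, and each tube is assumed to be drawn independently (as if the beaker were effectively infinite).

```python
import random
from collections import Counter


def simulate_blue_counts(sample_size=20, p_blue=0.5, trials=10_000, seed=1):
    """Tabulate how often each number of blue tubes appears across repeated samples.

    p_blue, trials, and seed are assumed values chosen only for illustration.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Each draw is one Bernoulli trial: blue with probability p_blue.
        blue = sum(rng.random() < p_blue for _ in range(sample_size))
        counts[blue] += 1
    # Convert tallies to relative frequencies for every outcome from 0 to sample_size.
    return {k: counts[k] / trials for k in range(sample_size + 1)}


freqs = simulate_blue_counts()
for blue, freq in freqs.items():
    print(f"{blue:2d} blue: {freq:.4f}")
```

With these assumed settings, outcomes near 10 blue tubes should be the most common and the extremes (0 or 20 blue) should be very rare. The tallies are only estimates that vary from run to run when the seed is changed.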

The binomial (literally, "two names") formula defines the probability distribution for the sum of N independent samples of a Bernoulli variable:

$$P(i \text{ A alleles}) = \binom{2N}{i} p^{i} q^{2N-i}$$

where

$$\binom{2N}{i} = \frac{(2N)!}{i!\,(2N-i)!}$$
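As a sketch of how this probability can be evaluated directly, the code below multiplies the binomial coefficient (the number of ways to choose which i of the 2N draws are A) by p raised to the power i and q raised to the power 2N - i. The function name `binomial_probability` and the parameter name `n_alleles` (standing in for 2N) are hypothetical, and the allele frequency of 0.5 is an assumed value used only for illustration.

```python
from math import comb


def binomial_probability(i, n_alleles, p):
    """Probability of drawing exactly i copies of an allele with frequency p
    in n_alleles independent draws (n_alleles plays the role of 2N)."""
    q = 1.0 - p
    # comb(n_alleles, i) counts the distinct orderings that give i successes.
    return comb(n_alleles, i) * p**i * q ** (n_alleles - i)


# Four heads in four flips of a fair coin.
print(binomial_probability(4, 4, 0.5))  # 0.0625

# Full distribution for a sample of 20 tubes, assuming a blue frequency of 0.5.
distribution = [binomial_probability(i, 20, 0.5) for i in range(21)]
for i, prob in enumerate(distribution):
    print(f"{i:2d} blue: {prob:.6f}")
```

Because the 21 outcomes are mutually exclusive and exhaustive, the values in `distribution` sum to one, which is a quick check on any implementation of the formula.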

The binomial formula gives the probability of sampling $i$ A alleles in a sample of $2N$ from a population where the A allele has a frequency of $p$ and the alternate a allele has a frequency of $q$. The $p^{i}$ and $q^{2N-i}$ terms give the probability of observing $i$ and $2N - i$ independent events, each with probability $p$ and $q$, respectively. The term

$\binom{2N}{i}$