## Problem box: Constructing a transition probability matrix

Understanding Markov chains is easier with some practice constructing them. Try constructing the transition matrix for a diploid population of size two (identical to the micro-centrifuge tube sampling experiment with a sample size of four tubes). As in Table 3.2, set up a matrix in which each column represents an initial allelic state and the terms in each row add together to give the proportion of populations with a given state one generation later. Then use the binomial formula to calculate the probability that a single population makes each of the allelic state transitions. Denote the frequency of populations with a given allelic state in the initial generation ($t = 0$) by $P_{t=0}(x)$, where $x$ is the number of copies of one allele (0 to 4).

Think about the problem before carrying out any calculations. It is less work than it may appear at first. Two columns have probabilities of either zero or one. Two other columns have the same probabilities but in reversed order.
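One way to check a hand calculation is to let a short script fill in the matrix. The sketch below assumes Wright-Fisher binomial sampling of $2N = 4$ gene copies and follows the layout described above (columns are initial states, rows are states one generation later). The function names `transition_matrix` and `next_generation` are illustrative and do not come from the text.

```python
from math import comb

def transition_matrix(n_copies):
    """Return T where T[i][j] = P(i copies next generation | j copies now).

    Assumes binomial sampling of n_copies gene copies (n_copies = 2N).
    """
    T = []
    for i in range(n_copies + 1):        # row: state one generation later
        row = []
        for j in range(n_copies + 1):    # column: initial state
            p = j / n_copies             # allele frequency in the initial state
            row.append(comb(n_copies, i) * p**i * (1 - p)**(n_copies - i))
        T.append(row)
    return T

def next_generation(T, P0):
    """Add the terms in each row, weighted by the initial frequencies P0[j]."""
    return [sum(T[i][j] * P0[j] for j in range(len(P0))) for i in range(len(T))]

T = transition_matrix(4)                 # diploid population of size two
for row in T:
    print("  ".join(f"{v:.4f}" for v in row))

# Example: every population starts with two copies of the allele.
P1 = next_generation(T, [0, 0, 1, 0, 0])
print(P1)
```

The columns for 0 and 4 copies contain only zeros and ones because those states are absorbing, and the column for 3 copies is the column for 1 copy in reversed order. This matches the hint in the problem box.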

Biological populations that closely mimic the ensemble population of Markov chain models are relatively easy to construct and maintain given the right choice of organism and some persistent effort. In fact, the first studies of allele frequencies in many identical replicate biological populations were carried out in the 1950s (for example, Kerr & Wright 1954; Wright & Kerr 1954). The organisms of choice were fruit flies (Drosophila species) since many individuals can be raised in a small space, generation times are short, and a population can be unambiguously defined as one bottle (containing food) of flies. To rear flies, males and females are put together in a bottle and allowed to mate. The adults are removed from the bottle after the females have had time to lay eggs on the food. The larvae that emerge from these eggs and become mature flies can then be sampled to found a new generation in a fresh bottle.

Figure 3.11a shows the results of one such classic experiment that followed allele frequencies in 107 replicate populations for 19 generations (Buri 1956). All of the populations were constructed to have initial allele frequencies of p = q = 0.5 at a diallelic locus (alleles were wild type and $bw^{75}$). The distribution of allele frequencies in the 107 populations quickly spread out from the initial frequency. By about the fifth generation, a few populations had reached either fixation or loss of the $bw^{75}$ allele. As more generations elapsed, the distribution became flatter, with more and more populations reaching fixation or loss.
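A rough computational analogue of such a replicate-population experiment can be sketched as a simulation. The code below is an idealization, not a reconstruction of the fly data: it assumes pure binomial (Wright-Fisher) sampling, 16 diploid individuals per population (32 gene copies, the size used for the Markov chain model in this section), and an arbitrary random seed. The function name `simulate_drift` and its parameters are hypothetical.

```python
import random

def simulate_drift(n_pops=107, n_copies=32, start=16, generations=19, seed=1):
    """Simulate allele counts in replicate populations under binomial sampling.

    Returns a list with one entry per generation (including generation 0);
    each entry is a list of allele counts, one per population.
    """
    rng = random.Random(seed)            # assumed seed, only for repeatability
    counts = [start] * n_pops            # every population starts at p = 0.5
    history = [counts[:]]
    for _ in range(generations):
        new_counts = []
        for x in counts:
            p = x / n_copies
            # Draw n_copies gene copies, each carrying the allele with probability p.
            new_counts.append(sum(rng.random() < p for _ in range(n_copies)))
        counts = new_counts
        history.append(counts[:])
    return history

history = simulate_drift()
final = history[-1]
n_lost = sum(x == 0 for x in final)
n_fixed = sum(x == 32 for x in final)
print(f"after 19 generations: {n_lost} lost, {n_fixed} fixed, "
      f"{len(final) - n_lost - n_fixed} still segregating")
```

Tabulating `history[t]` as a histogram for each generation gives a distribution that can be compared, qualitatively, with the spreading and flattening described for the fly populations.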

The overall shape of the distribution of population allele frequencies for the fly populations closely matches the expected population frequencies according to a Markov chain model of genetic drift for a population of 16 individuals shown in Fig. 3.11b. In particular, the fly populations and the Markov chain model both show a rapid spread from the initial frequency and an equal number of populations that reach fixation or loss. However, notice that the fly populations have a less even distribution of allele frequencies due to the relatively small number of populations compared to the smoothly continuous
