## Interact box: Genetic drift simulated with a Markov chain model

PopGene.S2 can be used to simulate genetic drift with a Markov chain. Launch PopGene.S2 and click on the Drift menu and then select Markov Process. This simulation module requires that you enter parameter values one at a time, since values of some parameters affect the values that other parameters can take.

Step 1: Start by entering 2 under Population size in the upper left corner of the simulation window. This means that there are two diploid individuals or four alleles in each population. Then click OK. The fields for the Transition matrix will now contain probabilities, with the current state given in the left column and potential future states given in the row across the top of the matrix. The Probabilities vector will also have a column of spaces appropriate for a diploid population of the size specified. What does each of the numbers in the Transition matrix mean in biological terms?

Step 2: Next, select the initial allele frequency in individual populations using the pop-out menu above the Transition matrix. Each of the allele frequency values in the menu corresponds to an integer number of copies of one allele in a population. For example, a frequency of 0.2500 means one of the four alleles in the population is A. From the pop-out menu select 0.5000 for the initial allele frequency and then click the OK button just below the menu.

Step 3: The histogram in the bottom right of the simulation window will now show the frequency of the A allele in many replicate populations. Using the Generations to run field you can set the number of generations that elapse between each view of the histogram as well as the Transition matrix and Probabilities vector. Enter 1 in the Generations to run field to be able to track changes each generation. Click the Start button once to simulate genetic drift in many populations over a single generation. Why did the transition matrix remain constant? The histogram and the Probabilities vector both changed. They give different views of exactly the same information: the proportion of populations out of many replicate populations with a given state (the number of A alleles).

Step 4: Now press the Start button repeatedly, taking time at each generation of the simulation to view the histogram and the Probabilities vector. What is the ultimate fate of allele frequencies in all of the replicate populations? How many generations elapse to reach this equilibrium? (See the generation counter under the Start button.)
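The bookkeeping behind Steps 1 through 4 can be sketched in a few lines of Python. This is a minimal sketch, not PopGene.S2's own code: it assumes the transition probabilities come from binomial sampling of 2N alleles (the Wright-Fisher model), and the names `transition_matrix` and `step` are hypothetical helpers introduced here for illustration.

```python
from math import comb

def transition_matrix(n_diploid):
    """Build the matrix whose entry [i][j] is the probability of moving
    from i copies of allele A to j copies in one generation.
    Assumption: binomial sampling of 2N alleles (Wright-Fisher model)."""
    two_n = 2 * n_diploid
    matrix = []
    for i in range(two_n + 1):
        p = i / two_n  # current frequency of A
        row = [comb(two_n, j) * p**j * (1 - p)**(two_n - j)
               for j in range(two_n + 1)]
        matrix.append(row)
    return matrix

def step(probs, matrix):
    """Advance the probabilities vector by one generation."""
    size = len(probs)
    return [sum(probs[i] * matrix[i][j] for i in range(size))
            for j in range(size)]

# Population size 2 (four alleles), every population starting at frequency 0.5.
T = transition_matrix(2)
probs = [0.0, 0.0, 1.0, 0.0, 0.0]
for generation in range(1, 4):
    probs = step(probs, T)
    print(generation, [round(x, 4) for x in probs])
```

Note that `T` is built once and never changes inside the loop; only `probs` is updated. Under the binomial assumption, the row of `T` for two copies of A is 0.0625, 0.25, 0.375, 0.25, 0.0625.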

Reset the simulation using the Cancel/Restart button under Generations to run. Try the Markov model with population sizes of 4, 20, 50, and 100 with initial allele frequencies of 0.5 and compare the times to fixation and loss with those obtained in Interact box 3.1. Increase the value in the Generations to run field to 10 or 20 so that the simulations will progress more rapidly to equilibrium. Also try population sizes of 4, 20, 50, and 100 with initial allele frequencies of 0.1 and 0.9.
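For readers without the software at hand, the comparison across population sizes can be approximated with a short script. The sketch below assumes binomial transition probabilities and, because the exact chain only approaches equilibrium asymptotically, counts generations until 99% of populations are fixed or lost. Both the 99% threshold and the function name `generations_to_threshold` are choices made here for illustration, not features of PopGene.S2.

```python
from math import comb

def generations_to_threshold(n_diploid, start_copies, threshold=0.99):
    """Count generations until at least `threshold` of populations
    have either lost allele A or fixed it."""
    two_n = 2 * n_diploid
    # Assumed binomial transition probabilities:
    # row i = current copies of A, column j = copies in the next generation.
    matrix = [
        [comb(two_n, j) * (i / two_n) ** j * (1 - i / two_n) ** (two_n - j)
         for j in range(two_n + 1)]
        for i in range(two_n + 1)
    ]
    probs = [0.0] * (two_n + 1)
    probs[start_copies] = 1.0
    generations = 0
    while probs[0] + probs[two_n] < threshold:
        probs = [sum(probs[i] * matrix[i][j] for i in range(two_n + 1))
                 for j in range(two_n + 1)]
        generations += 1
    return generations

# start_copies = n gives an initial allele frequency of n / (2n) = 0.5.
for n in (4, 20):
    print(n, generations_to_threshold(n, start_copies=n))
```

Larger sizes such as 50 and 100 work the same way but run noticeably slower in pure Python, since the matrix has (2N + 1) rows and columns.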

When you run the Markov model more than once for a given set of conditions, what happens? Why? In contrast, what happens when the simulations in Interact box 3.1 are rerun with the same initial conditions? Why?
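One way to explore these questions outside the software is to place the two approaches side by side. The sketch below assumes that the Markov model propagates exact binomial probabilities while the simulations of Interact box 3.1 draw random samples of alleles; that reading of box 3.1 is an assumption here, and `exact_distribution` and `sampled_distribution` are hypothetical names.

```python
import random
from math import comb

def exact_distribution(two_n, start_copies, generations):
    """Propagate exact state probabilities; no random numbers are used."""
    probs = [0.0] * (two_n + 1)
    probs[start_copies] = 1.0
    for _generation in range(generations):
        new = [0.0] * (two_n + 1)
        for i, weight in enumerate(probs):
            p = i / two_n
            for j in range(two_n + 1):
                new[j] += weight * comb(two_n, j) * p**j * (1 - p)**(two_n - j)
        probs = new
    return probs

def sampled_distribution(two_n, start_copies, generations, replicates, rng):
    """Estimate the same distribution from a finite set of randomly
    drifting replicate populations."""
    counts = [0] * (two_n + 1)
    for _replicate in range(replicates):
        copies = start_copies
        for _generation in range(generations):
            p = copies / two_n
            # Draw 2N alleles, each being A with probability p.
            copies = sum(1 for _allele in range(two_n) if rng.random() < p)
        counts[copies] += 1
    return [c / replicates for c in counts]

print(exact_distribution(4, 2, 3))
print(sampled_distribution(4, 2, 3, 100, random.Random(1)))
print(sampled_distribution(4, 2, 3, 100, random.Random(2)))
```

Calling `exact_distribution` repeatedly with the same arguments always returns the same vector, whereas `sampled_distribution` depends on the random number stream it is given.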

Markov chains are convenient for modeling genetic drift because the frequency of populations in a given allelic state depends only on the frequencies in the previous generation (a quality called the Markov property). Table 3.2 can be used as a matrix of transition probabilities for any one generation of genetic drift, giving the frequency of populations in each allelic state based on the transition probabilities for the number of alleles sampled and the frequencies of populations in each allelic state in the previous generation. Although a population of one diploid individual is not very interesting in biological terms, it is a convenient case to study mathematically. Using techniques of matrix algebra to determine eigenvalues for the matrix represented by Table 3.2 (see Roughgarden 1996 for a fuller explanation), it is possible to show that the rate at which genetic variation is lost from the collection of many populations is
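The loss of variation across replicate populations can also be checked numerically without any eigenvalue machinery. The sketch below assumes that Table 3.2 holds binomial transition probabilities for 2N = 2 alleles and tracks expected heterozygosity, 2p(1 - p) averaged over populations, as the measure of genetic variation. A standard result for this binomial-sampling model is that expected heterozygosity shrinks by a factor of 1 - 1/(2N) each generation, which is also the largest eigenvalue of the transition matrix below one; whether that is the exact form the text goes on to give is an assumption. The function names are hypothetical.

```python
from math import comb

def expected_heterozygosity(probs):
    """Average 2p(1 - p) over populations, weighted by state probability."""
    two_n = len(probs) - 1
    return sum(w * 2 * (i / two_n) * (1 - i / two_n)
               for i, w in enumerate(probs))

def next_generation(probs):
    """One generation of drift under assumed binomial sampling of 2N alleles."""
    two_n = len(probs) - 1
    new = [0.0] * (two_n + 1)
    for i, w in enumerate(probs):
        p = i / two_n
        for j in range(two_n + 1):
            new[j] += w * comb(two_n, j) * p**j * (1 - p)**(two_n - j)
    return new

# One diploid individual (2N = 2), every population starting heterozygous.
probs = [0.0, 1.0, 0.0]
for generation in range(3):
    updated = next_generation(probs)
    ratio = expected_heterozygosity(updated) / expected_heterozygosity(probs)
    print(generation + 1, ratio)
    probs = updated
```

For 2N = 2 the ratio is 0.5 in every generation, matching 1 - 1/(2N). Starting vectors for larger populations can be passed to the same functions to see the ratio move closer to one.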
