## Growing or shrinking populations

TreeToy is a Java applet that simulates genealogies for neutral alleles in populations that are growing in size through time. The lineages experience mutations with a rate that depends on θ = 4Neμ (called Theta0 in the simulation, for θ at time zero, since Ne changes over time). The simulation displays the genealogy along with the mismatch distribution and the frequency histogram of each of the haplotypes in the population. The simulation requires values for the number of lineages in the genealogy (Sample Size), the growth rate of the population (Growth Factor), and a scaling factor for the time to coalescence of two genes in units of 1/(2μ) generations (Tau).
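As a rough illustration of how these scaled parameters relate to the underlying quantities, the sketch below computes θ = 4Neμ and converts a time in generations into units of 1/(2μ) generations. The function names and the numerical values of Ne, μ, and the number of generations are hypothetical, chosen only so that the results land near the Theta0 and Tau settings used in the exercise; this is not TreeToy's code.

```python
def theta(ne, mu):
    """Scaled mutation rate: theta = 4 * Ne * mu."""
    return 4 * ne * mu


def tau(generations, mu):
    """Time expressed in units of 1/(2 * mu) generations: tau = 2 * mu * t."""
    return 2 * mu * generations


# Hypothetical values (assumptions, not taken from the text):
ne = 10_000   # effective population size
mu = 5e-4     # mutation rate per sequence per generation

print("theta:", theta(ne, mu))    # close to 20
print("tau:", tau(8_000, mu))     # close to 8
```

Because only the products appear, very different combinations of Ne and μ give the same θ, which is why the simulation asks for the scaled values directly.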

Set the Sample Size to 30, Theta0 to 20, Growth Factor to 1, leave Tau at the default value of 8, and then click on Draw Tree. The resulting genealogy has waiting times similar to those expected under the standard coalescent model with constant population size through time. It should have a somewhat bimodal mismatch distribution (the graph at bottom left) and a wide range of haplotype frequencies (graph at bottom right labeled Freq. Spectrum). Press the Draw Tree button several times to see that there is substantial variation in these distributions even for the same model parameters. Note the general shape of the genealogies under the constant Ne model.
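The waiting times behind a constant-size genealogy can be sketched in a few lines. This is a minimal illustration and not TreeToy's own code: it assumes the standard coalescent result that, while k lineages remain, the time to the next coalescence is exponentially distributed with rate k(k-1)/2 when time is measured in units of 2Ne generations. The function names are hypothetical.

```python
import random


def expected_waiting_time(k):
    """Expected time (in units of 2Ne generations) while k lineages remain."""
    return 2.0 / (k * (k - 1))


def expected_tree_height(sample_size):
    """Expected time to the MRCA: the sum of expected waiting times for k = n..2."""
    return sum(expected_waiting_time(k) for k in range(2, sample_size + 1))


def coalescent_waiting_times(sample_size, rng):
    """Draw one random waiting time for each k from sample_size down to 2."""
    times = {}
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2          # number of lineage pairs that can coalesce
        times[k] = rng.expovariate(rate)
    return times


rng = random.Random(1)                   # seeded so a run can be repeated
draw = coalescent_waiting_times(30, rng)
print("simulated height:", sum(draw.values()))
print("expected height: ", expected_tree_height(30))
print("expected share of height spent with 2 lineages:",
      expected_waiting_time(2) / expected_tree_height(30))
```

Running the draw with different seeds mirrors pressing Draw Tree repeatedly: the simulated heights scatter widely around the expectation, and on average roughly half of the height is spent waiting for the last two lineages to coalesce.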

Now change Growth Factor to 100 and click Draw Tree. What does this value of Growth Factor mean biologically? Why does the genealogy have long external branches with few coalescence events near the present? How do the mismatch and haplotype frequency distributions change?

Note that the contrasting features of mismatch distributions and the haplotype frequency spectrum in populations that are constant or changing in size through time are generally more apparent when θ is larger (more mutations occur) and there is a larger sample of lineages in the genealogy. The graph of theta by tau at the upper right shows the probability of mutation back through time as the genealogy approaches the MRCA. Larger values of tau represent deeper times in the past while larger values of theta represent a high probability of mutation. Simulate a genealogy with Growth Factor equal to 10 and then interpret the distribution of theta by tau.

strong directional selection, there is expected to be an excess of high-frequency haplotypes and very few rare haplotypes because most of the branch length lies in the internal branches of the genealogy.

Mismatch and haplotype frequency distributions can help identify instances of expansion or contraction in the effective population size if sequences are neutral. Alternatively, if sequences are known to come from a population with a constant effective size, then these distributions can be used to identify the action of natural selection. Several tests are available that use the haplotype frequency or mismatch distributions to evaluate the null hypothesis of constant effective population size through time using DNA sequences (Fu & Li 1993; Fu 1996, 1997; Schneider & Excoffier 1999; Mousset et al. 2004; Innan et al. 2005). It is important to note several limitations of these tests. First, recombination has the potential to impact the mismatch distribution along with population demography. Recombination events assemble novel sequence haplotypes from existing haplotypes and in doing so break up mutations that are associated due to identity by descent. Therefore, recombination obscures the history of mutations and in the extreme would lead to a uniform mismatch distribution. Second, coalescence is a stochastic process and there is an inherently large variance in times to coalescence (see Chapter 3). This leads to a large variance in the shape of mismatch distributions even when Ne is constant. Therefore, tests that utilize the mismatch distribution can only be expected to detect very large and sustained shrinkage or expansion of Ne.
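The two summaries that these tests rely on can be computed directly from a set of aligned sequences. The sketch below is a minimal illustration with hypothetical function names and made-up toy sequences: the mismatch distribution counts how many pairs of sequences differ at 0, 1, 2, ... sites, and the haplotype frequencies are the proportions of each distinct sequence in the sample. It assumes equal-length sequences with no missing data.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Map each number of pairwise differences to the number of pairs showing it."""
    counts = Counter()
    for a, b in combinations(sequences, 2):
        differences = sum(x != y for x, y in zip(a, b))
        counts[differences] += 1
    return dict(sorted(counts.items()))


def haplotype_frequencies(sequences):
    """Map each distinct haplotype to its frequency in the sample."""
    counts = Counter(sequences)
    n = len(sequences)
    return {haplotype: c / n for haplotype, c in counts.items()}


# Toy sample of four aligned sequences (invented for illustration only).
sample = ["AAAA", "AAAT", "AATT", "AAAA"]

print(mismatch_distribution(sample))   # {0: 1, 1: 3, 2: 2}
print(haplotype_frequencies(sample))   # {'AAAA': 0.5, 'AAAT': 0.25, 'AATT': 0.25}
```

With n sequences there are n(n-1)/2 pairs, so the counts in the mismatch distribution are not independent observations, which is one reason the shape of the distribution varies so much between replicate genealogies.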

## 8.6 Molecular evolution of loci that are not independent

• Gametic disequilibrium between neutral and selected sites influences polymorphism.

• Genetic hitch-hiking, selective sweeps, and background selection.

• Gametic disequilibrium and rates of divergence.

Up to this point in the chapter, loci or nucleotide sites have been considered completely independent entities that are not influenced by evolutionary processes at neighboring loci or nucleotide sites. This is equivalent to assuming that all alleles at all loci are in complete gametic equilibrium. Processes such as physical linkage, covered in Chapter 2, may cause gametic disequilibrium among the alleles in any population. The impact of gametic disequilibrium on new mutations is of particular interest in molecular evolution because the fate of new mutations dictates levels of polymorphism and rates of divergence. Because new mutations enter a population as a single copy, they must initially experience very high levels of gametic disequilibrium. A new mutation that is present in only one copy will be uniquely associated with the other alleles that just by chance occur on the same chromosome where the mutation occurred. The gametic disequilibrium experienced by new mutations has substantial consequences for neighboring sites in the genome if the new mutation is acted on by natural selection. First, let's explore changes in polymorphism caused by gametic disequilibrium between neutral nucleotide sites and nucleotide sites where mutations are acted on by natural selection. At the end of this section we will consider the implications for rates of divergence.

To see the consequences of gametic disequilibrium for new mutations, consider what happens when a favorable mutation arises in a population. Assume for now that the population is composed of haploid individuals that reproduce clonally so that there is no recombination. Figure 8.22 illustrates the changes to allele frequencies over time when a favorable mutation enters a population. Initially, the population contains five different haploid sequences. Each of these chromosomes bears a number of neutral mutations and each chromosome also has an intermediate frequency in the population that is the product of genetic drift. An advantageous mutation, indicated by a star on the figure, occurs by chance on one of the chromosomes. Over time, the chromosome bearing the favorable mutation will increase in frequency since it has a higher fitness and the other chromosomes will decrease in frequency. Eventually, depending on the relative fitness of the mutation, the chromosome bearing the advantageous mutation will approach fixation in the population. Because there is no recombination in this example, the advantageous mutation is only found on one type of chromosome. Thus, the advantageous mutation is

