[Figure 5.8, panel (a): Sequence 3 ...CATGGATCTT.]

[Figure 5.8, panel (b):]

Sequence 1 CATgGATCTg
Sequence 2 gATGGATaTa
Sequence 3 CATGGAcCTT

Figure 5.8 Patterns of mutational change in DNA sequences under the infinite sites (a) and finite sites (b) models. Base-pair states created by a mutation are in blue lower-case letters. In the infinite sites model sequences that are identical in state at the same site are identical by descent because mutations only occur once at each site. In contrast, the finite sites model shows how multiple mutations at the same site act to obscure the history of identity based on comparisons of site differences among DNA sequences. The ellipses (...) that surround the sequences in (a) indicate that each sequence has infinitely many sites of which only 10 are displayed.

of sequence 1 mutates from G to C, no more mutations can take place at that site. The sites where mutations took place can therefore all be distinguished in alignment of the sequences since each site only experiences a mutation once. Although other processes such as genetic drift and natural selection may influence the frequency of the sequences, we can conclude that sequences sharing the same base at a site are identical by descent.

Although no DNA sequence is infinite, the infinite sites model is a reasonable approximation if not too much time has passed since sequences shared a common ancestor. If mutation occurs randomly and with equal probability at each site, then any single site has only a small chance of experiencing a mutation twice (i.e. the square of the per-site mutation rate is very small). Over a relatively short period of time, say thousands of generations, only a few mutations are likely to occur, and so it is unlikely that any one site mutates more than once.
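To see why the approximation works, the following minimal sketch computes the chance that a handful of mutations all land on different sites. It assumes each mutation falls on a site chosen uniformly at random and independently of the others; the function name `prob_no_repeat_hits` and the example sequence lengths are hypothetical, chosen only for illustration.

```python
def prob_no_repeat_hits(num_sites: int, num_mutations: int) -> float:
    """Probability that every mutation lands on a different site.

    Assumes each mutation hits one of num_sites sites uniformly at random,
    independently of all other mutations (an illustrative assumption).
    """
    if num_mutations > num_sites:
        return 0.0  # more mutations than sites forces at least one repeat
    prob = 1.0
    for already_hit in range(num_mutations):
        # The next mutation must avoid the sites that were already hit.
        prob *= (num_sites - already_hit) / num_sites
    return prob


if __name__ == "__main__":
    # Ten mutations scattered over sequences of increasing length.
    for sites in (100, 10_000, 1_000_000):
        print(sites, prob_no_repeat_hits(sites, 10))
```

As the number of sites grows relative to the number of mutations, the probability approaches one, which is the sense in which a long sequence observed over a short time behaves like an infinite sites sequence.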

However, actual DNA sequences are finite and the time period for mutations to occur can be very long, so a mutation model taking these facts into account is useful. The finite sites model is used for DNA sequences of a finite length. It is similar to the infinite sites model except that now the number of sites is finite and each site can experience a mutation more than once. Multiple mutations have the potential to obscure past mutational events, as shown in Fig. 5.8b. For example, two sequences are either identical or different at each site, even though a site where they differ may have mutated more than once in the past. The fourth site in sequence 1 shows how a site can also mutate and yet appear unchanged. Although there have been two mutations at that site, the second mutation restored the nucleotide that was originally in that position. As a result, the fourth site is identical in the alignment of all three sequences, and it is not possible to detect the two mutation events that occurred in sequence 1. Consider the similar example of site 7 in sequence 3 and what happens when we compare pairs of sequences. Sequences 2 and 3 differ at four sites (1, 7, 8, and 10), but five mutational events actually separate them. Sequences 1 and 3 differ in state at only two sites (7 and 10), because the two mutations at site 4 leave no visible difference, yet five mutation events separate them. Thus, multiple mutational changes at the same site work to obscure the complete history of mutational events that distinguish DNA sequences.
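The pairwise comparison described above can be sketched in a few lines. The sequences are those printed for Fig. 5.8b; because lower-case letters only mark bases created by mutation, the comparison ignores case and looks at state alone. The names `FIG_5_8B` and `differing_sites` are hypothetical, introduced only for this illustration.

```python
# Sequences as printed in Fig. 5.8b; lower case marks bases created by mutation.
FIG_5_8B = {
    "Sequence 1": "CATgGATCTg",
    "Sequence 2": "gATGGATaTa",
    "Sequence 3": "CATGGAcCTT",
}


def differing_sites(seq_a: str, seq_b: str) -> list[int]:
    """Return the 1-based positions where two aligned sequences differ in state."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())  # case is only a visual marker
    return [site for site, (a, b) in enumerate(pairs, start=1) if a != b]


if __name__ == "__main__":
    sites = differing_sites(FIG_5_8B["Sequence 2"], FIG_5_8B["Sequence 3"])
    print(sites)       # [1, 7, 8, 10]
    print(len(sites))  # 4 observed differences, fewer than the 5 mutation events
```

The count of differing sites is all that an alignment reveals, which is why it underestimates the number of mutation events whenever a site has been hit more than once.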

Infinite sites model A model for the process of mutation acting on infinitely long DNA sequences where each mutation occurs at a different position along the DNA sequence and the same position cannot experience a mutation more than once.

Finite sites model A model for the process of mutation acting on DNA sequences of finite length so that the same site may experience a mutation more than once.

The possibility of multiple mutational changes at the same site, often called multiple hits, leads to saturation of mutational changes over time as mutations occur more times at the same sites. Saturation can be "corrected" using nucleotide substitution models that estimate and adjust for multiple mutations at the same site to estimate the "true" number of events that separate two sequences. One such correction, called the Jukes-Cantor model, is covered in Chapter 8.
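As a preview of how such a correction behaves, here is a minimal sketch of the Jukes-Cantor distance in the form it is commonly stated, d = -(3/4) ln(1 - (4/3)p), where p is the observed proportion of sites that differ. The notation and presentation in Chapter 8 may differ, and the function name `jukes_cantor_distance` is hypothetical.

```python
import math


def jukes_cantor_distance(p_observed: float) -> float:
    """Estimated substitutions per site from the observed proportion of differing sites.

    Uses the commonly stated Jukes-Cantor form d = -(3/4) * ln(1 - (4/3) * p).
    The estimate is undefined once p reaches 0.75, the saturation level
    expected for unrelated sequences with equal base frequencies.
    """
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("p_observed must be in the interval [0, 0.75)")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p_observed)


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.4, 0.7):
        print(p, jukes_cantor_distance(p))
```

For small p the corrected distance is nearly equal to p, since multiple hits are rare. As p approaches 0.75 the correction grows without bound, reflecting how little information about the true number of events survives in saturated sequences.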

One way to understand the impacts of multiple hits is to imagine a situation similar to the beakers containing micro-centrifuge tubes in Chapter 3. Now the beakers contain a very large number of nucleotides (A, C, T, and G) at equal frequencies. Imagine composing two DNA sequences by drawing nucleotides from the beaker. The chance that a given nucleotide is sampled at random is 25%. Given one random DNA sequence, there is therefore a 25% chance that another random DNA sequence has the identical base at the same site. It follows that DNA sequences that have experienced many mutations at the same sites are expected to be identical at 25% of their sites by chance alone. When there is the possibility of multiple hits, identity in state is therefore not a perfect indicator of identity by descent.
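The beaker thought experiment is easy to mimic by simulation. The sketch below draws two sequences independently with the four nucleotides at equal frequencies and reports the proportion of sites that match. The sequence length, the random seed, and the function names are arbitrary choices made for illustration.

```python
import random

NUCLEOTIDES = "ACGT"


def random_sequence(length: int, rng: random.Random) -> str:
    """Draw a sequence with each nucleotide equally likely at every site."""
    return "".join(rng.choice(NUCLEOTIDES) for _ in range(length))


def proportion_identical(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences share the same base."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


rng = random.Random(42)  # fixed seed so the run is repeatable
seq_a = random_sequence(100_000, rng)
seq_b = random_sequence(100_000, rng)
identity = proportion_identical(seq_a, seq_b)

if __name__ == "__main__":
    print(identity)  # expected to be close to 0.25
```

The two sequences share no history at all, yet about a quarter of their sites match. That baseline is the level of identity in state toward which saturated sequences drift.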

5.4 The influence of mutation on allele frequency and autozygosity

• Irreversible and bi-directional mutation models.

• The parallels between the processes of mutation and gene flow.

• Expected autozygosity at equilibrium under mutation and genetic drift.

• Expected heterozygosity and the biological interpretation of θ.

In developing expectations for allele and genotype frequencies to this point, all processes served only to shape existing genetic variation. To understand the consequences of mutation requires models that predict allele and genotype frequencies under the constant input of genetic variation by mutation. This section presents three models for the process of mutation. The first two models are related and ask how recurrent mutation is expected to change allele frequencies over time in a population. The third model predicts genotype frequencies when genetic drift and mutation are both operating, showing how the combination of these processes influences autozygosity in a population.

Let's develop two simple models to predict the impact of constant mutation on allele frequencies (sometimes called mutation pressure) in a single panmictic population that is very large. Both models will focus only on the process of mutation and leave out other processes such as genetic drift or natural selection. Consider one locus with two alleles, A and a, where the frequency of A is represented by p and the frequency of a is represented by q. For the first model, assume that mutation operates to change A alleles into a alleles but that a alleles cannot mutate into A alleles. This is called the irreversible mutation model. The chance that mutation changes the state of each A allele every generation is symbolized by μ (pronounced "mu"). The frequency of the A allele after one generation of mutation is then

$$p_{t+1} = p_t(1 - \mu) \qquad (5.19)$$

where the $(1 - \mu)$ term represents the proportion of A alleles that do not mutate to a alleles at time t. As long as μ is not zero, the frequency of A alleles will decline over time because $1 - \mu$ is less than one. This also must mean that the frequency of the a allele increases by $\mu p_t$ each generation. If the mutation rate is constant over time, then the allele frequency after an arbitrary number of generations is

$$p_t = p_0(1 - \mu)^t$$

where $p_0$ is the initial allele frequency and t is the number of generations that have elapsed.
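The one-generation recursion in equation 5.19 and its multi-generation result can be checked against each other numerically. The sketch below applies equation 5.19 repeatedly and compares the outcome with the closed form p_0(1 - μ)^t obtained by applying the recursion t times. The function names, the starting frequency of 0.9, and the 1000-generation horizon are hypothetical choices for illustration.

```python
def next_frequency(p: float, mu: float) -> float:
    """One generation of irreversible mutation: p_{t+1} = p_t * (1 - mu)."""
    return p * (1.0 - mu)


def frequency_after(p0: float, mu: float, generations: int) -> float:
    """Closed form after t generations: p_t = p_0 * (1 - mu) ** t."""
    return p0 * (1.0 - mu) ** generations


def iterate_frequency(p0: float, mu: float, generations: int) -> float:
    """Apply the one-generation recursion repeatedly, one generation at a time."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, mu)
    return p


if __name__ == "__main__":
    p0, mu, t = 0.9, 1e-5, 1000
    print(iterate_frequency(p0, mu, t))  # generation-by-generation result
    print(frequency_after(p0, mu, t))    # closed-form result, equal up to rounding
```

Both routes give the same answer up to floating-point rounding. The small change after 1000 generations also hints at how slowly mutation alone alters allele frequencies.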

With irreversible mutation, eventually all A alleles will be transformed into a alleles by mutation since there is no process that replaces A alleles in the population. Figure 5.9 shows the expected frequencies of the A allele over time starting at five different initial allele frequencies when the mutation rate is $\mu = 1 \times 10^{-5}$ or 0.00001. Notice that the time scale to reduce the frequency of the A allele is very long. In this example, the equilibrium allele frequency of p = 0 has not been reached even after 100,000 generations. In fact, it takes about 69,310 generations to halve the frequency of A with this mutation rate (the halving time is determined by setting $(1 - \mu)^t = \tfrac{1}{2}$ and solving for t).
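The halving time can be computed directly by taking logarithms of the condition that (1 - μ) raised to the power t equals one half, which gives t = ln(1/2) / ln(1 - μ). The sketch below is a minimal illustration; the function name `halving_time` is hypothetical.

```python
import math


def halving_time(mu: float) -> float:
    """Generations needed for irreversible mutation to halve an allele frequency.

    Solves (1 - mu) ** t = 1/2 for t, giving t = ln(1/2) / ln(1 - mu).
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must be strictly between 0 and 1")
    return math.log(0.5) / math.log(1.0 - mu)


if __name__ == "__main__":
    print(halving_time(1e-5))  # roughly 69,300 generations
```

With a mutation rate of 0.00001, the result agrees with the value of about 69,310 generations quoted in the text to four significant figures. Because the halving time is close to ln(2)/μ when μ is small, a tenfold lower mutation rate makes the wait roughly ten times longer.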

Irreversible mutation For a locus with two alleles, a process of mutation that changes A alleles into a alleles but does not change a alleles to A alleles.

Mutation pressure The constant occurrence of mutations that add or alter allelic states in a population.

Reversible or bi-directional mutation For a locus with two alleles, a process of mutation that changes A alleles into a alleles and also changes a alleles to A alleles.
