
Under neutrality and the infinite sites model, DNA sequence polymorphism is expected to be $\theta = 4N_e\mu$ and divergence is expected to be $k = 2T\mu$. The test and reference loci may have different mutation rates, $\mu_T$ and $\mu_R$ (the subscripts T and R denote the test and reference loci). But note that the effective population size is the same for the two loci when polymorphism is estimated, since the loci are sampled from the same species. The divergence times are also equal for the two loci since they are estimated from the same species pair. The ratio of the two polymorphism estimates at the test and reference loci is expected to equal the ratio of the test locus mutation rate over the reference locus mutation rate, $\mu_T/\mu_R$, since the factor of $4N_e$ cancels out. The ratio of the two divergence estimates at the test and reference loci is also expected to equal $\mu_T/\mu_R$, since the factor of $2T$ cancels out. Therefore, under neutrality the ratio of polymorphism estimates at the two loci and the ratio of the divergence estimates at the two loci should be equal, since they both represent ratios of the mutation rates at the two loci. Similarly, the ratios of polymorphism over divergence for each locus are expected to be equal under neutrality. The ratios can be tested for equality using a Chi-square test.
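The neutral expectations described above can be checked numerically. The following is a minimal sketch in Python, not part of the HKA test itself: the function name `neutral_expectations` and every parameter value are hypothetical and chosen only for illustration. It shows that when the two loci share the same effective population size and divergence time, the polymorphism ratio and the divergence ratio both reduce to the ratio of mutation rates.

```python
import math


def neutral_expectations(n_e, t, mu):
    """Return expected polymorphism and divergence under neutrality.

    theta = 4 * Ne * mu  (polymorphism within a species)
    k     = 2 * T * mu   (divergence between two species)
    """
    theta = 4 * n_e * mu
    k = 2 * t * mu
    return theta, k


# Hypothetical values, for illustration only.
N_E = 1_000_000   # effective population size, shared by both loci
T = 5_000_000     # divergence time in generations, shared by both loci
MU_TEST = 2e-9    # mutation rate at the test locus
MU_REF = 1e-9     # mutation rate at the reference locus

theta_test, k_test = neutral_expectations(N_E, T, MU_TEST)
theta_ref, k_ref = neutral_expectations(N_E, T, MU_REF)

# 4*Ne cancels in the polymorphism ratio; 2*T cancels in the divergence ratio.
polymorphism_ratio = theta_test / theta_ref
divergence_ratio = k_test / k_ref

print("polymorphism ratio (test/reference):", polymorphism_ratio)
print("divergence ratio (test/reference):  ", divergence_ratio)
print("ratios agree:", math.isclose(polymorphism_ratio, divergence_ratio))
```

Changing `N_E` or `T` changes the absolute values of polymorphism and divergence but leaves both ratios untouched, which is why the comparison does not require knowing either quantity.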

Table 8.5b shows an idealized illustration of polymorphism and divergence estimates that would be consistent with the neutral null model of DNA sequence evolution. In this idealization, the two loci do have different mutation rates that lead to different amounts of polymorphism and divergence. However, the ratios are equal as expected if the fate of mutations is due only to genetic drift.

Table 8.5c shows a classic example of divergence and polymorphism estimated in fruit flies to carry out the HKA test (Hudson et al. 1987). The locus tested for neutral evolution is the gene for alcohol dehydrogenase (Adh) and the reference locus is sequence upstream (5') of the coding region that does not possess an open reading frame. Polymorphism was estimated for the two loci from a sample of Drosophila melanogaster individuals and divergence for the two loci was determined with sequences from Drosophila sechellia. If the 5' flanking region is truly neutral, then the Adh data show too much polymorphism within D. melanogaster. An excess of Adh polymorphism is also indicated by the large ratio of polymorphism for the two loci within D. melanogaster compared with the ratio of divergences for the two loci. It is now widely accepted that the Adh locus in D. melanogaster exhibits an excess of polymorphism consistent with balancing selection.

Although the HKA test is ingenious, it does have some limitations and assumptions. One difficulty in practice is the ability to identify an unambiguously neutral reference locus. For example, the 5' flanking region used by Hudson et al. (1987) as a neutral reference locus very likely contains promoter sequences that are functionally constrained by natural selection. Innan (2006) described a modification to the HKA test to use the average of multiple reference loci that should help avoid misleading results caused by reference loci that do not fit test assumptions.

Implicit in the HKA test is the assumption that each of the two species used is panmictic. Population subdivision has the potential to alter levels and patterns of nucleotide polymorphism and divergence (see review by Charlesworth et al. 2003) depending on how individuals are sampled. Consider levels of polymorphism in a subdivided species where FST is greater than zero. Population subdivision causes lower polymorphism for individuals sampled within subpopulations due both to reduced effective population size that increases drift within demes and to increased autozygosity resulting from a higher probability of mating within demes. In contrast, there will be larger genetic differences for individuals sampled from two different demes due to differentiation among demes, which would result in high perceived levels of polymorphism. If the HKA test is carried out for a species with population structure, sampling needs to avoid taking sequences from only one or a few demes, which could lead to an erroneous conclusion of too little polymorphism compared to the neutral expectation. Ingvarsson (2004) showed how the HKA test can lead to incorrect rejection of the neutral null hypothesis when there is population subdivision. He also gave an example of a population-structure-corrected HKA test applied to organelle DNA sequence data from the plant species Silene vulgaris and Silene latifolia, which both exhibit strong population structure.

MK test

The MK test is a test of the neutral model of DNA sequence divergence between two species (McDonald & Kreitman 1991). Like the HKA test, it is named for its authors, McDonald and Kreitman. The MK test is also conceptually similar to the HKA test because it too establishes expected ratios of two classes of DNA changes at a single locus under neutrality. The MK test requires DNA sequence data from a single coding gene. The sample of DNA sequences is taken from multiple individuals of a focal species to estimate polymorphism. The test also requires a DNA sequence at the same locus from another species to estimate divergence.

The neutral expectations for the MK test are given in Table 8.6. The two classes of DNA change used in the MK test are synonymous and nonsynonymous (or replacement) changes. Nonsynonymous mutations within coding regions may alter the amino acid specified by a codon. Due to the redundancy of the genetic code, some mutations within coding regions will not change the amino acid specified by the codon and are therefore synonymous changes.

If genetic drift is the only process influencing the fate of a new mutation, levels of polymorphism and divergence within each category of DNA change should be correlated because they are both determined in part by the mutation rate. Fixed differences between species are caused by mutations that have gone to fixation, with expected divergence under neutral theory of $2T\mu$. Nucleotide sites that have two or more nucleotides within the focal species exhibit polymorphism, with an expected level of $4N_e\mu$ under neutral theory. Since synonymous and nonsynonymous mutations may occur at different rates, we can assign each category of DNA change a different rate ($\mu_N$ and $\mu_S$). Both the ratio of nonsynonymous and synonymous fixed differences and the ratio of nonsynonymous and synonymous polymorphic sites are expected to be equal to $\mu_N/\mu_S$ under neutral theory. The MK test therefore compares these two ratios for equality as a test of neutral theory. The neutral case illustration in Table 8.6b gives an example where $\mu_N < \mu_S$ and where there is a higher level of polymorphism than divergence. Nonetheless, the ratios of the number of nonsynonymous over synonymous changes are constant for fixed differences and polymorphic sites as expected if both classes of mutations are neutral.
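The comparison of the two nonsynonymous/synonymous ratios can be framed as a test of independence on a 2 x 2 table of counts (fixed vs. polymorphic, nonsynonymous vs. synonymous). The sketch below is one possible way to do this, assuming a Pearson chi-square test with one degree of freedom and no continuity correction; the function name `mk_chi_square` and all counts are hypothetical and are not taken from Table 8.6. Other tests of independence could be substituted, and small counts would usually call for an exact test instead.

```python
import math


def mk_chi_square(fixed_n, fixed_s, poly_n, poly_s):
    """Pearson chi-square test of independence on a 2 x 2 MK table.

    Rows are fixed differences and polymorphic sites; columns are
    nonsynonymous and synonymous changes. Returns (statistic, p_value)
    for 1 degree of freedom. Assumes every row and column total is
    greater than zero so that all expected counts are positive.
    """
    table = [[fixed_n, fixed_s], [poly_n, poly_s]]
    row_totals = [sum(row) for row in table]
    col_totals = [fixed_n + poly_n, fixed_s + poly_s]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts where both ratios equal 1/3, as expected under neutrality.
neutral_stat, neutral_p = mk_chi_square(fixed_n=4, fixed_s=12, poly_n=10, poly_s=30)

# Hypothetical counts where the fixed-difference ratio (12/4) departs from
# the polymorphism ratio (10/30).
skewed_stat, skewed_p = mk_chi_square(fixed_n=12, fixed_s=4, poly_n=10, poly_s=30)

print("equal ratios:   chi-square =", neutral_stat, "p =", neutral_p)
print("unequal ratios: chi-square =", round(skewed_stat, 2), "p =", round(skewed_p, 4))
```

When the two ratios match, the observed counts equal the expected counts and the statistic is zero, so the neutral null hypothesis is not rejected. A large statistic indicates only that the ratios differ; which class of change is in excess has to be read from the counts themselves.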

An MK test based on numbers of synonymous and nonsynonymous changes at the Adh locus that were fixed between Drosophila species or polymorphic within D. melanogaster (McDonald & Kreitman 1991) is given in Table 8.6c. Using fixed sequence differences as a reference point, fewer substitutions between

Table 8.6 Estimates of polymorphism and divergence (fixed sites) for nonsynonymous and synonymous sites at a coding locus form the basis of the MK test. (a) Under neutrality, the number of nonsynonymous sites divided by the number of synonymous sites is equal to the ratio of the nonsynonymous and synonymous mutation rates. This ratio should be constant both for nucleotide sites with fixed differences between species and polymorphic sites within the species of interest. (b) An illustration of ideal nonsynonymous and synonymous site changes that would be consistent with the neutral null model. (c) Data for the Adh locus in D. melanogaster (McDonald & Kreitman 1991) show an excess of Adh nonsynonymous polymorphism compared with that expected based on divergence. (d) Data for the Hla-B locus for humans show an excess of polymorphism and more nonsynonymous than synonymous changes, consistent with balancing selection (Garrigan & Hedrick 2003).
