Figure 8.10 The hierarchy of nucleotide-substitution models that can be used to correct the apparent divergence between DNA sequences and so better estimate the actual number of substitutions that have occurred. The Jukes-Cantor (JC) model is the simplest: it assumes a single rate of substitution that applies to all nucleotide changes and is constant among nucleotide sites. Other nucleotide-substitution models include an increasing number of parameters to represent more features of DNA sequence evolution, in particular rates of substitution that vary among categories of nucleotides. If nucleotide-substitution rates vary among sites, this variation can be modeled by a gamma distribution, indicated by the Greek letter Γ. Nucleotide-substitution models: JC, Jukes-Cantor (Jukes & Cantor 1969); F81, Felsenstein 81 (Felsenstein 1981); K80, Kimura 80 (Kimura 1980); HKY, Hasegawa-Kishino-Yano (Hasegawa et al. 1985); SYM, symmetrical model (Zharkikh 1994); GTR, general time reversible (Rodriguez et al. 1990). Figure after Posada and Crandall (1998).

DNA polymorphism

Variable DNA sequences at one locus within a species represent different alleles that are present in the population. Since DNA sequences are composed of many nucleotide sites, defining alleles is somewhat more complex than if alleles are discrete (i.e. A or a). Imagine obtaining a sample of n individuals from a population and determining the DNA sequence of L nucleotides for one gene or genomic region for each individual (see Tajima 1993b). For simplicity consider each individual as haploid or homozygous. The first step would be to construct a multiple sequence alignment so that the homologous nucleotide sites for each sequence are all lined up in the same columns (Fig. 8.11). With such a multiple sequence alignment there are two commonly used measures that characterize the pattern of DNA polymorphism in a sample of DNA sequences from a single species.

One measure of DNA polymorphism is the number of segregating sites, S. A segregating site is any of the L nucleotide sites at which two or more nucleotides are present within the population, such as sites 2, 6, and 8 in Fig. 8.11. The total number of segregating sites is S, which can also be expressed as the number of segregating sites per nucleotide site, p_S, by dividing the number of segregating sites by the total number of sites:

$$p_S = \frac{S}{L}$$
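
As a rough illustration, the count of segregating sites and its per-site value (S divided by L) can be sketched in Python. The function names and the four-sequence alignment below are hypothetical and are not the data of Fig. 8.11. The sketch assumes haploid sequences that are already aligned to equal length, with no gaps or ambiguous bases.

```python
def segregating_sites(alignment):
    """Return S, the number of alignment columns with two or more distinct nucleotides."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    # zip(*alignment) walks the alignment column by column (site by site).
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def segregating_sites_per_site(alignment):
    """Return S divided by L, the number of segregating sites per nucleotide site."""
    return segregating_sites(alignment) / len(alignment[0])


# Hypothetical sample: n = 4 sequences, L = 10 aligned sites.
sample = [
    "ACGTACGTAC",
    "ATGTACGTAC",
    "ACGTAGGTAC",
    "ACGTACGAAC",
]

S = segregating_sites(sample)
p_S = segregating_sites_per_site(sample)
print(S, p_S)
```

In this made-up sample, three of the ten columns contain more than one nucleotide, so S is 3 and the per-site value is 3/10. Real alignments usually contain gaps or missing data, which would need an explicit rule (for example, skipping such columns) before counting.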
