Interact box Estimating π and S from DNA sequence data

The number of segregating sites (S) and the nucleotide diversity (π) can be estimated in PopGene.S2. First, you will need to download a file of DNA sequence data from the GenBank web page onto a computer that also has a copy of PopGene.S2. The text web page gives step-by-step instructions to obtain DNA sequences for the mitochondrial cytochrome b gene in a sample of 30 African sable antelope (Pitra et al. 2002).

Once you have downloaded the DNA sequence data file from GenBank, open PopGene.S2 and select Molecular Population Genetics in the main menu. In the dialog window that appears, click on the Open File button and then use the file dialog to find and open the downloaded data file. The sequences will be displayed in the top window. Verify that PopGene.S2 found 30 sequences in the file and that the longest sequence was 343 base pairs. The first step is to create a multiple sequence alignment for the 30 DNA sequences by pressing the Align Sequences button (pressing the button will cause a window to open temporarily). Once the sequences are aligned, the number of segregating sites, the number of gaps, and the nucleotide diversity are estimated. Click on the Pairwise Nucleotide Diversity button to see the nucleotide diversity for all possible pairs of DNA sequences. The Nucleotide site distribution button gives a list of the frequencies of each base pair found at each of the sites in the multiple sequence alignment. Use the Save Aligned Sequences button to save a file with the multiple sequence alignment. Then open this file in a text editor such as Notepad to view the aligned sequences.
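To make the two summary statistics concrete, the following is a minimal Python sketch of how S and nucleotide diversity could be computed from sequences that are already aligned. It is not the PopGene.S2 implementation. The function names and the four short sequences are hypothetical, and the handling of gaps (gap characters are ignored, and nucleotide diversity is divided by the full alignment length) is an assumption made for illustration.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns that contain more than one nucleotide."""
    count = 0
    for column in zip(*alignment):
        bases = set(column) - {"-"}  # assumption: gap characters are ignored
        if len(bases) > 1:
            count += 1
    return count


def pairwise_differences(seq_a, seq_b):
    """Count sites where two aligned sequences differ, skipping gaps."""
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a != "-" and b != "-"
    )


def nucleotide_diversity(alignment):
    """Average number of pairwise differences per site over all pairs."""
    n_sites = len(alignment[0])  # assumption: divide by full alignment length
    pairs = list(combinations(alignment, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs) / n_sites


# Hypothetical toy alignment, not the sable antelope data.
toy_alignment = ["ATGCA", "ATGCT", "ACGCA", "ATGCA"]

print("S  =", segregating_sites(toy_alignment))
print("pi =", nucleotide_diversity(toy_alignment))
```

In the toy alignment, two columns are variable and the six pairwise comparisons differ at an average of one site out of five, so the sketch illustrates that S counts variable sites while nucleotide diversity also depends on how often each variant appears in the sample.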

rate. A corollary of this prediction is that the expected number of generations between substitutions is the reciprocal of the mutation rate. For example, if the mutation rate is 1 × 10⁻⁵ base pairs replicated in error per generation, then we expect to wait an average of 10⁵ generations to see one mutation in a single gene copy. Thus, the neutral theory provides a null model for the rate of divergence of homologous genes or genome regions between isolated populations or species called the molecular clock hypothesis. This section will first present data that demonstrate the molecular clock and then show how the molecular clock hypothesis can be used to date evolutionary events based on DNA divergence. The section will conclude by showing why divergence may appear to decrease over time as many substitutions accumulate and how divergence estimates can be corrected using models of the mutation process.
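The reciprocal relationship between the mutation rate and the expected waiting time can be checked numerically. The sketch below is an illustration only: it assumes that each generation is an independent trial in which a change occurs with probability equal to the mutation rate, so that waiting times follow a geometric distribution. The function names, the number of simulated draws, and the random seed are hypothetical choices.

```python
import math
import random


def expected_wait(mu):
    """Expected number of generations between changes: the reciprocal of mu."""
    return 1.0 / mu


def simulate_mean_wait(mu, n_draws, seed=1):
    """Average of simulated geometric waiting times with success probability mu."""
    rng = random.Random(seed)
    log_no_change = math.log1p(-mu)  # log of the per-generation chance of no change
    total = 0
    for _ in range(n_draws):
        u = rng.random()
        # Inverse-transform draw: generations up to and including the first change.
        total += int(math.log(1.0 - u) / log_no_change) + 1
    return total / n_draws


mu = 1e-5  # mutation rate from the example in the text
print("expected wait: ", expected_wait(mu))
print("simulated mean:", simulate_mean_wait(mu, n_draws=20000))
```

With a mutation rate of 1 × 10⁻⁵, the simulated mean should fall close to 100,000 generations, although any single waiting time varies widely around that average.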

As the clock metaphor suggests, the molecular clock hypothesis predicts that divergence accumulates with uniform regularity over time, just like the ticking of a clock. This means that the divergence between two species should increase as the time since they shared a common ancestor recedes further into the past. Such a pattern was originally observed for hemoglobin proteins by Zuckerkandl and Pauling (1962, 1965), who first hypothesized a molecular clock. A classic example of the molecular clock is the increase of divergence with increasing time seen in the NS gene of the human influenza A virus (Fig. 8.12). Buonagurio et al. (1986) used influenza virus isolated from samples originally taken between 1933 and 1986. They then estimated the number of nucleotide substitutions, or the p-distance, between each sequence and the inferred ancestral sequence. The linear increase in divergence with time is the pattern expected by the molecular clock hypothesis.
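The pattern described above, divergence increasing linearly with time, can be sketched in Python. The example takes the p-distance to be the proportion of aligned sites that differ between two sequences and fits a straight line by ordinary least squares. The time points and divergence values are hypothetical numbers invented for illustration; they are not the Buonagurio et al. (1986) data, and the function names are likewise assumptions.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# One difference in ten aligned sites.
print("p-distance:", p_distance("ATGCATGCAT", "ATGCATGCAA"))

# Hypothetical divergence from an ancestral sequence at five sampling times.
years_since_ancestor = [0, 10, 20, 30, 40]
divergence = [0.000, 0.019, 0.041, 0.060, 0.080]

slope, intercept = least_squares_line(years_since_ancestor, divergence)
print(f"estimated rate: {slope:.5f} substitutions per site per year")
print(f"intercept:      {intercept:.5f}")
```

Under the molecular clock hypothesis, the slope of such a line estimates the rate of divergence per unit time, and points that fall close to the line are consistent with a constant rate.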

Another important early advance for the molecular clock came when Richard Dickerson (1971) compared rates of substitution in proteins from cytochrome c, hemoglobin, and fibrinopeptide genes and observed that the average rates of change were very different for the three proteins (Fig. 8.13). Based on knowledge of the function of the proteins at the time, Dickerson argued that the rate of molecular evolution was faster when fewer sites were subject to functional constraints on amino acid changes. That

Molecular clock hypothesis: The neutral theory prediction that divergence should occur at a constant rate over time so that the degree of molecular divergence between species is proportional to their time of separation. Synonymous with rate constancy or rate homogeneity.
