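$$I_k = \frac{n \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{W_k \sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (4.1)$$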

where $k$ represents a distance class (e.g. all populations two distance units apart) so that $w_{ij}$ equals one if the distance between locations $i$ and $j$ equals $k$, and zero otherwise. Within a distance class $k$, $n$ is the number of populations, $y_i$ and $y_j$ are the values of a genetic variable such as allele frequency for locations $i$ and $j$, $\bar{y}$ is the mean allele frequency for all populations, and $W_k$ is the sum of the weights $w_{ij}$, or $2n_k$. The numerator is larger when pairs of populations have similar allele frequencies that both show a large difference from the mean allele frequency.

The formula for Moran's I might be a bit daunting at first, but the results it produces are easy to understand and interpret biologically. Like correlations in general, Moran's I takes on values from -1 to +1 when calculated with a large number of samples. A positive value of I means that allele frequencies between pairs of locations are similar on average, while a negative value means that allele frequencies between pairs of locations tend to differ on average. A value of zero indicates that differences in subpopulation allele frequencies are not related to the distance between locations, or equivalently that genetic variation is randomly distributed in space. The spatial locations of genotypes like those shown in Fig. 4.3 are an ideal situation in which to use Moran's I (see Fig. 4.4).
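As a rough illustration of how the quantities defined for Moran's I fit together, the following sketch computes I for a single distance class. It assumes, purely for simplicity, that populations sit at integer positions along a one-dimensional transect; the function name `morans_i` and the example allele frequencies are hypothetical and are not taken from the text. It also treats n as the total number of populations sampled.

```python
def morans_i(values, positions, k):
    """Sketch of Moran's I for distance class k.

    values    -- allele frequency in each population (y_i)
    positions -- integer location of each population on a line
                 (a simplifying assumption, not from the text)
    k         -- distance class: only pairs exactly k units apart count
    """
    n = len(values)
    mean = sum(values) / n                      # y-bar
    dev = [y - mean for y in values]            # y_i - y-bar

    numerator = 0.0
    w_k = 0                                     # sum of the weights w_ij
    for i in range(n):
        for j in range(n):
            # w_ij = 1 when i and j are exactly k apart, otherwise 0
            if i != j and abs(positions[i] - positions[j]) == k:
                numerator += dev[i] * dev[j]
                w_k += 1                        # ordered pairs, so w_k = 2 * n_k

    denominator = sum(d * d for d in dev)
    if w_k == 0 or denominator == 0:
        raise ValueError("no pairs in this distance class, or no variation")
    return n * numerator / (w_k * denominator)


# Hypothetical allele frequencies that change steadily along a transect.
freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
sites = [0, 1, 2, 3, 4, 5]
for k in range(1, 6):
    print(k, round(morans_i(freqs, sites, k), 3))
```

With these made-up frequencies, neighboring populations give a positive I, and the value declines and becomes negative as the distance class grows. The largest distance class rests on a single pair of populations, so its value falls outside the -1 to +1 range, which illustrates why that range is expected only with a large number of samples.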


n alleles as is required for genetic adaptation to local environments under natural selection, for example.

It is worth noting that there are some important biological distinctions between gene flow and migration. Migration is simply the movement of individuals from one place to another. As such, migration may or may not result in gene flow. Gene flow requires that migrating individuals successfully contribute alleles to the mating pool of populations they join or visit. Thus, migration alone does not necessarily result in gene flow. Similarly, gene flow can also occur without migration of individual organisms. Plants are a prime example, with gene flow that takes place via movement of pollen grains (male gametes) but individuals themselves cannot migrate except as seeds. Gene flow can also occur without easily detected migration of individuals, such as cases where individuals move briefly to mate and then return to their original geographic locations. To confuse matters, the variable m (for migration rate) is almost universally used to indicate the rate of gene flow in models of population structure. Even though models do not normally make the distinction, it is wise to remember the biological differences between the processes of migration and gene flow in actual populations.

This chapter is devoted to expectations for allele and genotype frequencies in subdivided populations. The next section will cover so-called direct measures of gene flow that can be used in natural populations to determine the extent of population subdivision based on patterns of parentage determined with genetic markers. Then in the third section, we will return to the fixation index (or F) from Chapter 2 and extend it for the case of structured populations in order to serve as a measure of population subdivision. The fourth section will consider how genotype frequencies are impacted by population structure. The fifth section will return to fixation index estimates to show how they can be compared with an idealized population model to arrive at an indirect measure of historical gene flow. The final section of the chapter incorporates population subdivision into coalescent models.

4.2 Direct measures of gene flow

• Genetic-marker-based parentage analysis.

This section of the chapter will introduce and explain the use of molecular genetic markers to identify the unknown parent or parents of a sample of progeny or juveniles and thereby describe the patterns of mating that took place among the parents. Parentage analyses are considered direct measures of gene flow since they reveal and measure the pattern of gamete movement at the scale over which the candidate parents are sampled. Parentage analyses are also commonly used to test hypotheses about what factors influence patterns of mating among individuals. For example, animal parentage studies can test for correlations between mating success and phenotypes or behaviors. Parentage analysis is most often performed in the case where one parent is known and the other parent is unknown and could potentially be any one of a number of individuals or candidate parents. Genetic analyses that attempt to identify unknown fathers or unknown mothers from a population of candidate parents are called paternity analysis or maternity analysis, respectively (see Meagher 1986; Devlin & Ellstrand 1990; Dow & Ashley 1996; reviewed by Jones & Ardren 2003). Although not detailed here, it is also possible to attempt to infer both unknown parents within a population of candidate parents to estimate the minimum number of parents that contributed to a group of progeny (see Jones & Ardren 2003). This section will review some of the basic concepts required to understand the methods and results of parentage analyses by means of an example paternity analysis. One focus in particular will be the distinction between identifying the true parent of an offspring and identifying a candidate parent that appears to be the true parent due to chance.
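To make the known-mother, unknown-father case concrete, here is a minimal sketch of paternity by simple genetic exclusion, one common starting point for parentage analysis. It is offered as an illustration under stated assumptions and is not necessarily the procedure used in the example that follows. It assumes codominant diploid markers with no genotyping error or mutation; the function names, locus names, and all genotypes are hypothetical.

```python
def paternal_alleles(offspring, mother):
    """Alleles at one locus that the offspring could have received from its father.

    Each genotype is a pair of alleles, e.g. (100, 108).
    """
    a, b = offspring
    possible = set()
    if a in mother:          # mother could have supplied a, so father supplied b
        possible.add(b)
    if b in mother:          # mother could have supplied b, so father supplied a
        possible.add(a)
    return possible          # empty set means the mother herself does not match


def is_excluded(candidate, offspring, mother):
    """True if the candidate lacks a possible paternal allele at any locus."""
    for locus in offspring:
        needed = paternal_alleles(offspring[locus], mother[locus])
        if not needed & set(candidate[locus]):
            return True
    return False


# Hypothetical genotypes at two marker loci (allele sizes are made up).
mother = {"locus1": (100, 104), "locus2": (200, 200)}
offspring = {"locus1": (100, 108), "locus2": (200, 206)}
candidates = {
    "father_A": {"locus1": (108, 110), "locus2": (206, 210)},
    "father_B": {"locus1": (108, 108), "locus2": (200, 202)},
    "father_C": {"locus1": (100, 104), "locus2": (206, 206)},
}

not_excluded = [name for name, genotype in candidates.items()
                if not is_excluded(genotype, offspring, mother)]
print(not_excluded)
```

A candidate that survives exclusion is compatible with being the father but is not thereby proven to be the true father. With few loci or common alleles, several candidates may remain non-excluded by chance alone, which is the distinction this section emphasizes.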

To understand the steps carried out in parentage analysis, let's work through an example based on genotype data from the tropical tree Corythophora alta, a member of the Brazil nut family (Fig. 4.5). All

Figure 4.4 Moran's I for simulated populations like those in Fig. 4.3. To estimate Moran's I, the 100 x 100 grid was simulated for 200 generations and was then divided into square subpopulations of 10 x 10 individuals. The frequency of the A allele within each subpopulation is $y$ and the mean allele frequency over all subpopulations is $\bar{y}$ in equation 4.1. The distance classes are the number of subpopulations that separate pairs of subpopulations. As expected, the simulations with strong isolation by distance (3 x 3 mating neighborhood) show correlated allele frequencies in subpopulations that are close together. However, the simulations with panmixia (99 x 99 mating neighborhood) show no such spatial correlation of allele frequency. The fluctuation of I at the largest distance classes in both figures is random variation due to very small numbers of individuals compared. Each line is based on an independent simulation of the 100 x 100 population.
