Interact box Build your own coalescent genealogies

Building a few coalescent trees can help you to understand how the exponential distribution is put into practice to estimate coalescence times, as well as give you a better sense of the random nature of the coalescence process. You can use a Microsoft Excel spreadsheet at the textbook website that has been constructed to calculate the quantities necessary to build a coalescent genealogy. The spreadsheet contains the cumulative exponential distributions for a time interval passing without experiencing a coalescent event (see equation 3.74) for up to six lineages. To determine a coalescence time for a given number of lineages, k, a random number between zero and one is picked and then compared with the distribution when k(k - 1)/2 pairs of lineages can coalesce. The time interval on the distribution that matches the random number is taken as the coalescence time.
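The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet itself: it assumes the cumulative probability of coalescence has the form 1 - exp(-t k(k - 1)/2) with time on the scaled axis used in this section, and it inverts that expression analytically rather than searching a table of time intervals. The function names `pairs`, `cumulative_prob`, and `coalescence_time` are made up for this example.

```python
import math
import random


def pairs(k):
    """Number of distinct pairs among k lineages, k(k - 1)/2."""
    return k * (k - 1) // 2


def cumulative_prob(t, k):
    """Probability that a coalescent event has occurred by time t.

    Assumed form: 1 - exp(-t * k(k - 1)/2), with t in scaled time units.
    """
    return 1.0 - math.exp(-t * pairs(k))


def coalescence_time(u, k):
    """Time at which the cumulative probability for k lineages equals u.

    u is a random number in [0, 1); this is the analytic inverse of
    cumulative_prob, standing in for the spreadsheet's table lookup.
    """
    return -math.log(1.0 - u) / pairs(k)


# Draw one waiting time for each stage, from k = 6 down to k = 2.
rng = random.Random(1)  # fixed seed so the run can be repeated
for k in range(6, 1, -1):
    u = rng.random()
    print(k, round(u, 3), round(coalescence_time(u, k), 4))
```

Because the rate k(k - 1)/2 is largest when k = 6, the same random number gives a much shorter waiting time for six lineages than for two.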

Step 1 Open the spreadsheet and look over all the quantities calculated. Click on cells to view the formulas used, especially the cumulative probability of coalescence for each k. This will help you understand how the equations in this section of the chapter are used in practice. View and compare the cumulative probability distributions graphed for k = 6 and k = 2.

Step 2 Press the recalculate key(s) to generate new sets of random numbers (see Excel help if necessary). Watch the waiting times until coalescence change. What is the average time to coalescence for each value of k? How variable are the coalescence times you observe in the spreadsheet when changing the random numbers with the recalculate key?

Step 3 Draw a coalescence tree using the coalescence times found in the spreadsheet (do not recalculate until step 6 is complete). Along the bottom of a blank sheet of paper, draw six evenly spaced dots to represent six lineages. Starting at the top of the random number table, pick two lineages which will experience the first coalescent event. Label the two left-most dots with these lineage numbers. Then, using a ruler, draw two parallel vertical lines as long as the waiting time to coalescence (e.g. if the time is 0.5, draw lines that are 0.5 cm). Connect these vertical lines with a horizontal line. Assign the lineage number of one of the coalesced lineages to the pair's single ancestor, at the horizontal line. Record the other lineage number on a list of lineages no longer present in the population (skip over these numbers if they appear again in the random number table). There are now k - 1 lineages.

Step 4 Use the random-number table to get the next pair of lineages that coalesce. Remember that the single ancestor of lineage pairs that have already coalesced will eventually coalesce with one of the remaining lineages. If one of the lineages of this random pair matches a previous pair's ancestor, begin at the horizontal line indicating that pair's coalescence, and draw a vertical line toward the top of the paper that is the length of the coalescence time for the number of lineages remaining. Draw a vertical line from lineage n to an equal height and connect the two vertical lines with a horizontal line (the line from lineage n will be as long as the sum of all coalescence times to that point). If neither number matches a previous pair's ancestor, draw the branches as in step 3, beginning at the baseline, but this time adding this pair's particular coalescence time to the sum of previous coalescence times to find the vertical branch height.

Step 5 Repeat the process in step 4 until all lineages have coalesced.

Step 6 Then add together all of the times to coalescence to obtain the total height of the coalescent tree, and sum the lengths of all of the branches to obtain the total branch length of the tree. How do these compare with the average values for a sample of six lineages?

Step 7 Press the recalculate key combination to obtain another set of coalescence times and repeat steps 3 through 5 to create another coalescence tree. Draw several coalescence trees to see how each differs from the others. You should obtain coalescence trees like those in Fig. 3.25. Your trees will differ from these, because the random coalescence times vary around their average, but the overall shape of your trees will be similar.
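The paper-and-ruler procedure in steps 3 through 6 can also be mimicked in code. The sketch below is a hypothetical stand-in for the spreadsheet and random number table, not a reproduction of them: it assumes waiting times are exponential with rate k(k - 1)/2 in scaled time units, picks the coalescing pair uniformly at random, and lets the ancestor keep one of the two lineage numbers as in step 3. The function name `simulate_genealogy` and the layout of the returned events are choices made for this example.

```python
import random


def simulate_genealogy(n=6, seed=None):
    """Simulate one coalescent genealogy for n sampled lineages.

    Returns (events, height, total_length). Each event is a tuple
    (k, wait, (kept, lost), node_height): the number of lineages before
    the event, the waiting time, the pair that coalesced, and the height
    of the horizontal line joining them.
    """
    rng = random.Random(seed)
    active = list(range(1, n + 1))  # lineage numbers still present
    height = 0.0                    # running sum of waiting times
    total_length = 0.0              # running sum of all branch lengths
    events = []
    while len(active) > 1:
        k = len(active)
        rate = k * (k - 1) / 2          # assumed rate for k lineages
        wait = rng.expovariate(rate)    # exponential waiting time
        height += wait
        total_length += k * wait        # all k branches extend by wait
        kept, lost = rng.sample(active, 2)  # random pair that coalesces
        active.remove(lost)             # ancestor keeps the number 'kept'
        events.append((k, wait, (kept, lost), height))
    return events, height, total_length


events, height, total_length = simulate_genealogy(6, seed=42)
for k, wait, pair, node_height in events:
    print(f"k={k}: lineages {pair} coalesce after {wait:.3f}, "
          f"node height {node_height:.3f}")
print(f"tree height {height:.3f}, total branch length {total_length:.3f}")
```

Calling the function with different seeds plays the role of pressing the recalculate key: each call gives a new set of waiting times and a new tree, and the printed node heights are the heights at which the horizontal lines would be drawn.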

This section will conclude by considering several measures of coalescent trees useful to summarize general patterns of the coalescence process. The total time from the present to the point in the past where all k sampled lineages find their MRCA is called the height of a coalescent tree. The height of a tree for k sampled lineages is just the sum of the coalescence waiting times as coalescent events reduce the number of lineages from k to k - 1 to k - 2 down to one. The mean or expected value of the height of a coalescent tree is then

$$E(H_k) = \sum_{i=2}^{k} E(T_i) = \sum_{i=2}^{k} \frac{2}{i(i-1)} = 2\left(1 - \frac{1}{k}\right)$$
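As a small arithmetic check on the expected height, the sum of expected waiting times can be evaluated exactly. The sketch assumes that the expected waiting time while i lineages remain is 2/(i(i - 1)) in scaled time units, which is the mean of an exponential distribution with rate i(i - 1)/2; the function names are illustrative only.

```python
from fractions import Fraction


def expected_wait(i):
    """Assumed mean waiting time while i lineages remain: 2 / (i(i - 1))."""
    return Fraction(2, i * (i - 1))


def expected_height(k):
    """Expected tree height: sum of expected waits from i = 2 up to k."""
    return sum(expected_wait(i) for i in range(2, k + 1))


for k in range(2, 7):
    # The second value is the closed form 2(1 - 1/k) for comparison.
    print(k, expected_height(k), 2 * (1 - Fraction(1, k)))
```

For six lineages the sum is 5/3, and the final wait with two lineages contributes 1 of that, so more than half of the expected height comes from the last coalescent event.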