Tree evaluation

A second type of evaluation assesses the robustness of the tree topology itself. Decay analysis (Bremer 1988) determines the number of additional steps required to collapse each node. To perform a decay analysis, we increase the permitted tree length by successive steps. This shows how many trees exist that are one or more steps longer than the most parsimonious tree (MPT); if there are a number of trees of similar length to the MPT but with different topologies, we might place less confidence in the MPT. A decay analysis will also reveal how many added steps it takes to collapse individual nodes, and which specific characters influence those nodes. This in turn allows us to test whether a set of functionally correlated characters influences a particular node. TreeRot (Sorenson 1999) is a computer program that uses PAUP* to perform decay analysis, although it is possible (albeit time-consuming) to do it manually by successively adding one to the tree length in the search parameters.
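The decay index of a node can be illustrated with a minimal Python sketch. It assumes the tree search has already been run and that each tree is summarized as a (length, set of clades) pair; the function name decay_index and the toy trees are hypothetical, and this is not how TreeRot or PAUP* are implemented.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

Clade = FrozenSet[str]                 # a clade is the set of taxa it contains
ScoredTree = Tuple[int, Set[Clade]]    # (tree length in steps, clades in the tree)


def decay_index(trees: List[ScoredTree], clade: Clade) -> Optional[int]:
    """Extra steps beyond the shortest tree before `clade` is absent from some tree."""
    shortest = min(length for length, _ in trees)
    lengths_without = [length for length, clades in trees if clade not in clades]
    if not lengths_without:
        return None  # the clade survives in every tree examined so far
    return min(lengths_without) - shortest


# Hypothetical search results for four taxa A-D (lengths are made up).
trees = [
    (10, {frozenset("AB"), frozenset("ABC")}),  # the MPT: ((A,B),C),D
    (11, {frozenset("AB"), frozenset("ABD")}),  # one step longer
    (12, {frozenset("AC"), frozenset("ABC")}),  # two steps longer
]

print(decay_index(trees, frozenset("AB")))   # node (A,B)
print(decay_index(trees, frozenset("ABC")))  # node (A,B,C)
```

In this toy example the node (A,B) only collapses in a tree two steps longer than the MPT, whereas (A,B,C) collapses after a single added step, so we would place more confidence in the former.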

Hennigian analysis of a data set may produce more than one tree of the same minimum length, a phenomenon known as multiple most parsimonious trees. With a large number of taxa and characters, especially if they contain large amounts of homoplasy or missing data, parsimony frequently generates multiple MPTs. In these cases, it is not possible to designate a single preferred tree; however, it is possible to generate a variety of consensus trees to delineate similarities in the topologies of the different MPTs (Adams 1972; Wilkinson 1994).

There are different techniques to build consensus trees that combine the topological information from two or more trees to create a new summary tree. Strict consensus trees only include monophyletic groups that appear in all of the input trees, and thus usually result in a number of polytomies (Sokal and Rohlf 1981).
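As a rough illustration of the strict consensus, the Python sketch below represents each input tree as the set of monophyletic groups (clades) it contains and keeps only the groups shared by all trees. The representation and the name strict_consensus are assumptions made for the example, not the method of any particular program.

```python
from typing import FrozenSet, Iterable, Set

Clade = FrozenSet[str]  # a clade is the set of taxa it contains


def strict_consensus(trees: Iterable[Set[Clade]]) -> Set[Clade]:
    """Return the clades that appear in every input tree."""
    tree_list = [set(tree) for tree in trees]
    if not tree_list:
        return set()
    return set.intersection(*tree_list)


# Two hypothetical MPTs for taxa A-D: ((A,B),C),D and ((A,B),D),C.
mpt_1 = {frozenset("AB"), frozenset("ABC")}
mpt_2 = {frozenset("AB"), frozenset("ABD")}

consensus = strict_consensus([mpt_1, mpt_2])
print(consensus)  # only the group (A,B) is shared
```

Because only (A,B) appears in both trees, the consensus places (A,B), C, and D in a polytomy at the root.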

Adams consensus trees are slightly more inclusive, purporting to give the most resolution between a set of trees (Adams 1972). These may, however, produce groupings not found in any of the input trees. Majority-rule consensus trees are perhaps the most lenient and most frequently used summary tree technique (Margush and McMorris 1981). They are created by building a new tree that contains all monophyletic groups that are supported by a majority of the set of input trees. This means that they may be logically inconsistent with the information in one or more of the MPTs. Consensus techniques are useful as visual summaries of points of agreement or logical consistency between MPTs, but they are not phylogenies, and they are not equivalent to what is produced by phylogenetic analysis of a data matrix. Occasionally, the consensus tree will be one of the MPTs, in which case it is a summary tree as well as a phylogeny. Usually, the creation of consensus trees results in a number of polytomies, or nodes in which the relationships between taxa are unresolved. These are known as soft polytomies when they reflect a lack of resolution due to insufficient data or methods. Hard polytomies are actual speciation events, in which a population divides simultaneously into three or more descendant species, or in which two sister species hybridize, forming a third species (Brooks and McLennan 2002). Hard polytomies are impossible to distinguish from soft polytomies with a phylogeny alone.
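The majority-rule consensus described above can be sketched in Python by counting how often each monophyletic group occurs among the input trees. The set-of-clades representation, the function name majority_rule_consensus, and the three toy trees are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set

Clade = FrozenSet[str]  # a clade is the set of taxa it contains


def majority_rule_consensus(trees: Iterable[Set[Clade]]) -> Dict[Clade, float]:
    """Map each clade found in more than half of the trees to its frequency."""
    tree_list = list(trees)
    counts = Counter(clade for tree in tree_list for clade in tree)
    total = len(tree_list)
    return {clade: n / total for clade, n in counts.items() if n > total / 2}


# Three hypothetical MPTs for taxa A-D.
mpts = [
    {frozenset("AB"), frozenset("ABC")},  # ((A,B),C),D
    {frozenset("AB"), frozenset("ABD")},  # ((A,B),D),C
    {frozenset("AC"), frozenset("ABC")},  # ((A,C),B),D
]

support = majority_rule_consensus(mpts)
for clade, frequency in support.items():
    print(sorted(clade), round(frequency, 2))
```

Here (A,B) and (A,B,C) each occur in two of the three trees and are retained, so the consensus happens to coincide with the first MPT while conflicting with the other two. No group is shared by all three trees, so a strict consensus of the same input would be completely unresolved.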

Bootstrap and jackknife analyses attempt to estimate the degree of sampling error in the original data set and to place confidence intervals on phylogenies by making inferences about the variability in the data. Bootstrapping (Felsenstein 1985a) samples the characters of the data set with replacement (that is, some characters may be sampled more than once and others not at all) to construct a new data set with the same number of characters. PAUP* (Swofford 1998) and other programs construct a series of these resampled data sets, analyze each, and build a majority-rule consensus tree that summarizes the results. The proportion of the resulting trees in which a particular group appears is an estimate of the support for that group, in that the process has measured the amount of variation between the newly sampled data sets. The bootstrap, then, like the decay index, is a measure of the confidence we can place in each node of the tree. Felsenstein (1985a) suggested that a bootstrap value of 95% or greater offers statistically significant support for a clade.
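The resampling step of the bootstrap can be sketched as follows. This minimal Python example assumes the data matrix is stored as a dictionary mapping each taxon to a list of character states; the function name bootstrap_matrix and the toy matrix are hypothetical, and the tree search that would follow each replicate is not shown.

```python
import random
from typing import Dict, List

Matrix = Dict[str, List[int]]  # taxon -> character states, one entry per character


def bootstrap_matrix(matrix: Matrix, rng: random.Random) -> Matrix:
    """Resample characters (columns) with replacement, keeping the same count."""
    n_chars = len(next(iter(matrix.values())))
    # Each draw picks any column, so some repeat and others are left out.
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[c] for c in columns] for taxon, states in matrix.items()}


# Hypothetical matrix: three taxa scored for five binary characters.
data = {
    "taxon_A": [0, 1, 1, 0, 1],
    "taxon_B": [0, 1, 0, 0, 1],
    "taxon_C": [1, 0, 0, 1, 0],
}

rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_matrix(data, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate would then be analyzed in the same way as the original matrix, and the proportion of replicate trees containing a given group would be reported as its bootstrap value.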

There are a number of caveats of which we must be aware before placing too much faith in the numbers generated by this analysis. Since the bootstrap measures the variation in one set of data, it does not allow us to choose between trees built from different data sets. Felsenstein (1985a) stipulated that the bootstrap assumes characters are independent and identically distributed. He was explicit that the bootstrap indicates the repeatability of an analysis given the data, and should not be taken to imply the phylogenetic accuracy of a tree (Soltis and Soltis 2003). The bootstrap will also be affected by biases such as long-branch attraction (Swofford 1998).

The jackknife is another mode of evaluation, similar to the bootstrap in that it estimates variability in the data set. It resamples the data by deleting a fraction of the characters [either half (Felsenstein 1985a) or another fraction (Farris et al. 1996)], so that, unlike in the bootstrap, no character is duplicated. Characters are randomly and independently deleted from the original matrix to create a new "resampled" matrix, and, as with the bootstrap, many matrices are produced and the results are compiled into a consensus tree. For a review of the different kinds of jackknife resampling (delete-half, parsimony, weighted) and the assumptions and problems with each, see Efron (1979), Wu (1986), and Farris et al. (1996).
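The jackknife resampling step can be sketched in the same spirit. This Python example assumes each character is deleted independently with a fixed probability (0.5 here, as an approximation of delete-half); the function name jackknife_matrix, the delete_probability parameter, and the toy matrix are hypothetical, and the subsequent tree search is not shown.

```python
import random
from typing import Dict, List

Matrix = Dict[str, List[int]]  # taxon -> character states, one entry per character


def jackknife_matrix(
    matrix: Matrix, rng: random.Random, delete_probability: float = 0.5
) -> Matrix:
    """Delete each character independently; retained characters are never duplicated."""
    n_chars = len(next(iter(matrix.values())))
    kept = [c for c in range(n_chars) if rng.random() >= delete_probability]
    return {taxon: [states[c] for c in kept] for taxon, states in matrix.items()}


# Hypothetical matrix: three taxa scored for six binary characters.
data = {
    "taxon_A": [0, 1, 1, 0, 1, 0],
    "taxon_B": [0, 1, 0, 0, 1, 1],
    "taxon_C": [1, 0, 0, 1, 0, 1],
}

rng = random.Random(7)  # fixed seed so the replicates are reproducible
replicates = [jackknife_matrix(data, rng) for _ in range(100)]
print(replicates[0])
```

Unlike a bootstrap replicate, a jackknife replicate is always a subset of the original characters, so it usually contains fewer characters than the original matrix.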
