Incompleteness of the record
From the earliest days of their subject, paleontologists have been concerned about the incompleteness of the fossil record. Charles Darwin famously wrote about the "imperfection of the geological record" in his On the Origin of Species in 1859; he clearly understood that there are numerous biological and geological reasons why not every organism can be preserved, nor even a small sample of every species. In a classic paper in 1972, David Raup explained all the factors that make the fossil record incomplete; these can be thought of as a series of filters that stand between an organism and its final preservation as a fossil:
1 Anatomic filters: organisms are likely to be preserved only if they have hard parts, a skeleton of some kind. Entirely soft-bodied organisms, such as worms and jellyfish, are only preserved in rare cases.
2 Biological filters: behavior and population size matter. Common organisms such as rats are more likely to be fossilized than rare ones such as pandas. Rats also live for a shorter time than pandas, so more of them die, and more can become potential fossils.
3 Ecological filters: where an organism lives matters. Animals that live in shallow seas, or plants that live around lakes and rivers, are more likely to be buried under sediment than, for example, flying animals or creatures that live away from water.
4 Sedimentary filters: some environments are typically sites of deposition, and organisms are more likely to be buried there. So, a mountainside or a beach is a site of erosion, and nothing generally survives from these sites in the rock record, whereas a shallow lagoon or a lake is more typically a site of deposition.
5 Preservation filters: once the organism is buried in sediment, the chemical conditions must be right for the hard parts to survive. If acidic waters run through the sediment grains, all trace of flesh and bones or shells might be destroyed. Or if the sediment is constantly being deposited and reworked, for example in a river, any skeletal remains may be worn and damaged by physical movement.
6 Diagenetic filters: after a rock has formed, it may be buried beneath further accumulating sediment. Over thousands or millions of years, the rock may be transformed by the passage of mineralizing waters, for example, and these may either enhance the fossils, by replacing biological molecules with mineral molecules, or they may destroy the fossil.
7 Metamorphic filters: over millions of years, and with the movements of tectonic plates, the fossiliferous rock might be baked or subjected to high pressure. These kinds of metamorphic processes turn mudstones into slates, limestones into marbles. The fossils may survive these terrible indignities, or they may be destroyed.
8 Vertical movement filters: nearly all fossils are in sedimentary rocks that have been buried. Burial means the rock has been covered by younger rock, and has gone down to some depth. Tectonic movements must subsequently raise the fossiliferous rock to the Earth's surface, or the fossil remains forever buried and unseen.
9 Human filters: the fossil must finally be seen and collected by a human being. Doubtless, the majority of fossils that go through the burial and uplift cycle are lost to erosion, washed away from the foot of a sea cliff or blasted by sand-carrying winds in the desert. Someone has to see the fossil, collect it and take it home. Even then, of course, the fossil has to be registered in a museum before it becomes part of collective human paleontological knowledge. Many that are collected molder in someone's bedroom before they are thrown away with the garbage.
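One way to see why so few organisms end up as catalogued fossils is to treat the filters as successive steps, each passed with some probability. The expression below is only a rough sketch: it assumes, purely for illustration, that the nine filters act independently and that filter i is passed with probability p_i. Neither the symbols nor the independence assumption come from Raup's paper.

```latex
P(\text{organism ends up as a catalogued fossil}) \;=\; \prod_{i=1}^{9} p_i ,
\qquad 0 \le p_i \le 1 .
```

Because every factor is at most 1, the product can never exceed the smallest p_i, so a single severe filter is enough to make preservation rare.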
After all this, it's a wonder any fossils survive at all!
The fact that the museums of the world contain so many millions of fossils is a testament to the hard work of paleontologists of all nations. But it also reflects the immensity of geological time and the sheer numbers of organisms that have ever existed.
Bias and adequacy
In his 1972 paper, David Raup argued persuasively that the fossil record is not only incomplete, but also biased. This means that the distribution of fossils is not random with respect to time: the record gets worse in older and older rocks. The evidence is twofold: theoretical and observational. The theoretical evidence is persuasive. The last two or three of the filters just mentioned are time related; the older the rocks, the more substantially these filters will have removed fossils from the potential record. As time goes by, ancient fossiliferous deposits are ever more likely to have been metamorphosed, buried under younger rocks, subducted into the mantle or eroded. The longer a fossil sits in the rock, the more likely one of these processes is to destroy it. Observationally, paleontologists are familiar with this steady loss of information. If you try to collect fossils from a Miocene lagoonal deposit, the shells are abundant and beautifully preserved, and you can collect thousands in an hour or two. If you try to collect from a fossiliferous deposit from the same environment in the Cambrian, fossils may be rare, they may be distorted by metamorphism, and they may be hard to get out of the rock.
Others have argued, however, that these biases apply only at certain levels of study. Clearly, in collecting individual shells, you fill your rucksack faster at a Miocene locality than a Cambrian locality. You may also identify many more species based on those collections. But, perhaps if you step back and consider families or genera, rather than species or specimens, and you consider the fossils from whole continents rather than just one quarry, the representation may be relatively uniform. After all, you can recognize the presence of a species or genus from just a single specimen; it does not require a million specimens.
In a study in 2000, Mike Benton and colleagues suggested that the temporal bias identified by Raup might be an issue of scaling. Clearly Raup was right that fossils are steadily lost from the record in older and older rocks. But could the record be adequate nonetheless for coarser-scale studies? Benton and colleagues applied clade-stratigraphy measures (Box 3.3) to a sample of 1000 published phylogenetic trees (see p. 129). These trees represented the branching patterns of different sectors of the tree of life, some of them dating back to the Paleogene, others to the Mesozoic, and yet others to the Paleozoic. These authors divided the 1000 trees into five time bins, each of roughly 200 trees, and they assessed how well the trees matched the fossil record. Using different metrics, the trees showed nearly identical measures of agreement from the Paleozoic to the Cenozoic (Fig. 3.9). Benton and colleagues argued that this confirmed that sampling of the record was equally good (or bad) through the last 500 million years at a coarse scale. The cladograms (see p. 129) were generally drawn at coarse taxonomic levels (genera and families, not species) and a coarse time scale was used (stratigraphic stages, average duration 7 million years).
Figure 3.9 Mean scores of the stratigraphic consistency index (SCI), the relative completeness index (RCI) and the gap excess ratio (GER) for five geological time partitions (arranged from ancient to recent) of the data set of 1000 cladograms. Note that the SCI and GER indicate no change through time, while the RCI becomes worse (lower values) from the Paleozoic to Cenozoic, but the RCI depends on total geological time, and so is not a good measure for this study. Pz, cladograms with origins solely in the Paleozoic; Pz/Mz, cladograms with origins spanning the Paleozoic and Mesozoic; Mz, cladograms with origins solely in the Mesozoic; Mz/Cz, cladograms with origins spanning the Mesozoic and Cenozoic; Cz, cladograms with origins solely in the Cenozoic. (Based on Benton et al. 2000.)
So, paleontologists could breathe a sigh of relief: their studies of the Cambrian might be just as well, or badly, supported by data as their studies of the Carboniferous or Cenozoic. Or could they? What exactly was being measured here, the fossil record or reality?
Preservation bias or common cause?
Many paleontologists have noticed a close linkage between the rock record and the fossil record. Some time intervals, for example, appear to be represented by thick successions of sedimentary rocks that are bursting with fossils, and so the paleontological record of that time interval is especially well documented. What if the fossil record is largely driven by the rock record?
Peters and Foote (2002) noted a close correspondence between the number of named geological formations (standard rock units; see p. 25) and the diversity of named fossils. When they plotted the patterns of appearance and disappearance of marine formations through time (Fig. 3.11a), they noted that this seemed to match the calculated rates of extinction and origination of marine organisms through time. They concluded that perhaps the appearance and disappearance of fossils was controlled by the appearance and disappearance of rocks. If this is the case, then any patterns of diversity, extinction or origination of life through time would really show a geological rather than a biological signal. In other words, the fossil record perhaps shows us little about evolution, and that would be a rather shocking and depressing observation for a paleontologist! This is the preservation bias hypothesis, the view that geology controls what we see of the fossil record, as argued by Raup in his classic 1972 paper.
If geology controls the fossil record, what lies behind the appearance and disappearance of formations? Smith (2001) showed that much of the marine rock record relates to relative global sea level. The sea-level curve for the past 600 myr (Fig. 3.11b) shows major rises and falls that reflect phases of seafloor spreading, movements of the tectonic plates, and relative ice volumes (when there are large volumes of polar ice, as at present, global sea levels are low). Smith (2001) noted that many details of the sea-level curve are mimicked by the curves for diversity of marine life (Fig.
Paleontologists have two sources of data about the history of life: the fossils in the rocks and evolutionary trees. If the evolutionary trees are produced using analytic approaches either from molecular or morphological data (see pp. 129-33), there should be no direct linkage between the ages of fossils and the shape of the tree. If that is so, then it should be useful to compare the congruence (or agreement) of fossil sequences and phylogenetic trees. If they agree, then perhaps they are both telling the correct story; if they are not congruent, then the fossils, or the tree, or both, could be telling us the wrong story.
There are a variety of metrics for comparing phylogenies and fossil records. The simplest is the Spearman rank correlation coefficient (SRC). This is a non-parametric measure that simply compares the order of two series of numbers: if the order is similar enough, the correlation coefficient is statistically significant; if not, the SRC will indicate a non-significant result. So, in the tree in Fig. 3.10a, the nodes (branching points) may be numbered 1, 2, 3 and 4 from the bottom to the clade AB or CD (we cannot tell whether the node of AB comes before or after that for CD, so we can use only one or the other in the time series). If the oldest fossils of the clades are in sequence 1, 2, 3, 4, then it is obvious that the two series of numbers (clades and fossils) agree, and the SRC would be +1, indicating a perfect positive correlation. But what if the order of fossils was 1, 2, 4, 3? Is that a good enough agreement or not? With so few values, the SRC test is inconclusive, but with 10 or more it can give useful outcomes. In an early study, Norell and Novacek (1992) found that 75% of mammal cladograms agreed significantly with the order of fossils. Those that failed the clade versus fossil order SRC test were groups such as primates that are suspected to have a poor fossil record.
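The SRC comparison of clade order and fossil order can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any cited study: the function names are hypothetical, the inputs are assumed to be untied ranks 1 to n, and significance is judged here with an exact one-sided permutation test, which is one of several possible choices.

```python
from itertools import permutations


def spearman_rho(clade_order, fossil_order):
    """Spearman rank correlation for two equal-length series of untied ranks (1..n)."""
    n = len(clade_order)
    if n != len(fossil_order) or n < 2:
        raise ValueError("need two series of equal length, with at least two values")
    # Sum of squared rank differences between node order and fossil order.
    d2 = sum((c - f) ** 2 for c, f in zip(clade_order, fossil_order))
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_one_sided_p(clade_order, fossil_order):
    """Share of all orderings of the fossil ranks that match the clade order
    at least as well as the observed ordering (exact permutation test)."""
    observed = spearman_rho(clade_order, fossil_order)
    rhos = [spearman_rho(clade_order, perm) for perm in permutations(fossil_order)]
    # Small tolerance guards against floating-point noise in the comparison.
    return sum(r >= observed - 1e-12 for r in rhos) / len(rhos)


if __name__ == "__main__":
    nodes = [1, 2, 3, 4]
    print(spearman_rho(nodes, [1, 2, 3, 4]))       # perfect agreement
    print(spearman_rho(nodes, [1, 2, 4, 3]))       # one swapped pair
    print(exact_one_sided_p(nodes, [1, 2, 4, 3]))  # chance of doing this well or better
```

For the order 1, 2, 4, 3 the coefficient is 0.8, yet 4 of the 24 possible orderings of four ranks do at least as well, a one-sided probability of about 0.17. That is why a four-node comparison is inconclusive. Enumerating every permutation is only practical for short series; for longer ones a library routine or a random sample of permutations would be the usual substitute.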
Other metrics for comparing cladograms with geological time and fossil occurrences are the stratigraphic consistency index (SCI), the relative completeness index (RCI) and the gap excess ratio (GER).
• The SCI (Huelsenbeck 1994) assesses how well the nodes in a cladogram correspond to the known fossil record. Nodes are dated by the oldest known fossils of either sister group above the node. Each node (Fig. 3.10a) is compared with the node immediately below it. If the upper node is younger than, or equal in age to, the node below, the node is said to be stratigraphically consistent. If the node below is younger, the upper node is stratigraphically inconsistent. The SCI for a cladogram compares the ratio of the sums of stratigraphically consistent to inconsistent nodes. SCI values can indicate cladograms whose nodes are all in line with stratigraphic expectations through to cladograms that imply a sequence of events that is entirely opposite to the known fossil record.
• The RCI (Benton & Storrs 1994) takes account of the actual time spans between nodes, and of implied gaps before the oldest known fossils of lineages. Sister groups, by definition, originated from an immediate common ancestor, and diverged from that ancestor. Thus, both sister groups should have fossil records that start at essentially the same time. In reality, usually the oldest fossil of one lineage will be older than the oldest fossil of its sister lineage. The time gap between these two oldest fossils is the ghost range or minimal cladistically-implied gap. The RCI (Fig. 3.10b) assesses the ratio of the ghost range to the known range, and high values imply that ghost ranges are short, and hence that the fossil record is good.
• The GER (Wills 1999) is a modification of the RCI that compares the actual proportion of ghost range in a particular example with the minimum and maximum possible relative amount of ghost range when the cladogram shape is modified to maximize and minimize the ghost range (Fig. 3.10c, d). This then places the result in the context of all possible results, and so assesses the congruence of the tree with the fossil record, taking account of the particular cladogram shape.
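The node-by-node check behind the SCI can be sketched as follows. Several things here are assumptions made for illustration: the tree is a nested tuple of taxon names, first appearances are given in millions of years ago (larger means older), each node is dated by the oldest fossil among its descendants and compared with the oldest fossil of its sister group, and the score is reported as the share of assessed nodes that are consistent. The taxon names and ages are invented.

```python
def oldest(tree, first_appearance):
    """Oldest first appearance (Ma) among all taxa in a subtree."""
    if isinstance(tree, str):
        return first_appearance[tree]
    return max(oldest(child, first_appearance) for child in tree)


def sci(tree, first_appearance):
    """Fraction of internal nodes (root excluded) that are stratigraphically consistent."""
    counts = {"consistent": 0, "total": 0}

    def visit(node):
        if isinstance(node, str):
            return
        left, right = node
        for child, sister in ((left, right), (right, left)):
            if not isinstance(child, str):
                counts["total"] += 1
                # Consistent: the node is younger than, or the same age as, its sister group.
                if oldest(child, first_appearance) <= oldest(sister, first_appearance):
                    counts["consistent"] += 1
            visit(child)

    visit(tree)
    if counts["total"] == 0:
        raise ValueError("tree has no internal nodes above the root to assess")
    return counts["consistent"] / counts["total"]


if __name__ == "__main__":
    # Hypothetical pectinate cladogram and invented first appearances (Ma).
    tree = (((("A", "B"), "C"), "D"), "E")
    in_order = {"A": 10, "B": 12, "C": 20, "D": 30, "E": 40}
    one_clash = {"A": 10, "B": 12, "C": 20, "D": 15, "E": 40}
    print(sci(tree, in_order))
    print(sci(tree, one_clash))
```

With the first set of ages every node is younger than its sister group, so all three assessed nodes are consistent. In the second set taxon D first appears later than the clade it is sister to, so one node of three is inconsistent.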
These metrics can be used to assess the stratigraphic likelihood of competing cladistic hypotheses that are otherwise equally likely; in other words, if one cladogram implies very little ghost range, and the other implies a huge amount, then the former is probably more likely. Further, large samples of cladograms might give general indications about the preservation and sampling quality of different habitats or fossil groups. For example, Benton et al. (2000) found no overall difference in clade versus fossil matching for marine and non-marine organisms (despite an assumption that marine environments tend to preserve fossils better than non-marine) or between, say, vertebrates and echinoderms. Such comparisons obviously depend on equivalent kinds of cladograms (similar sizes and shapes) within the categories being compared, or the measures become too complex.
Figure 3.10 Clade-stratigraphic metrics. Calculation of the three congruence metrics for age versus clade comparisons. SCI is the ratio of consistent to inconsistent nodes in a cladogram. RCI is $\mathrm{RCI} = 1 - (\sum \mathrm{MIG} / \sum \mathrm{SRL})$, where MIG is minimum implied gap, or ghost range, and SRL is standard range length, the known fossil record. GER is $\mathrm{GER} = 1 - (\mathrm{MIG} - G_{\min})/(G_{\max} - G_{\min})$, where $G_{\min}$ is the minimum possible sum of ghost ranges and $G_{\max}$ the maximum, for any given distribution of origination dates. (a) The observed tree with SCI calculated according to the distribution of ranges in (b). (b) The observed tree and observed distribution of stratigraphic range data, yielding an RCI of 66.0%. GER is derived from $G_{\min}$ and $G_{\max}$ values calculated in (c) and (d). (c) The stratigraphic ranges from (b) rearranged on a pectinate tree to yield the smallest possible MIG or $G_{\min}$. (d) The stratigraphic ranges from (b) rearranged on a pectinate tree to yield the largest possible MIG or $G_{\max}$. (Based on Benton et al. 2000.)
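The RCI and GER formulas given with Fig. 3.10 can be tried out on a toy example. The sketch below rests on stated assumptions: the tree is a nested tuple of taxon names, each taxon has an invented (first, last) appearance in millions of years ago, the ghost range at each node is the difference between the oldest fossils of its two sister groups, the minimum possible gap is taken as the span between the oldest and youngest first appearances, and the maximum as the sum of each taxon's lag behind the oldest first appearance. None of the numbers are real data.

```python
def oldest(tree, ranges):
    """Oldest first appearance (Ma) among all taxa in a subtree."""
    if isinstance(tree, str):
        return ranges[tree][0]
    return max(oldest(child, ranges) for child in tree)


def minimum_implied_gap(tree, ranges):
    """Total ghost range (MIG): summed age differences between sister groups."""
    if isinstance(tree, str):
        return 0.0
    left, right = tree
    gap = abs(oldest(left, ranges) - oldest(right, ranges))
    return gap + minimum_implied_gap(left, ranges) + minimum_implied_gap(right, ranges)


def rci(tree, ranges):
    """RCI = 1 - (sum of MIG / sum of SRL), as a fraction rather than a percentage."""
    srl = sum(first - last for first, last in ranges.values())
    if srl <= 0:
        raise ValueError("total known range length must be positive")
    return 1 - minimum_implied_gap(tree, ranges) / srl


def ger(tree, ranges):
    """GER = 1 - (MIG - Gmin) / (Gmax - Gmin)."""
    firsts = [first for first, _ in ranges.values()]
    g_min = max(firsts) - min(firsts)              # best case: ranges in perfect order
    g_max = sum(max(firsts) - f for f in firsts)   # worst case: ranges in reverse order
    if g_max == g_min:
        raise ValueError("GER is undefined when Gmax equals Gmin")
    return 1 - (minimum_implied_gap(tree, ranges) - g_min) / (g_max - g_min)


if __name__ == "__main__":
    # Invented ranges: taxon -> (first appearance, last appearance) in Ma.
    ranges = {"A": (10, 0), "B": (20, 5), "C": (30, 10), "D": (40, 20)}
    ordered = ((("A", "B"), "C"), "D")    # oldest taxon branches off first
    reversed_ = ((("D", "C"), "B"), "A")  # youngest taxon branches off first
    for tree in (ordered, reversed_):
        print(rci(tree, ranges), ger(tree, ranges))
```

In this toy case the first tree implies 30 million years of ghost range and the second 60, against 65 million years of known range, so the first tree has the higher RCI. Its ghost range equals the minimum possible, which gives a GER of 1, while the second tree reaches the maximum and scores 0. This is the sense in which the GER places a result within the range of all possible results for the same set of origination dates.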
Read more in Benton et al. (2000) and Hammer and Harper (2006), and at http://www.blackwellpublishing.com/paleobiology/.