Eerie perfection

The understanding of the genetic code was, after the elucidation of the structure of DNA with its four bases and famous double helix, the next triumph in the field of molecular biology. As already noted, proteins are built from the twenty available amino acids,22 although it has long been known that particular examples, such as the protein collagen that goes to form such structures as tendons (Achilles' heel) or the silk proteins that form the spider's web, are enriched in particular amino acids which reflect, in ways that even now are not completely understood, the functional and structural properties of these and other proteins. Thus collagens are enriched in such amino acids as proline, while spider-silks possess notable quantities of alanine and glycine.

Each of the amino acids is coded for by a set of three nucleotide bases, accordingly known as a triplet or codon. The original code is, of course, stored in the DNA of the chromosomes, but the actual assembly of amino acids into proteins occurs through the agency of RNA in minute structures within the cell known as ribosomes. In RNA the four bases are adenine (A), cytosine (C), guanine (G), and uracil (U), the last of which substitutes for thymine (T), which is found in DNA only. With a triplet code and four bases there are of course 4 × 4 × 4 = 64 possible combinations. This implies that with only 20 amino acids there is a considerable degree of redundancy, even with the assignment of certain codons to signal 'Start' and 'Stop'. In fact we see that only two amino acids (methionine (abbreviated M) and tryptophan (W)) rely on a single codon each (respectively coded for by AUG and UGG), whereas the remaining 18 amino acids are able to call upon from two to six codons. (For example, histidine (H) uses either CAC or CAU; arginine (R) employs CGU, CGC, CGA, CGG, AGA, and AGG.) It has long been known that this redundancy means that mistakes in coding may not be detrimental; if a substitution within the codon fails to result in the identical amino acid, it stands a good chance of producing another amino acid with similar properties. Amino acids with similar properties, of which their affinity to or repulsion from water (the property of polarity) is particularly important, also tend to have similar pathways of biosynthesis. Here, too, if errors occur then the mistake need not be lethal. For these and other reasons, therefore, it is clear that the genetic code is excellently adapted to the needs of reliably providing the amino acids that underpin protein construction.
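The counting in the paragraph above can be checked with a short Python sketch. It builds the standard codon table from the usual compact 64-letter listing (bases in the order U, C, A, G; '*' marks Stop) and tallies how many codons serve each amino acid. The names `CODON_TABLE`, `degeneracy`, and `codons_for` are illustrative choices for this example only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"  # conventional ordering used in printed codon tables

# One-letter amino-acid codes for the 64 codons, first base varying slowest
# and third base fastest; '*' marks a Stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet (e.g. "AUG") to the amino acid it encodes.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid (and to Stop).
degeneracy = Counter(CODON_TABLE.values())


def codons_for(amino_acid):
    """Return the sorted list of codons that encode the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(len(CODON_TABLE))   # 64 triplets in total
print(codons_for("M"))    # the single codon for methionine
print(codons_for("R"))    # the six codons for arginine
print(sorted(aa for aa, n in degeneracy.items() if n == 1))
```

Only methionine and tryptophan come out with a single codon; every other amino acid has between two and six, which is the redundancy the text describes.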

But how good is good? The rule of thumb in evolution is 'good enough to do the job in most circumstances', but not to waste time building a Rolls-Royce of an organism, or, to put it more flippantly, no supersonic albatrosses. Even so, measuring this 'goodness' for purpose is not so easy: organisms themselves are rubbery, slippery, and pliable and non-invasive techniques of investigation are time-consuming and often difficult. One way to address this problem is to look at the design tolerance of an organism, that is, to see the margins of safety built into such a structure as a bone. A powerful analogy, as Jared Diamond reminds us,23 is to think of a lift in a prestigious building dedicated to the serious accumulation and worship of money. 'Room for one more', says the lift attendant, before the cage shuts, shoots skywards towards the 59th floor, which it never reaches because at the 48th floor the cable snaps ... Such instances are, in the absence of malice, mercifully rare because the safety factor of such a lift cable, measured as the ratio between its ultimate capacity and maximum load imposed in normal use, is almost 12 times. The equivalent ratio for a cable in a dumb waiter ascending with its cargo of brown Windsor soup and claret is about five; for a bridge, engineers are content to allow a safety factor of only two. In this last case, however, Henry Petroski reminds us that the safety factors for some modern bridges may in reality be perilously small.24

It is perhaps not surprising that by and large the safety factors adopted by organisms25 are closer to those of the dumb waiter and the bridge. Thus the silk dragline of a spider has a modest safety factor of only 1.5, whereas the factor for the leg bone of a kangaroo hopping through the Australian outback is 3. There is an additional and quite important point that many safety factors may in themselves be sub-optimal - spider silk does snap and kangaroos can break their legs - but the margins of safety are necessarily a compromise between strength and many other vital functions in the organism. Even so, over-design does provide an important safety margin, especially when an organism encounters an unpredictable and rare circumstance. In assessing this and other reasons for such safety margins Carl Gans also makes the point that such tolerances may facilitate the occupation of a hitherto untested adaptive zone.26 One of the examples he gives is the New Zealand parrot known as the kea. This is a fascinating bird with highly adaptable feeding habits. The kea also has a penchant for trashing cars, and its behavioural characteristics include delinquent gangs of young birds.27 In passing I should also mention that notwithstanding the overwhelming evidence for adaptation and functional demands faced by organisms there remain some examples of structures whose significance still baffles biologists. John Currey gives a nice example in the form of the rostral bone in the snout of Blainville's beaked whale (Mesoplodon densirostris).28 As the species name suggests, this bone is incredibly dense, but why? One can speculate that it might be employed in fighting, but this rostral bone is very brittle, a consequence of its very low organic content. Alternatively, it might act as ballast,29 but Currey is candid when he writes, 'At the moment, its function, in this rarely found whale, is a mystery'.30

By this stage you will be wondering what possible connection could exist between the safety factors of a kangaroo, let alone the rostral bone of a rare whale, and the efficiency of the genetic code. The point, simply, is that given the realities of the physical world and adaptation, organisms and their components should be designed to do the job adequately, but no more. Humans shudder at the prospect of hurtling to their doom down a lift shaft, and so incorporate a safety margin that seems to be found very seldom in organisms. And at first sight this is what we should see in the genetic code: it certainly isn't random; in fact it is really rather good. But in recent years a group of molecular biologists, notably Steve Freeland and Laurence Hurst, have been trying to arrive at a more precise answer.31

Their approach is computer-based, and the basic aim is to randomize the genetic code and then compare the efficiency of a certain fraction of the vast number of alternative codes the computer can generate with the real one, here on Earth. There is, of course, the implicit assumption that a genetic alphabet composed of two base pairs (that is AT/CG),32 as well as the system of triplet codons and the 20 amino acids33 available for protein construction found in all terrestrial life represents some sort of norm. Alternatives to codon usage and the number and type of amino acids can, of course, be envisaged, but Arthur Weber and Stanley Miller have gone so far as to suggest that 'If life were to arise on another planet, we would expect that ... about 75% of the amino acids would be the same as on the earth.'34 Naturally we need to be cautious in assuming that even if proteins are universal they necessarily depend on the terrestrial mechanism of codons35 and the same battery of amino acids. Yet there still may be constraints. Codons built as doublets, i.e. only two bases (e.g. AA or AU) to code for an amino acid, would probably be rather vulnerable, while quartet or quintet (e.g. AAAA or AUAUA) codons might be getting cumbersome. There are, of course, many more amino acids known than are actually employed in the proteins and, as we shall see (Chapter 3), some of these are best known from meteorites and have no biological equivalents. Even so, given that the simplest amino acids (such as glycine, serine, and alanine) are probably the most readily synthesized anywhere in the Universe, it is possible that they predispose the biosynthetic pathways that lead to the more complex amino acids.36 So, perhaps both the genetic code and protein construction 'out there' are not so very different.

There is, however, a second difficulty in deciding just how effective the terrestrial code might be. This is because randomizing the existing genetic code leads to an astronomical number of alternative possibilities: Freeland and his co-workers suggest a figure of about 10¹⁸, which, as they helpfully remind us, is ten times as many seconds as have elapsed since the formation of the Earth. It is another big number (see note 11), and echoes the point I raised in discussing the essay by Smith and Morowitz (see note 10), that with the immensity of a protein, or in this case, genetic 'hyperspace', it would not only be a priori exceedingly unlikely that any two biospheres - separated also by a gulf of many light years - would arrive at the same evolutionary solution, but it would be even more fantastically improbable that the solution achieved was not only good (the process of natural selection should see to that) but in fact the very best. Yet, this appears to be the implication in the work by Freeland and his colleagues.
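The scale of this number can be made concrete with a little arithmetic. The text gives only the round figure of about 10¹⁸; the sketch below assumes (this is not stated in the passage) that such a figure arises from permuting the 20 amino acids among the 20 existing groups of synonymous codons, which gives 20! alternatives. The age of the Earth is likewise taken as an assumed round value of 4.5 billion years.

```python
import math

# Assumption: alternative codes are generated by permuting the 20 amino
# acids among the 20 existing blocks of synonymous codons.
n_codes = math.factorial(20)

# Assumption: a round figure for the age of the Earth.
EARTH_AGE_YEARS = 4.5e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

seconds_since_formation = EARTH_AGE_YEARS * SECONDS_PER_YEAR

print(f"20! = {n_codes:.3e} alternative codes")
print(f"seconds since the Earth formed: {seconds_since_formation:.3e}")
print(f"ratio using the round figure 1e18: {1e18 / seconds_since_formation:.1f}")
```

Under these assumptions the number of codes exceeds the number of elapsed seconds by roughly an order of magnitude, in keeping with the comparison quoted in the text.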

Their work, as is customary, has proceeded in several stages. Well aware of the preceding work already indicating the general efficiency of the genetic code, they examined a million alternative codes (Figure 1.3). To the first approximation the distribution can be compared to the familiar bell-shaped curve that, it is said, describes the distribution of human intelligence (IQ): a few stupid people and equally few geniuses, with most of us somewhere in the middle. So, too, with the distributions of alternative genetic codes: there is a wide range of efficiencies; some alternatives are extremely inefficient ('disastrous') and, perhaps not surprisingly, the majority are quite efficient but not remarkably effective. Very few of the alternatives are really impressive, but note where in Figure 1.3 the real or natural code falls. Freeland and Hurst have difficulty in keeping the surprise out of their report, even given the proviso that their approach necessitates a number of assumptions. They write: 'the natural genetic code shows startling [my emphasis] evidence of optimization, two orders of magnitude higher than has been suggested previously. Though the precise quantification used here may be questioned, the overall result seems fairly clear: under our model, of 1 million random variant codes produced, only 1 was better ... than the natural code - our genetic code is quite literally "1 in a million".'37

Figure 1.3 Eerie perfection. The relative efficiency of randomized genetic codes (horizontal axis: relative efficiency of code), ranging from disastrous on the right to increasingly competent to the left. Note the approximately bell-shaped curve: most codes are pretty good, a few terrible, and a few very good. Also note where this planet's genetic code falls: far, far to the left. (Reproduced with permission from Journal of Molecular Evolution, from the article The genetic code is one in a million, by S.J. Freeland and L.D. Hurst, vol. 47, pp. 238-248, fig. 7; 1998, copyright Springer-Verlag, and also with the permission of the authors.)
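The randomize-and-compare idea can be illustrated with a deliberately simplified Python sketch. It is not Freeland and Hurst's model: it assumes that random codes are made by shuffling the 20 amino acids among the existing blocks of synonymous codons (Stop codons left in place), it scores a code by the mean squared change in a polarity-like property over all single-base substitutions, and it swaps in the Kyte-Doolittle hydropathy scale as a stand-in for whatever polarity measure the authors used. It also ignores any weighting of different mutation types. The function names `shuffled_code`, `error_cost`, and `fraction_better` are hypothetical.

```python
import random
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# The natural (standard) code: codon -> one-letter amino acid, '*' = Stop.
NATURAL = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Stand-in polarity measure (Kyte-Doolittle hydropathy), an assumption here.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def shuffled_code(rng):
    """Permute the 20 amino acids among the natural codon blocks."""
    originals = sorted(HYDROPATHY)
    permuted = originals[:]
    rng.shuffle(permuted)
    swap = dict(zip(originals, permuted))
    swap["*"] = "*"  # Stop codons stay where they are
    return {codon: swap[aa] for codon, aa in NATURAL.items()}


def error_cost(code):
    """Mean squared hydropathy change over single-base substitutions.

    Substitutions to or from a Stop codon are skipped. Lower is better.
    """
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant_aa = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant_aa == "*":
                    continue
                total += (HYDROPATHY[aa] - HYDROPATHY[mutant_aa]) ** 2
                count += 1
    return total / count


def fraction_better(n_samples, seed=0):
    """Fraction of sampled random codes with a lower cost than the natural code."""
    rng = random.Random(seed)
    natural_cost = error_cost(NATURAL)
    better = sum(error_cost(shuffled_code(rng)) < natural_cost
                 for _ in range(n_samples))
    return better / n_samples


print(error_cost(NATURAL))
print(fraction_better(1000, seed=42))
```

Because the property scale and the cost function are stand-ins, the fraction printed here should not be expected to reproduce the 'one in a million' figure; the sketch shows only the shape of the calculation.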

This result, however, needs to be put into a wider context, because the million (10⁶) alternatives that Freeland and Hurst looked at is only a small fraction of the total number of possibilities, which, as already noted, they estimate to be about 10¹⁸. On this basis there could still be an astronomically large number of alternative genetic codes, each of which in its 'local' context could also prove to be very good indeed when compared to a randomly chosen set of a million other codes. In their analysis of the million alternatives Freeland and Hurst specifically noted that the one code that in principle might be better than the natural one had, as one might expect, little similarity to the one used by life on Earth. It seems, however, that the potential figure of 10¹⁸ alternatives is, in reality, inflated. This is because not all the biosynthetic pathways used to construct the 20 different amino acids are in themselves viable. In a subsequent analysis Freeland and his co-workers suggest that the number of alternative codes that overall are realistically functional is relatively small. They estimate that this number might be about 270 million; and taking into account the similarities between certain amino acids they conclude, again in my opinion startlingly, 'that nature's choice [on Earth] might indeed be the best possible code'.38

In one way we should hardly be surprised at the efficiency of the genetic code.39 It is difficult to believe that the genetic code is not a product of selection, but to arrive at the best of all possible codes selection has to be more than powerful, it has to be overwhelmingly effective. The reason for saying this is that with some minor, and evidently secondary, exceptions,40 the genetic code is universal to life: you, the primrose on the table, and the bacteria in your gut all employ the same code. The earliest evidence for life is about 3.8 billion years ago and these forms are presumably directly ancestral to all groups still alive today. If so, this indicates that whatever changes occurred as the genetic code evolved towards its stable state must have been achieved still earlier; the genetic code would not otherwise be universal. Yet, as we shall see (Chapter 4), life itself may not be older than about 4 billion years. Two hundred million years (and possibly much less) to navigate to the best of all possible codes, or at least from the 270 million alternatives? Part of the explanation, as is so often the case in evolution, may be to look for a step-like arrangement: once one stage is achieved, other things then become so much more likely.41 Yet, there is also a sense that given a world of DNA and amino acids, then perhaps the genetic code we know is more or less an inevitable outcome. And if this is true, then what else might be inevitable, both here on Earth and elsewhere?

This is not the only way to look at inevitabilities in evolution. The argument from the genetic code looks to a potentially gigantic 'hyperspace' of alternative possibilities, yet the evidence suggests that rapidly and with extraordinary effectiveness a very good, perhaps even the best, code is arrived at. It is as if the Blind Watchmaker takes off her sunglasses and decides to visit her brother Chronos. Off she sets, crossing streets roaring with traffic driven by psychotics, through the entrails of the subway system of a megalopolis, and, after catching a series of intercontinental express trains with connection times of two minutes each, she arrives at Chronos' front door at 4 p.m. prompt, just in time for a relaxing cup of tea.
