Solution: Life's Genesis Is Rare

The solution of the problem of life is seen in the vanishing of the problem.

Ludwig Wittgenstein, Tractatus Logico-Philosophicus

Hart's answer to the Fermi question is that life's genesis is almost miraculously rare. For practical purposes, we are alone: Earth possesses the only intelligent life — the only life — in the visible part of an infinite universe.

This miracle loses some of its gloss in an infinite Universe: an infinite number of planets possess intelligent life-forms. However, many people find it difficult to entertain the notion of an infinite Universe with an infinite number of habitable planets. Can we not instead accept part of Hart's idea? Can we dispense with the astronomical notion of an infinite Universe and argue solely from biology? Perhaps life is not a miracle but nevertheless arises only rarely. Maybe the Universe appears sterile because — with the exception of one or two islands of life such as Earth — it is sterile?

As is usual with any aspect of the Fermi paradox, there are two diametrically opposed opinions. One group argues that life is indeed difficult for Nature to create. The other argues that life is almost certain to appear on a planet as soon as conditions allow. To discuss the merits of both positions, we first need to take a lengthy detour and consider the question of what we mean by life and how life might have arisen.

At school, my teacher could always drive holes through the attempts of our science class to provide a definition of life. He pointed out that, by some of our definitions, fire is alive (since it grows, it reproduces itself, and so on). On the other hand, by our definitions a mule is not alive (since it cannot reproduce itself). For the purposes of this section I will try my hand at presenting another definition of terrestrial life. My old teacher could probably still drive holes through the definition, and in any case the definition might be inappropriate in the future. (In ten years, perhaps, scientists might develop a self-aware computer. Will the computer be alive? Or a century hence, perhaps, an explorer on the Altair mission will discover an evil-smelling pink crystal that every morning turns into a goo, clinging to the sides of the spaceship and eating the metal. Is the goo alive? In both cases, under my definition the answer is "no" — even though the answer should probably be "yes." We have to begin somewhere, though, and the definition given below is as good a place as any.)

I define something to be alive if it has the following four properties.

First, a living object must be made of cells. Every living creature on Earth consists either of a single cell or a collection of cells. If we knew how cells originated, then we might well understand how life itself originated.

There are two quite different types of cell: prokaryotes and eukaryotes. Prokaryotic cells lack a central nucleus. They are simple and small, and they exist in a variety of types. Prokaryotic organisms are hugely successful, in large measure because their simplicity means they can reproduce themselves quickly. A recent and profound discovery is that there are two quite different types of prokaryotes [213]: eubacteria — or "true" bacteria (or, as I will write for simplicity, just bacteria) — and archaea. The two types of prokaryotic cell seem to bear no closer relationship to each other than they do to eukaryotic cells. Eukaryotic cells are much more complicated than prokaryotic cells; within an outer membrane lies a formidable array of biochemical machinery, and a nucleus enclosed within its own nuclear membranes. This complexity requires eukaryotic cells typically to possess 10,000 times more volume than prokaryotic cells. Eukaryotes are able to assemble to form complex, multicellular organisms — plants, fungi and animals.

figure 62 Four different types of archaea. (a) Thermoproteus tenax. Species from the Thermoproteus genus grow at 78-96° C, use hydrogen as their energy source and CO2 as their carbon source. (b) Pyrococcus furiosus. "Pyrococcus" means "fireball" — a reference to both its shape and the high temperatures at which it thrives; "furiosus" means "rushing" — it can quickly double its numbers. (c) Methanococcus igneus. Some species grow at 85° C and pressures over 200 atm; oxygen is a poison. (d) Methanopyrus kandleri. Found in high-pressure ocean depths, they can survive at 110° C.

Thus, within the living world there are three domains: archaea, bacteria and eukarya. By this definition, viruses and prions are non-living.

Second, a living object must have a metabolism. Metabolism is what we call the variety of processes enabling a cell, or a collection of cells, to take in energy and materials, convert them for its own ends, and excrete waste products. In other words, all living organisms require food of some description, and all living organisms create waste. (Fire has a metabolism, as my old science teacher would point out, but we do not have to consider fire as living since it does not meet all the other criteria.) Metabolism takes place through the catalytic action of enzymes: without enzymes, the various biochemical reactions that take place in cells simply would not happen. In turn, enzymes are made of proteins. Proteins are therefore a vital constituent of life — at least here on Earth. As we shall see later, the instructions for creating the various proteins necessary for a cell's existence are contained in its deoxyribonucleic acid (DNA), while the biochemical machinery of protein synthesis is based on its ribonucleic acid (RNA). In shorthand form: DNA makes RNA makes proteins.

figure 63 A highly simplified sketch of the tree of life. The tree contains three domains: archaea, bacteria and eukarya. The domain of archaea contains three kingdoms: korarchaeota, crenarchaeota and euryarchaeota; the domain of eukarya contains, among others, the familiar kingdoms of animals and plants. The relationships between the three domains are controversial, and the diagram should not be taken too seriously — except that it shows that life on Earth possesses tremendous unity.

Third, a living object can reproduce — or else it derives from objects that could reproduce. Cells can reproduce either individually or in sexual pairs, and the mechanism of reproduction is DNA. Clearly, then, DNA plays a central role in living organisms — just how central we will come to shortly. (Crystal structures can reproduce; however, they lack the variation that occurs when living organisms reproduce. Replication, rather than reproduction, is a better term for crystal growth, and certainly we do not need to consider crystals to be alive. On the other hand, mules and other sterile organisms came from creatures that could reproduce; we do not need to classify mules as non-living.)

Fourth, life evolves. Darwinian evolution — natural selection acting on heritable variation — is a key aspect of life.

These four properties — cells, metabolism, reproduction and evolution — are enough on which to base a discussion of life, even if the definition itself could be improved. We are now in a position to ask: how did life start?

It is worth stating at the outset that nobody knows how life started. Nevertheless, in recent years tremendous progress has been made in two directions: on the one hand, tracing life's ancestry back as far as possible, and on the other hand attempting to understand the chemical pathways that might have led to the earliest forms of life. (There is at least one other promising approach: the idea that life emerged complex and whole thanks to the self-organizational properties of chemical systems. Lack of space prevents us from discussing this approach.) [214]

The "top-down" method of looking for the origin of life is the search for LUCA — the Last Universal Common Ancestor, from which all present life must have inherited its common biochemical structures. (There is a tremendous unity of terrestrial life: all organisms, with a few minor exceptions, use the same genetic code, which enables a sequence of DNA to specify a polypeptide; all organisms use DNA to carry genetic information; and so on.) If LUCA was sufficiently simple, if it existed at a very early stage in the history of Earth — and if we can understand LUCA in detail — then we might deduce how it came to be. Unfortunately, this approach can be taken only so far. One commonly drawn picture is that LUCA was already a sophisticated organism, which had evolved considerably from the time when life first arose, before it branched into the domains of archaea and bacteria. Later, in this picture, the eukaryotic domain branched off from the archaea. This picture is complicated enough, but as the biochemical laboratories discover new information on an almost daily basis, the picture is becoming even more convoluted. We usually think of genetic information as passing only vertically — from parent to child. Early in the history of life, however, horizontal transfer of genes between different types of organism seems to have occurred frequently. This horizontal transfer of genetic information means that simple lineages become tangled. At the time LUCA is supposed to have existed, there may have been a pool of genes (formed from a community of cells that were able to exchange genes in horizontal fashion because they shared the same genetic code), from which the three domains arose separately. In other words, archaea, bacteria and eukarya may be equally ancient. (On the other hand, there is a suggestion that the Snowball Earth event of 2.5 billion years ago produced the conditions that gave rise to the eukaryotic cell. In other words, eukarya may be relatively recent; and without a Snowball Earth event they might never have arisen.) These interesting suggestions remain an active area of research.

Rather than become bogged down in the details of LUCA, we can consider the "bottom-up" approach to the question of the origin of life. We can ask: how did the universal chemicals of life — nucleic acids and proteins — come into existence? If we can understand that, then we may be able to fill in the gap between the bottom-up and top-down approaches; we may be able to understand how inanimate matter became alive.

Nucleic Acids

If any molecule deserves the title "molecule of life," it must surely be deoxyribonucleic acid — DNA. According to the definition presented earlier, life has two key aspects: it has a metabolism, and it passes on information through the reproductive process. The DNA molecule is central to both aspects. The role it plays in synthesizing proteins, which in turn allow metabolism, is described below. Here we concentrate on the reproductive aspect and briefly consider how DNA can replicate itself — while providing enough variation upon which natural selection can work. [215]

The DNA molecule is a polymer of nucleotides. A nucleotide consists of three parts.

First, it possesses a deoxyribose sugar. The sugar contains five carbon atoms, conventionally numbered with primes — 1' through to 5' (pronounced "one prime," "two prime," and so on). The sugar is similar to ribose, but lacks a hydroxyl group at the 2' position.

Second, it possesses a phosphate group. The nucleotides can link together to form long chains through so-called phosphate ester bonds — bonds between the phosphate group of one nucleotide and the sugar component of the next nucleotide. The sugar-phosphate chains form the backbone of DNA; in the familiar picture of DNA as a "ladder-like" molecule, the sugar-phosphate chains form the "rails" of the ladder. A chain can be indefinitely lengthened simply by attaching more nucleotides through more ester bonds; a DNA molecule can be anywhere from about 100 to a few million nucleotides in length. No matter how long the chain becomes, there are always two ends. One end has a free -OH group at the 3' carbon (the 3' end) and the other end has a phosphoric acid group at the 5' carbon (the 5' end).

figure 64 A double helix (like DNA), shown here in a computer-generated figure.

Third, it possesses a nitrogenous base. Pairs of these bases form the "rungs" of the DNA ladder. A base is linked to the deoxyribose sugar at the 1' carbon. A base can be either one of the purines, adenine (A) or guanine (G), or one of the pyrimidines, cytosine (C) or thymine (T). Biochemists present the nucleotide sequence in a chain by starting at the 5' end and identifying the bases in the order in which they are linked; a typical sequence of DNA may be written as -G-C-T-T-A-G-G-.

One of the key developments in science was the realization that DNA in the nuclear material of cells has two strands, twisted around each other to form a double helix, such that one strand is always associated with a complementary strand. The base G is always opposite the base C, the base T is always opposite the base A. This complementarity occurs because only these combinations of base pairs can form hydrogen bonds between them and hold the two strands together. An individual hydrogen bond is weak, but a normal DNA molecule contains so many base pairs that the two strands are held tightly together. This complementarity also means all the information is held in a single strand of DNA — and allows for the possibility of replication and reproduction.
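The pairing rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the hydrogen-bond counts (two for an A-T pair, three for a G-C pair) are standard biochemistry that the text above does not state.

```python
# Pairing rules from the text: G sits opposite C, and T sits opposite A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Assumption (not given in the text): hydrogen bonds per base pair.
# A-T pairs form two hydrogen bonds, G-C pairs form three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def hydrogen_bond_count(strand: str) -> int:
    """Count the hydrogen bonds that hold this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)


strand = "GCTTAGG"
partner = complementary_strand(strand)
print(partner)                                  # CGAATCC
print(complementary_strand(partner) == strand)  # True
print(hydrogen_bond_count(strand))              # 18
```

Taking the complement twice returns the original sequence, which is one way of seeing that a single strand holds all the information.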

The process of DNA replication begins when an enzyme called DNA helicase partially unzips the double helix at a region known as the replication fork. At the replication fork there are two strands of DNA — one of which is the template strand. With the bases now exposed, an enzyme called DNA polymerase moves into position and begins the synthesis of a DNA strand complementary to the template. The enzyme reads the sequence of bases on the template strand, in the direction from the 3' end to the 5' end, and adds the nucleotides to the complementary strand one at a time — always G to C and A to T. (So a sequence on the template strand of -G-C-T-T-A-G-G- would become -C-G-A-A-T-C-C- on the synthesized complementary strand, which grows in the direction from 5' to 3'.) Eventually, a complete complementary strand is formed; the DNA polymerase catalyzes the bonds that join the new nucleotides into a chain, hydrogen bonds form between the bases on the two strands, and a new double helix can form. While this whole process takes place, a rather more complicated process manufactures a new strand that is complementary to the other original strand (or lagging strand). The net result is the creation of two identical copies of the original DNA double helix, and each new helix contains one strand of the original. We have a replication mechanism.
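As a rough sketch of this mechanism, the Python below models a helix as a pair of strings, each written in its own 5' to 3' order. That representation and the function names are assumptions made for illustration. The enzymes, the replication fork and the separate handling of the lagging strand are all ignored.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template_5to3: str) -> str:
    """Build the complementary strand, returned in its own 5'->3' order.

    The template is read from its 3' end towards its 5' end, so we walk the
    5'->3' string backwards and add the pairing base at each step.
    """
    return "".join(PAIR[base] for base in reversed(template_5to3))


def replicate(helix: tuple[str, str]) -> list[tuple[str, str]]:
    """Copy a helix; each daughter helix keeps one strand of the original."""
    strand_a, strand_b = helix
    return [(strand_a, synthesize(strand_a)), (synthesize(strand_b), strand_b)]


original = ("GCTTAGG", synthesize("GCTTAGG"))
daughters = replicate(original)
print(original)                                                # ('GCTTAGG', 'CCTAAGC')
print(daughters[0] == original and daughters[1] == original)   # True
```

The second strand prints as CCTAAGC rather than CGAATCC because it is written from its own 5' end; read right to left, it is the same sequence given in the text.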

figure 65 The backbone of a DNA molecule consists of long chains of deoxyribose sugar and phosphate groups; nitrogenous bases in each helix form bonds, but they must obey the pairing rules: adenine opposite thymine, and cytosine opposite guanine.

figure 66 The specific pairing of nucleotide bases — A with T, C with G — enables DNA to replicate; it is the basis of heredity. When the twin-stranded DNA molecule replicates, the two strands separate at the replication fork. Enzymes then add new bases to the two strands while following the pairing rules. The result is two molecules, both of which are identical to the original.

(The process outlined above is a simplified version of what actually occurs. One of the aspects I omitted is the role RNA plays in the replication of DNA. Ribonucleic acid is the other major type of nucleic acid and it, too, fulfills key functions for life on Earth. There are several differences between DNA and RNA. A structural difference is that RNA usually appears in cells as a single chain of nucleotides, rather than as a double helix of DNA; RNA molecules are also typically smaller than DNA molecules. There are two chemical differences between the molecules. First, the RNA nucleotides contain the sugar ribose rather than deoxyribose (hence the difference in names between the two molecules). Second, RNA employs the base uracil (U) rather than thymine. There is also a major functional difference between the two acids: DNA exists solely to store genetic information in the sequence of its nucleotide bases, whereas RNA molecules do things. There are several types of RNA, each performing different tasks, and we shall meet three of them — messenger RNA (mRNA), ribosomal RNA (rRNA) and transfer RNA (tRNA) — below.)

The ability of DNA to replicate is the secret of life's ability to reproduce. This ability explains why offspring look like their parents — snakes beget snakes, woodpeckers beget woodpeckers, and humans beget humans. But for life to evolve, and for species to change into other species, heredity must be imperfect. There must be some variation among offspring: natural selection cannot adapt things that do not vary. Fortunately, there is variation when DNA replicates. From time to time, a mutation occurs: there is a change in the sequence of nucleotide bases. These mutations occur randomly from radiation damage, from chemical agents and simply from errors in the DNA replication process. (The rate of mutation is remarkably small, due to various checks that take place when DNA replicates. After the first stage of replication there are two error-correcting stages: proofreading and mismatch repair. These extra stages minimize the error rate to 1 in 10^9.) If an error occurs in a part of DNA that codes for a protein (more on this below), then the mutated DNA will produce a different protein. If the protein performs its intended job better than the original, then the mutation will be beneficial for the organism (and perhaps increase the probability of the organism's survival and thus, through increased numbers of offspring, of its own continued existence); but more probably, the mutation will be harmful or at least neutral. The point is that mutations give natural selection something on which to work.
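To get a feel for what an error rate of 1 in 10^9 means, here is a back-of-the-envelope calculation in Python. The genome size is an assumption that does not come from the text (roughly the size of one copy of the human genome), and the calculation treats every base as an independent trial, which is a simplification.

```python
import math

ERROR_RATE_PER_BASE = 1e-9   # from the text: about 1 error per 10^9 bases copied
GENOME_SIZE_BASES = 3.2e9    # assumption: rough size of one human genome copy


def expected_errors(genome_size: float, error_rate: float) -> float:
    """Average number of copying errors in one full replication."""
    return genome_size * error_rate


def probability_error_free(genome_size: float, error_rate: float) -> float:
    """Chance that a full replication contains no error at all.

    Assumes each base is copied independently with the same error rate.
    """
    return math.exp(genome_size * math.log1p(-error_rate))


print(f"{expected_errors(GENOME_SIZE_BASES, ERROR_RATE_PER_BASE):.1f}")         # 3.2
print(f"{probability_error_free(GENOME_SIZE_BASES, ERROR_RATE_PER_BASE):.3f}")  # 0.041
```

Under these assumptions even a very accurate copying process leaves a handful of changes per replication of a large genome, which is the raw material the paragraph above describes.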

If all that nucleic acids did was replicate, then they would be only marginally more interesting than self-replicating crystals. While DNA can store genetic information, it would be of little use if the information was not retrieved and put to use. It would be like having a public library stacked full of books, but with no one allowed to read any of the volumes. What makes nucleic acids so fascinating is that they code for and construct proteins. And proteins are what make life so interesting. Proteins enable life to do things.

Proteins

Proteins are complicated macromolecules that exhibit tremendous versatility. They function as enzymes (which make possible a cell's metabolism), they act as hormones (thus providing a regulatory function; insulin is a common example), and they provide structure (our fingernails, hair, muscles, and the lenses in our eyes are all proteins).

A protein is a long sequence of amino acids folded into a three-dimensional structure. A particular sequence of amino acids folds into a particular structure. Change the sequence and you change the way the protein folds up — and thus the task that the protein can fulfill, since the biochemical task that a protein can carry out depends critically upon its shape in three dimensions. Proteins make use of twenty different amino acids. There are many other amino acids in Nature, and several of them are important in biology; but proteins use only twenty. All the amino acids have a common structure: an amino group (H2N), a residue or R group (CHR) and a carboxyl group (COOH). The general structure is written H2N-CHR-COOH, and the chain forms by linking the carboxyl end of one amino acid to the amino end of the next by peptide bonds. (A chain of amino acids is thus called a polypeptide; a protein is simply one or more polypeptides.) What makes each amino acid unique is the R side chain: different amino acids have different R groups and thus possess different properties. For example, some side chains create an amino acid that is hydrophobic; such amino acids tend to cluster on the inside of a protein and thus play a part in determining the three-dimensional structure of the molecule. Other side chains make an amino acid that is hydrophilic — in other words, it interacts readily with water.

figure 67 The ras protein, which acts as a molecular switch governing cell growth. Knowing the structure of this protein in three dimensions may enable scientists to devise methods of turning off the switch in cancer cells. However, computing the way in which a sequence of amino acids will fold is an extremely difficult problem.

Each amino acid is coded for by a set of three RNA nucleotide bases called a codon. Since there are four bases (A, C, G, U) there are 4 x 4 x 4 = 64 codons. In theory, then, codons could code for 64 amino acids — and yet only 20 different amino acids are used in protein synthesis. The genetic code is thus degenerate: 3 of the codons represent an "end of chain" command, and the other 61 codons code for the 20 amino acids. In other words, nearly all amino acids are coded for by several codons. (For example, the amino acid cysteine is coded for by the codons UGU and UGC; isoleucine is coded for by the codons AUU, AUC and AUA; and so on.) The genetic code is essentially universal: with only minor exceptions, all organisms on Earth use it. (Does the universality of the genetic code imply that it is the only possible code? Perhaps there were originally several different codes, and this one just happened to win out over the others. But if the present uniqueness of the code means that it arose only once in the history of life, perhaps the development of an effective code represents a difficult barrier for evolution to overcome.)
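The codon arithmetic can be checked directly. The sketch below builds the standard genetic code from a compact 64-letter string, using one-letter amino acid abbreviations with `*` marking a stop codon. That encoding is a common convention rather than anything taken from the text, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# UCAG order (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

counts = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
cysteine = sorted(c for c, aa in CODON_TABLE.items() if aa == "C")
isoleucine = sorted(c for c, aa in CODON_TABLE.items() if aa == "I")

print(len(CODON_TABLE))            # 64 codons in total
print(stop_codons)                 # ['UAA', 'UAG', 'UGA']
print(len(counts) - 1)             # 20 amino acids (excluding the stop signal)
print(len(CODON_TABLE) - counts["*"])  # 61 codons that specify an amino acid
print(cysteine, isoleucine)        # ['UGC', 'UGU'] ['AUA', 'AUC', 'AUU']
```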

The way a cell goes about synthesizing a protein is at once wonderfully simple and marvelously intricate. A highly simplified version of the process proceeds as follows.

The information on how to build proteins — and thus an organism — is contained in the organism's DNA. First, then, when a cell receives a signal asking for it to produce a certain protein (and let us suppose the protein is a single polypeptide), the double-helix of DNA unzips in the region of the coding strand. This is like the template strand mentioned above and contains information for that particular protein. A region of DNA that codes for a polypeptide (or, more accurately, that codes for some form of RNA) is known as a gene.

An mRNA copy of the gene is made in a transcription process — so called because each triplet in the DNA strand is transcribed into the corresponding codon in mRNA. The mRNA then moves from the nuclear material to the cytoplasm of the cell, taking with it its information on amino acid sequences. Within the cytoplasm, organelles called ribosomes take the mRNA and use the information contained in the codon sequence to synthesize the protein, adding amino acids onto the growing chain. This process is called translation, since a ribosome uses the genetic code to translate from the sequence of codons into a sequence of amino acids. A key ingredient here is tRNA — small molecules, each of which can bind only to a particular amino acid. A series of enzymes is required to catalyze the binding process; each enzyme recognizes one particular tRNA molecule and the corresponding amino acid.
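Transcription can be sketched as a base-by-base mapping. In the Python below, the DNA strand that is read is written left to right from its 3' end, so that each base lines up with the mRNA base it produces. That orientation, the function name and the example sequence are assumptions made for illustration.

```python
# Pairing used when DNA is copied into RNA: RNA carries U where DNA has T.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the mRNA (5'->3') made from a DNA strand written 3'->5'."""
    return "".join(DNA_TO_MRNA[base] for base in template_3to5)


template = "TACACATAAATT"   # hypothetical stretch of DNA, written 3'->5'
mrna = transcribe(template)
codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]

print(mrna)     # AUGUGUAUUUAA
print(codons)   # ['AUG', 'UGU', 'AUU', 'UAA']
```

Each DNA triplet yields exactly one codon, which is the sense in which the gene is "transcribed" rather than translated at this stage.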

figure 68 The DNA molecule stores genetic information, and replicates that information when a cell divides. The expression of that genetic information does not take place directly. Instead, DNA is first transcribed into RNA. Information stored in the "four-letter" alphabet of nucleotides (the alphabet used by RNA) is then translated into the "twenty-letter" alphabet of amino acids (which are used to construct proteins). The Central Dogma of biology, first stated by Francis Crick, is that the information flow follows the direction of the arrows in this diagram. In particular, RNA can synthesize proteins through translation, but reverse translation never occurs.

Protein synthesis always begins with methionine (with codon AUG) and continues until the ribosome encounters one of the stop codons (UAA, UAG or UGA), at which point the protein is released and the synthesis is over. (This provides an outline sketch of protein synthesis, at least for prokaryotic cells. In eukaryotic cells, the process is further complicated by the presence of sequences of DNA that do not code for anything. A further step is required to remove this seemingly useless information. Space here is too limited to go further into the details of protein synthesis, but there are many excellent sources available for further reading [216], and fortunately we do not need extra detail to continue the discussion.)
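The start and stop rules lend themselves to a short sketch. The Python below rebuilds the standard genetic code from the usual 64-letter string (one-letter amino acid abbreviations, `*` for stop) and then reads an mRNA codon by codon. Treating the first AUG as the start is a simplification, and the function name and example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (or the end of the mRNA)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein is made
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the chain is released
        chain.append(amino_acid)
    return "".join(chain)


print(translate("GGAUGUGUAUUUAAGC"))  # MCI (methionine, cysteine, isoleucine)
```

The bases before AUG and after UAA are ignored, so only the stretch between the start and stop signals contributes to the polypeptide.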

To recap: DNA stores genetic information and replicates the information when a cell divides. That is all it does. The messy business of actually expressing the information is left to the more versatile RNA; using the universal genetic code, information is transcribed from DNA into RNA and then translated into protein synthesis.
