Introduction

As Crick et al. (1976) noted, 'the origin of protein synthesis is a notoriously difficult problem'. This remark refers to protein synthesis in translation using the genetic code. When thinking about difficult evolutionary transitions (cf. Maynard Smith and Szathmáry, 1995) it is rewarding to break down the problem into steps that are more readily soluble by evolution and easier to understand for us. The idea of an RNA world (e.g. Gilbert, 1986) is important because it separates the problem of life's origin from the origin of translation. The origin of the genetic code by itself seems to be burdened by a dual difficulty: no meaningful proteins without the genetic code, and no genetic code without the appropriate proteins (especially synthetases). Fortunately, there is a way out: we know that selected RNA molecules (aptamers) can specifically bind amino acids, and charge them to RNA either in cis or in trans (discussed below). The fact that peptidyl transfer in the ribosome is catalysed by RNA rather than proteins (Moore and Steitz, 2002; Steitz and Moore, 2003) supports the view that there was a way out from the RNA world into ours, aided by RNA itself.

1 Biological Institute, Eotvos University, Budapest

2 Collegium Budapest, Institute for Advanced Study, 2 Szentharomsag utca, H-1014 Budapest, Hungary

3 International Centre for Genetic Engineering and Biotechnology, Padriciano 99, 34012 Trieste, Italy

4 Bioinformatics Group, Biological Research Center, 6726 Szeged, Hungary

5 Animal Ecology Research Group of HAS, Hungarian Natural History Museum, Budapest

6 Parmenides Center for the Study of Thinking, 14a Kardinal-Faulhaber-Str, D-80333 Munich, Germany

M. Barbieri (ed.), The Codes of Life: The Rules of Macroevolution. © Springer 2008

Yet these exciting developments leave the nature of positive selection for the genetic code obscure. Polypeptides must attain a critical size and complexity before they can serve structural and catalytic functions, even in a rudimentary way. Söding and Lupas (2003, p. 837) called attention to the fact that, consonant with the RNA world scenario, the first polypeptides (supersecondary structures) would have been unable to attain a stable conformation by themselves, as witnessed even by contemporary ribosomal proteins: 'The peptides forming these building blocks would not in themselves have had the ability to fold, but would have emerged as cofactors supporting RNA-based replication and catalysis (the "RNA world"). Their association into larger structures and eventual fusion into polypeptide chains would have allowed them to become independent of their RNA scaffold, leading to the evolution of a novel type of macromolecule: the folded protein.' The path from the RNA world presumably went through a marked RNA-polypeptide phase. A corollary of this is the notion that modern metabolism is a 'palimpsest' of the RNA world (Benner et al., 1989) and that the evolution of modern protein aminoacyl-tRNA synthetases may shed little direct light on the very origin of the genetic code if the ancient form of the code was implemented by ribozymes rather than proteins; at most, the evolution of protein synthetases could have been partly analogous to that of RNA synthetases (Wetzel, 1995).

What could have been the force that drove life out of the RNA world? The end result is clear: proteins in general are much more versatile catalysts than RNA, partly by virtue of the greater catalytic potential of 20 amino acids as opposed to 4 nucleotides (e.g. Szathmáry, 1999). In general, replicability and catalytic potential are in conflict: they prefer smaller and larger alphabets, respectively (Szathmáry, 1991, 1992). But evolution has no foresight: one cannot rationalize a transition by noting that the end result is fitter than the starting point; a more or less smooth path on the adaptive landscape must be found.

Some time ago, one of us proposed an idea (Szathmáry, 1990) of how this could have been possible while still keeping catalysis by amino acids in focus. In short, some amino acids could have been utilized as cofactors of ribozymes in a metabolically complex RNA world. According to this scenario, amino acids were linked to specific short oligonucleotides (called handles) by ribozymes, in a manner that followed the logic of the genetic code: one type of amino acid could be charged to several different handles, but each particular handle with a specified sequence was charged with one type of amino acid only. Szathmáry (1990) proposed that this arrangement was highly functional in ribo-organisms because the many different ribozymes in metabolism could have specifically bound the necessary amino acid cofactors by their handles using a straightforward base-pairing mechanism, and that the burden of accurate direct amino acid recognition could have been taken on by a few ribozymes charging amino acids to their cognate handles (Fig. 1): only as many specific charging ribozymes were required as there were different types of amino acids in this system.
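The coding logic of this scenario can be made concrete with a small sketch. It is only an illustration under stated assumptions: the handle sequences and the choice of amino acids below are invented placeholders, not sequences proposed in the text, and the function names are hypothetical. The sketch shows that a many-to-one assignment of handles to amino acids keeps each handle unambiguous, and that the number of charging ribozymes needed equals the number of amino acid types rather than the number of handles.

```python
from collections import defaultdict

# Hypothetical handle sequences, invented purely for illustration.
# Each handle (key) is charged with exactly one amino acid (value).
HANDLE_TO_AMINO_ACID = {
    "GUGCA": "His",
    "AUGCA": "His",  # a second handle for the same amino acid is allowed
    "GUCCA": "Asp",
    "UUCCA": "Ser",
}


def handles_by_amino_acid(table):
    """Invert the handle -> amino acid table (one amino acid, many handles)."""
    grouped = defaultdict(list)
    for handle, amino_acid in table.items():
        grouped[amino_acid].append(handle)
    return {aa: sorted(handles) for aa, handles in grouped.items()}


def charging_ribozymes_needed(table):
    """One charging ribozyme per amino acid type, however many handles it has."""
    return len(set(table.values()))


print(handles_by_amino_acid(HANDLE_TO_AMINO_ACID))
print(charging_ribozymes_needed(HANDLE_TO_AMINO_ACID))
```

Representing the assignment as a mapping keyed by handle enforces the rule that a given handle carries only one type of amino acid, while several keys may share the same value.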

Needless to say, the charging ribozymes are taken to be analogous to modern aminoacyl-tRNA synthetases. The most direct experimental evidence so far for the cofactor idea was found by Roth and Breaker (1998): the very efficient role of histidine in aiding the activity of a selected DNA enzyme that cleaves RNA.

While we think this central idea holds, the original exposition suffered from several shortcomings that were rectified later (Szathmáry, 1993, 1996, 1999). Szathmáry (1990) believed that the first adaptors were simple nucleotides that grew through evolution to trinucleotides and later to tRNA molecules, missing the problem that even trinucleotides are too short to bind well by conventional Watson-Crick base pairing to a complementary sequence in a ribozyme. The minimal, and sufficient, requirement is the interaction of two 'kissing' hairpins that provides sufficient accuracy of binding and residence time (Szathmáry, 1996). Another problem concerns the nature of the bond between amino acids and handles. This bond today is a labile anhydride bond, which is good for protein synthesis but bad for keeping amino acids bound to tRNAs for long. In contrast, stable binding of amino acids to their handles must have been a requirement in the ancient world. It is at this stage that Wong's (1991) ideas step in. He also saw the advantage of amino acids as catalytic aids in the RNA world, and argued in favour of selection for RNA peptidation. In his scenario, peptides consisting of several amino acids could have been linked to one tRNA-like molecule, which makes it difficult for the logic of coding to emerge (T.-F. Wong, personal communication, Budapest, 1996). But Wong correctly identified the stability problem, proposing that amino acids could have been N-linked to some bases, as in contemporary modified bases in tRNA, for example. In fact, this goes back to the old suggestion of Woese (1972), who noted that nucleotide 37 (adjacent to the anticodon) in tRNA was always modified, and that this site could have been the one to which amino acids were attached in ancient times; but nowadays he is 'worried about the energetics of this reaction' (C. Woese, personal communication, email, 2006). We shall come back to this issue later (there seems to be a solution). Importantly, Szathmáry (1999) adopted the stable N-linkage in his coding coenzyme handle (CCH) scenario, which comes at a price: one has to explain how charging was relocated from position 37 to the 3'-end of tRNA, with a different (labile) chemical bond (L. Orgel, personal communication, Stockholm, 1977).

Fig. 1 Scheme of the coding coenzyme handle (CCH) hypothesis. Amino acids are N-linked to anticodonic hairpins by synthetase ribozymes. The products are recognized by complementary loops, embedded in ribozymes that use the linked amino acids as coenzymes. The case shown involves the metabolically prominent histidine

Curiously, Szathmáry has never made a serious conjecture about the order in which amino acids entered the genetic code (which is surely an exciting problem; see e.g. Di Giulio and Trifonov, Chapter 4, this volume), or about the nature of the gradual building up of oligo- and polypeptides (e.g. Di Giulio, 1996). In this chapter we complement the original CCH scenario with these important considerations. We propose that amino acids were first introduced into the genetic code (by ribozymes) according to their catalytic importance (propensity). Later, other amino acids were introduced to allow the formation of β-turns (Jurka and Smith, 1987) and β-sheets (Di Giulio, 1996). Notably, α-helices arrived later, probably at around the same time as β-sheets. The important point is that patterns of the genetic code (partly identified here for the first time) are consistent with this interpretation. The three protein features that correlate with the columnar organization of the genetic code are catalytic propensity, β-turn propensity and β-sheet propensity. We believe that Nature is telling us something with this.

In this chapter we first reveal a new, statistically significant pattern relating the catalytic propensity of amino acids to the columns of the genetic code. Considerations for the primitive ancestry of the anticodon arm of tRNA as an ancient acceptor of amino acids follow next. Then we discuss the appearance of the first oligopeptides with a novel network analysis of amino acid substitutions. Finally, we propose some experiments that could lend support to some of the evolutionary steps suggested by the CCH hypothesis.
