Symmetry Properties

A remarkable property of the genetic code is the degeneracy of the last letter of a codon with respect to pyrimidine exchange, that is, exchange between U and C. In other words, if a given amino acid is encoded by a codon NNU, where N stands for any of the four bases U, C, A, G, the codon NNC encodes the same amino acid (see Table 10). Still more remarkable is the fact that all known variants of the genetic code, including 10 nuclear and 16 mitochondrial ones (Watanabe and Suzuki, 2001; Knight et al., 2001), respect this symmetry. This implies that 32 codons are degenerate under the exchange of U and C in their last letter (two groups of 16 codons each are so defined). An inspection of Table 6 shows that this same symmetry is displayed by the non-power representation system (see Table 7). The binary strings xxxx01 and xxxx10, where x stands for either binary digit, 0 or 1, always encode the same whole number: in this way 32 degenerate binary strings are defined, forming two groups of 16 strings each. It must be remarked that, in an arbitrary coding, there is no need for any connection between the degeneracy distribution (the global property of Table 2) and symmetry (a local property defining 16 degenerate pairs of codons). Furthermore, there is no biochemical necessity for this degeneracy, since at least one way is known for a specific tRNA to recognize U but not C as a third codon base: xo5U (a hydroxymethyluridine derivative) in the first anticodon position may decode U, A, or G, but not C, in the third position of a codon (Watanabe and Suzuki, 2001).
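The pyrimidine symmetry of the standard code can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes the usual compact one-letter listing of the standard genetic code (codons ordered with each base running through U, C, A, G, and "*" marking Stop), and the names `STANDARD_CODE` and `pyrimidine_pairs` are introduced here only for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter notation ("*" = Stop). Codons are
# ordered with the first, second and third base each running through U, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def pyrimidine_pairs(code):
    """Return every (NNU, NNC) pair whose two codons encode the same amino acid."""
    prefixes = ("".join(p) for p in product(BASES, repeat=2))
    return [
        (nn + "U", nn + "C")
        for nn in prefixes
        if code[nn + "U"] == code[nn + "C"]
    ]


pairs = pyrimidine_pairs(STANDARD_CODE)
print(len(pairs), "of 16 NNU/NNC pairs are degenerate")
```

Under these assumptions all 16 pairs pass the check, which corresponds to the 32 degenerate codons mentioned in the text.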

Moreover, the specific degeneracy of the representation system is a consequence of the non-power positional weights chosen, and these are univocally determined by the global degeneracy distribution of the genetic code. That is: (i) we start with a degeneracy table for the standard genetic code; (ii) we find a unique solution for a mathematical isomorphism based on a non-power positional number representation describing this degeneracy; (iii) this choice imposes an internal symmetry in the model; and finally (iv) this internal symmetry is also exhibited by all the known versions of the genetic code, closing in this way a modelling circle: the structural isomorphism is indeed a model of the genetic code as we demonstrate in the following. As a biological consequence we find that the genetic code is blind to the codon's pyrimidine last letter because of its redundancy distribution (or vice versa), a fact not noted previously (the global degeneracy of the code is uniquely linked to its internal organization).
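Steps (i) to (iii) can be illustrated by enumerating all 64 binary strings and counting how many strings represent each whole number. This is only a sketch under a stated assumption: the positional weights 8, 7, 4, 2, 1, 1 are not given in this section and are assumed here because they reproduce the values 7 and 16 listed in Table 17. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: positional weights, most significant first. They are not stated
# in this section; they are chosen because they reproduce Table 17.
WEIGHTS = (8, 7, 4, 2, 1, 1)


def value(bits):
    """Whole number represented by a tuple of six binary digits."""
    return sum(w * b for w, b in zip(WEIGHTS, bits))


def degeneracy_distribution():
    """Map each degeneracy to the count of whole numbers having that degeneracy."""
    # Number of distinct strings representing each whole number.
    representations = Counter(value(bits) for bits in product((0, 1), repeat=6))
    # Number of whole numbers sharing each degeneracy.
    return Counter(representations.values())


distribution = degeneracy_distribution()
print(sorted(distribution.items()))
```

With the assumed weights the result is two numbers of degeneracy 1, twelve of degeneracy 2, two of degeneracy 3 and eight of degeneracy 4, which can be compared directly with the degeneracy table of the code.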

This property thus shows the way for the mathematical modelling of the genetic code, because we have established links between one half of the codons (those ending in a pyrimidine) and one half of the binary strings of the model (those of the form xxxx01 or xxxx10), which share the same symmetry properties (see Table 11). This result has an immediate consequence for the attempt to complete the links between codons and strings: the 32 remaining binary strings are of the form xxxx00 or xxxx11, which means that codons of the form NNR, where R stands for a purine (A or G), are necessarily coded by strings of this kind.
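The split of the 64 strings into two halves can be verified by direct enumeration. The sketch below again assumes the weights 8, 7, 4, 2, 1, 1 (not stated in this section); the variable names are illustrative only.

```python
from itertools import product

# ASSUMPTION: positional weights of the non-power representation.
WEIGHTS = (8, 7, 4, 2, 1, 1)


def value(string):
    """Whole number represented by a length-6 string of '0' and '1'."""
    return sum(w * int(c) for w, c in zip(WEIGHTS, string))


strings = ["".join(p) for p in product("01", repeat=6)]

# Strings of the form xxxx01 or xxxx10 (linked to pyrimidine-ending codons).
mixed_ending = [s for s in strings if s[-2:] in ("01", "10")]
# Remaining strings, of the form xxxx00 or xxxx11 (linked to NNR codons).
uniform_ending = [s for s in strings if s[-2:] in ("00", "11")]

# Exchanging the last two digits never changes the represented number,
# because both final positions carry the same weight.
swap_invariant = all(
    value(prefix + "01") == value(prefix + "10")
    for prefix in ("".join(p) for p in product("01", repeat=4))
)

print(len(mixed_ending), len(uniform_ending), swap_invariant)
```

The invariance holds for any choice of weights in which the last two positions are equal, so it does not depend on the other four assumed values.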

This assertion can be further refined by observing that the strings 000000 and 111111 (the degeneracy-1 strings) necessarily code the degeneracy-1 amino acids,

Table 10 Essential degeneracy in the third letter of pyrimidine-ending codons (3rd letter U or C), shown for the standard genetic code but shared by all known versions of the genetic code (10 nuclear and 16 mitochondrial)

| 1st letter | U | C | A | G | 3rd letter |
|---|---|---|---|---|---|
| U | UUU Phe | UCU Ser | UAU Tyr | UGU Cys | U |
| U | UUC Phe | UCC Ser | UAC Tyr | UGC Cys | C |
| C | CUU Leu | CCU Pro | CAU His | CGU Arg | U |
| C | CUC Leu | CCC Pro | CAC His | CGC Arg | C |
| A | AUU Ile | ACU Thr | AAU Asn | AGU Ser | U |
| A | AUC Ile | ACC Thr | AAC Asn | AGC Ser | C |
| G | GUU Val | GCU Ala | GAU Asp | GGU Gly | U |
| G | GUC Val | GCC Ala | GAC Asp | GGC Gly | C |

Table 11 Equivalence between binary strings and pyrimidine-ending codons

| Length-6 strings (either) | Type of codon (either) |
|---|---|
| xxxx01, xxxx10 | NNU, NNC |

i.e. Methionine and Tryptophan (see Table 12). These two amino acids are coded by codons ending in G, namely AUG and UGG. Thus a final G is coded in one case by a string ending in 00 and in the other by a string ending in 11. Consequently, the endings 11 and 00 alone do not suffice to determine whether a codon ends in G or in A. This indeterminacy can be resolved, however, by taking the parity of the strings into account. Recall that the parity of a string is the parity of the number of ones it contains: for example, 000000 has 0 ones and is thus even, and 111111 has 6 ones and is also even.

Thus we can assert that strings ending in 00 or 11 with even parity code a final G in the corresponding codon, and that, by exclusion, the A ending codons are coded by strings ending in 00 or 11 but with odd parity (see Table 13).
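The rules established so far for the third letter can be collected in a small function. This is a minimal sketch: the function name `third_letter_class` is hypothetical, and the label "Y" (the conventional one-letter symbol for a pyrimidine) is used here for the U-or-C case, which the rules do not resolve further.

```python
def third_letter_class(string: str) -> str:
    """Classify the third codon letter encoded by a length-6 binary string.

    Returns "Y" (pyrimidine, U or C) for strings ending in 01 or 10,
    "G" for strings ending in 00 or 11 with even parity, and
    "A" for strings ending in 00 or 11 with odd parity.
    """
    if len(string) != 6 or set(string) - {"0", "1"}:
        raise ValueError("expected a string of six binary digits")
    if string[-2:] in ("01", "10"):
        return "Y"
    # Parity is the parity of the number of ones in the whole string.
    number_of_ones = string.count("1")
    return "G" if number_of_ones % 2 == 0 else "A"


for s in ("000000", "111111", "010000", "101111", "001101"):
    print(s, third_letter_class(s))
```

The two degeneracy-1 strings 000000 and 111111 are both classified as G-ending, in agreement with the codons AUG and UGG.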

Following the degeneracy table of the genetic code, the pair of degeneracy-3 elements also needs to be univocally assigned. This pair corresponds in the standard code to Ile and Stop. But we can observe that a more symmetric case corresponds to the euplotid nuclear version of the genetic code in which the Stop codon UGA is assigned to Cysteine.

Thus, the degeneracy-3 pair of amino acids is composed of Cysteine and Isoleucine as shown in Table 14. For the euplotid nuclear version degeneracy-1 and

Table 12 G-ending codons corresponding to the degeneracy-1 amino acids Trp and Met (the latter also representing the start signal)

| 1st letter | U | C | A | G | 3rd letter |
|---|---|---|---|---|---|
| U | TTT Phe | TCT Ser | TAT Tyr | TGT Cys | U |
| U | TTC Phe | TCC Ser | TAC Tyr | TGC Cys | C |
| U | TTA Leu | TCA Ser | TAA Stop | TGA Cys | A |
| U | TTG Leu | TCG Ser | TAG Stop | TGG Trp | G |
| C | CTT Leu | CCT Pro | CAT His | CGT Arg | U |
| C | CTC Leu | CCC Pro | CAC His | CGC Arg | C |
| C | CTA Leu | CCA Pro | CAA Gln | CGA Arg | A |
| C | CTG Leu | CCG Pro | CAG Gln | CGG Arg | G |
| A | ATT Ile | ACT Thr | AAT Asn | AGT Ser | U |
| A | ATC Ile | ACC Thr | AAC Asn | AGC Ser | C |
| A | ATA Ile | ACA Thr | AAA Lys | AGA Arg | A |
| A | ATG Met | ACG Thr | AAG Lys | AGG Arg | G |
| G | GTT Val | GCT Ala | GAT Asp | GGT Gly | U |
| G | GTC Val | GCC Ala | GAC Asp | GGC Gly | C |
| G | GTA Val | GCA Ala | GAA Glu | GGA Gly | A |
| G | GTG Val | GCG Ala | GAG Glu | GGG Gly | G |

Table 13 Equivalence between binary strings and purine-ending codons, which also takes into account the string's parity (see text)

| Length-6 strings (either) | Parity | Type of codon |
|---|---|---|
| xxxx11, xxxx00 | Odd | NNA |
| xxxx11, xxxx00 | Even | NNG |

degeneracy-3 amino acids are associated in pairs defining entirely symmetric quartets (Table 15).

In turn, these quartets are related by a degeneracy-conserving transformation that consists of the exchange U↔A in the first letter and U↔G in the second, as also shown by the arrows in Table 15. As there are only two degeneracy-3 amino acids, the corresponding codons can be assigned to the strings defining the degeneracy-3 numbers in the non-power model following the rules developed up to now, as shown in Table 13. In so doing, an important symmetry of the genetic code having a precise mathematical counterpart becomes evident: palindromic symmetry. In the string space this symmetry is expressed by a complement-to-one operation, that is, a 0→1 and 1→0 exchange in the binary digits. That this operation changes the strings of one quartet into the strings of the palindromically associated quartet can be easily verified by recalling, first, that the two amino acids of degeneracy 1 are assigned to the strings 111111 and 000000, which are evidently complementary, as shown in Table 16, and, secondly, by inspecting the strings shown in Table 17.
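The degeneracy-conserving transformation between the two quartets can be written out explicitly. The sketch below is illustrative only: it covers just the AU and UG quartets with their euplotid nuclear assignments, and the names `transform` and `QUARTETS` are hypothetical.

```python
from collections import Counter

# Exchange rules: U <-> A in the first letter, U <-> G in the second letter.
FIRST_LETTER = {"U": "A", "A": "U"}
SECOND_LETTER = {"U": "G", "G": "U"}

# The two quartets in the euplotid nuclear version (UGA assigned to Cys).
QUARTETS = {
    "AUU": "Ile", "AUC": "Ile", "AUA": "Ile", "AUG": "Met",
    "UGU": "Cys", "UGC": "Cys", "UGA": "Cys", "UGG": "Trp",
}


def transform(codon: str) -> str:
    """Apply the exchange to a codon of the AU or UG quartet."""
    first, second, third = codon
    if first not in FIRST_LETTER or second not in SECOND_LETTER:
        raise ValueError("transformation only defined here for the AU and UG quartets")
    return FIRST_LETTER[first] + SECOND_LETTER[second] + third


# Degeneracy of each amino acid within the two quartets.
degeneracy = Counter(QUARTETS.values())

conserved = all(
    degeneracy[aa] == degeneracy[QUARTETS[transform(codon)]]
    for codon, aa in QUARTETS.items()
)
print(transform("AUG"), transform("UGA"), conserved)
```

Applying the function twice returns the original codon, so the transformation pairs the codons of the two quartets one to one: Met with Trp and each Ile codon with a Cys codon.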

The palindromic symmetry involves not only the two quartets analysed before but also, as is shown below, every quartet of the genetic code. This powerful

Table 14 Degeneracy-3 amino acids in the euplotid nuclear version of the genetic code

| 1st letter | U | C | A | G | 3rd letter |
|---|---|---|---|---|---|
| U | TTT Phe | TCT Ser | TAT Tyr | TGT Cys | U |
| U | TTC Phe | TCC Ser | TAC Tyr | TGC Cys | C |
| U | TTA Leu | TCA Ser | TAA Stop | TGA Cys | A |
| U | TTG Leu | TCG Ser | TAG Stop | TGG Trp | G |
| C | CTT Leu | CCT Pro | CAT His | CGT Arg | U |
| C | CTC Leu | CCC Pro | CAC His | CGC Arg | C |
| C | CTA Leu | CCA Pro | CAA Gln | CGA Arg | A |
| C | CTG Leu | CCG Pro | CAG Gln | CGG Arg | G |
| A | ATT Ile | ACT Thr | AAT Asn | AGT Ser | U |
| A | ATC Ile | ACC Thr | AAC Asn | AGC Ser | C |
| A | ATA Ile | ACA Thr | AAA Lys | AGA Arg | A |
| A | ATG Met | ACG Thr | AAG Lys | AGG Arg | G |
| G | GTT Val | GCT Ala | GAT Asp | GGT Gly | U |
| G | GTC Val | GCC Ala | GAC Asp | GGC Gly | C |
| G | GTA Val | GCA Ala | GAA Glu | GGA Gly | A |
| G | GTG Val | GCG Ala | GAG Glu | GGG Gly | G |

Table 15 Degeneracy-conserving transformation, i.e. exchange U↔A in the first letter and U↔G in the second letter of the codon. The transformation connects two quartets involving the two degeneracy-1 (Met and Trp) and the two degeneracy-3 (Ile and Cys) amino acids (euplotid nuclear version)

| 1st letter | U | C | A | G | 3rd letter |
|---|---|---|---|---|---|
| U | TTT Phe | TCT Ser | TAT Tyr | TGT Cys | U |
| U | TTC Phe | TCC Ser | TAC Tyr | TGC Cys | C |
| U | TTA Leu | TCA Ser | TAA Stop | TGA Cys | A |
| U | TTG Leu | TCG Ser | TAG Stop | TGG Trp | G |
| C | CTT Leu | CCT Pro | CAT His | CGT Arg | U |
| C | CTC Leu | CCC Pro | CAC His | CGC Arg | C |
| C | CTA Leu | CCA Pro | CAA Gln | CGA Arg | A |
| C | CTG Leu | CCG Pro | CAG Gln | CGG Arg | G |
| A | ATT Ile | ACT Thr | AAT Asn | AGT Ser | U |
| A | ATC Ile | ACC Thr | AAC Asn | AGC Ser | C |
| A | ATA Ile | ACA Thr | AAA Lys | AGA Arg | A |
| A | ATG Met | ACG Thr | AAG Lys | AGG Arg | G |
| G | GTT Val | GCT Ala | GAT Asp | GGT Gly | U |
| G | GTC Val | GCC Ala | GAC Asp | GGC Gly | C |
| G | GTA Val | GCA Ala | GAA Glu | GGA Gly | A |
| G | GTG Val | GCG Ala | GAG Glu | GGG Gly | G |

symmetry connects quartets with the same degeneracy distribution and strings related by the complement-to-one operation. We postpone the complete analysis of this symmetry until the discussion of the placement of the remaining amino acids into the number representation map shown in Table 7.
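The complement-to-one operation and its effect on the represented numbers can be sketched as follows. The weights 8, 7, 4, 2, 1, 1 are an assumption (they are not stated in this section but reproduce the values 7 and 16 of Table 17), and the helper names are introduced here only for illustration.

```python
from itertools import product

# ASSUMPTION: positional weights of the non-power representation.
WEIGHTS = (8, 7, 4, 2, 1, 1)

# Translation table exchanging the binary digits 0 and 1.
_SWAP = str.maketrans("01", "10")


def value(string: str) -> int:
    """Whole number represented by a length-6 string of '0' and '1'."""
    return sum(w * int(c) for w, c in zip(WEIGHTS, string))


def complement(string: str) -> str:
    """Complement-to-one: exchange every 0 with 1 and every 1 with 0."""
    return string.translate(_SWAP)


# Strings of the AU quartet (Ile, Met) and their complements.
for s in ("001101", "001110", "010000", "000000"):
    c = complement(s)
    print(s, value(s), "<->", c, value(c))

# A string and its complement together set every digit exactly once,
# so their values always add up to the sum of the weights.
always_23 = all(
    value(s) + value(complement(s)) == sum(WEIGHTS)
    for s in ("".join(p) for p in product("01", repeat=6))
)
print(always_23)
```

Because the strings have even length, complementing a string leaves its parity unchanged, and the endings 01 and 10, or 00 and 11, are exchanged with each other. The complement therefore keeps a codon within the same third-letter class under the rules given above.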

Table 16 Binary strings corresponding to G-ending codons of the quartets AU and UG

| # | Length-6 strings | Parity | Type of codon | Amino acid |
|---|---|---|---|---|
| 0 | 000000 | Even | AUG | Met |
| 23 | 111111 | Even | UGG | Trp |

Table 17 Binary strings assigned to the degeneracy-3 amino acids following the rules previously established: xxxx01 or xxxx10 are U- or C-ending codons; codons ending in A correspond instead to xxxx00 or xxxx11 strings with odd parity (an odd number of ones). Observe that the sum of the numbers represented equals 23 and that this corresponds to the complement-to-one operation at the binary strings level (palindromic symmetry)

| # | Length-6 strings | Parity | Type of codon | Amino acid |
|---|---|---|---|---|
| 7 | 001101 | Odd | AUU | Ile |
| 7 | 001110 | Odd | AUC | Ile |
| 7 | 010000 | Odd | AUA | Ile |
| 16 | 110010 | Odd | UGU | Cys |
| 16 | 110001 | Odd | UGC | Cys |
| 16 | 101111 | Odd | UGA | Cys |
