The Sugar Code: Basic Principles

To succeed as hardware for information transfer, any substance class must offer the potential for specific coding. The message has to be deciphered with sufficient biochemical affinity and with a low probability of ambiguity and misinterpretation. A high-density coding capacity is beneficial because it keeps the active sections of biomolecules small, thereby reducing the energetic expense of synthesis. Moreover, easy spatial accessibility and the potential for rapid structural modulation, by reversible variations of the chain length and/or the introduction of small but decisive substituents, are important factors in the design of an efficient code system. This set of conditions describes the frame in which the quality of biological coding is to be rated. When the theoretical storage capacity, expressed as the total number of isomers, is calculated without preconceptions, it takes little persuasion to show that nucleotides and amino acids are surpassed, by far, by another class of biomolecules.

Currently, carbohydrates have their main place in textbooks in chapters on energy metabolism and cell wall composition. The regular, repetitive arrangement of monosaccharides in plant, insect, fungal, or bacterial cell walls or coats tempts one to underestimate the other inherent talents of carbohydrates. Yet these talents are readily discernible on close inspection of a simple structural representation (Fig. 1). Each monosaccharide offers several hydroxyl groups for oligomer formation by glycosidic bonds, including the one at the anomeric C1 position. In contrast to nucleic acids and proteins, branching of chains is a common feature of the glycan part of cellular glycoconjugates (glycoproteins, glycolipids). Taking stock of these peculiarities of monosaccharide structure, the total number of isomer permutations for a hexamer built from an alphabet of 20 letters (monosaccharides) reaches the staggering number of 1.44 × 10¹⁵ (Laine, 1997). Under the same conditions only 6.4 × 10⁷ (20⁶) structures can be devised from 20 amino acids, and the four nucleotides yield just 4096 (4⁶) hexanucleotides. Allowing two different substitutions in a hexasaccharide, which occur in Nature, e.g. as sulfation in glycosaminoglycan chains, further increases the number of isomers by more than two orders of magnitude (Laine, 1997). In the prophetic words of Winterburn and Phelps, "carbohydrates are ideal for generating compact units with explicit informational properties, since the permutations on linkages are larger than can be achieved by amino acids, and, uniquely in biological polymers, branching is possible" (Winterburn and Phelps, 1972).

Fig. 1 The different graphic representations of the structure of a hexopyranose using α-D-glucose (Glc) as example (top). Commonly, the Haworth formula (middle), with the ring placed perpendicular to the plane, is given preference over the traditional Fischer projection of the hemiacetal (left). The relative positioning of the axial and equatorial substituents can readily be visualized by drawing the relatively rigid and energetically privileged chair conformation (right). For the formation of an acetal (disaccharide) by a glycosidic bond, using D-galactose (Gal), the 4'-epimer of glucose, as example, the anomeric hydroxyl group of the left monosaccharide can theoretically react with any of the five acceptors present on a second hexopyranose, yielding 11 isomers with full consideration of the two anomeric positions (bottom). The structure of the β1-3-linked digalactoside is drawn in Fig. 3
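The simpler counts quoted above can be checked with a few lines of Python. This is a minimal sketch, not the calculation of Laine (1997): the figure of 1.44 × 10¹⁵ for hexasaccharides is not reproduced here. The function names are hypothetical. The enumeration scheme for the 11 disaccharide isomers of Fig. 1 is an assumption about how that number arises: the donor is α or β, the acceptor hydroxyls sit at positions 1, 2, 3, 4 and 6, and for a 1-1 linkage between two identical residues α/β and β/α are counted once.

```python
from itertools import product


def linear_sequences(alphabet_size: int, length: int = 6) -> int:
    """Number of linear sequences of the given length (no branching, one linkage type)."""
    return alphabet_size ** length


def homodisaccharide_isomers(acceptor_positions=(1, 2, 3, 4, 6)):
    """Enumerate linkage isomers of a disaccharide built from two identical hexopyranoses.

    Assumed counting scheme (illustrative only):
    - the donor always reacts through its anomeric C1 hydroxyl, as alpha or beta;
    - for acceptor positions 2, 3, 4 and 6 the acceptor's own anomeric centre stays free;
    - for a 1-1 linkage both anomeric centres are fixed, and because the two
      residues are identical the order of the two anomers does not matter.
    """
    anomers = ("alpha", "beta")
    isomers = set()
    for donor_anomer, position in product(anomers, acceptor_positions):
        if position == 1:
            for acceptor_anomer in anomers:
                pair = tuple(sorted((donor_anomer, acceptor_anomer)))
                isomers.add(("1-1", pair))
        else:
            isomers.add((f"1-{position}", (donor_anomer,)))
    return isomers


if __name__ == "__main__":
    print(linear_sequences(20))   # hexapeptides from 20 amino acids
    print(linear_sequences(4))    # hexanucleotides from 4 nucleotides
    print(len(homodisaccharide_isomers()))
```

Under these assumptions the eight 1-2, 1-3, 1-4 and 1-6 linkages plus the three distinct 1-1 linkages give the 11 isomers stated in the caption of Fig. 1.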

It is not treading on thin ice to follow the authors to their conclusion that "the significance of the glycosyl residues is to impart a discrete recognitional role on the protein" (Winterburn and Phelps, 1972), and it is not surprising that at least 1.0% of the translated genome in animals is devoted to the generation of code words, with as many as 70% of proteins harboring the tripeptide sequon for N-glycosylation (Reuter and Gabius, 1999; Varki and Marth, 1995; Wormald and Dwek, 1999). The core region and complex extensions of this ubiquitous type of protein glycosylation in eukaryotes are shown in Fig. 2, which gives a graphic example of how branching sets in and how to read the sugar code. Each linkage is characterized by the anomeric configuration and the positions of the two linkage points, such as β1-4 as opposed to α1-4 or α1-3. Since nucleotide sugars are employed as donors by the glycosyltransferases (Brockhausen and Schachter, 1997; Sears and Wong, 1998), chain growth generally involves the anomeric position, which restricts the range of products of enzymatic synthesis relative to all theoretically possible isomers. Nonetheless, the staggering complexity of glycan structures has placed severe obstacles in the way of moving beyond merely acknowledging the enormous potential for structural variability towards precise structure determination.
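How a single "letter junction" of the sugar code is read can be made concrete with a small parser for the linkage notation used above (anomeric configuration, then the donor and acceptor positions). This is only an illustrative sketch: the `Linkage` type and `parse_linkage` function are hypothetical names, and the pattern covers just the compact form that appears in this text (e.g. β1-4, α1-3, α2-3), not any full nomenclature standard.

```python
import re
from typing import NamedTuple


class Linkage(NamedTuple):
    """One glycosidic linkage: anomeric configuration plus the two linkage points."""
    anomer: str             # "α" or "β"
    donor_position: int     # anomeric carbon of the donor (1 for hexoses, 2 for sialic acids)
    acceptor_position: int  # hydroxyl position on the acceptor sugar


# Assumed compact notation: anomer symbol, donor position, hyphen, acceptor position.
_LINKAGE_PATTERN = re.compile(r"^([αβ])([1-9])-([1-9])$")


def parse_linkage(text: str) -> Linkage:
    """Parse strings such as 'β1-4' into their three components."""
    match = _LINKAGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised linkage: {text!r}")
    anomer, donor, acceptor = match.groups()
    return Linkage(anomer, int(donor), int(acceptor))


if __name__ == "__main__":
    for notation in ("β1-4", "α1-4", "α1-3"):
        print(notation, parse_linkage(notation))
```

The three linkages printed here differ in only one or two of the three fields, which is exactly the kind of distinction that separates one code word from another.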

These problems have mainly been solved by the development of sophisticated isolation and analysis methods that combine the power of liquid chromatography, capillary zone electrophoresis, mass spectrometry, and NMR spectroscopy with that of biochemical reagents such as endo- and exoglycosidases and sugar receptors (Cummings, 1997; Geyer and Geyer, 1998; Hounsell, 1997; Reuter and Gabius, 1999). Application of these techniques has revealed that subtle variations and modifications are especially frequent in the terminal, spatially accessible sections of the sugar antennae. Such strategic placement of distinctive substitutions is what one would expect for a role in information transfer. The modifications consist of the introduction of small substituents (sulfate and O-acetyl groups, etc.) into sugar moieties such as N-acetylgalactosamine or N-acetylneuraminic acid, comparable to the formation of an umlaut in the German language, or of directing a synthetic intermediate to various end products by mutually exclusive refinements, e.g. α1-3 fucosylation, α2-3/6 sialylation, and 4-sulfation (Hooper et al., 1997; Reuter and Gabius, 1996, 1999; Reutter et al., 1997; Sharon and Lis, 1997; Varki, 1996). Intercellular and temporal flexibility turns the available letter repertoire into an array of alternative structures (biosignals). Indeed, the observations that the profile of glycans is not strictly genetically coded but is influenced by the presence and relative positioning of the set of enzymes in the assembly line, and by the actual availability of activated substrates such as nucleotide donors, argue in favor of purpose rather than randomness (Abeijon et al., 1997; Pavelka, 1997; Varki, 1998). Thus, the prerequisite for rapid and multifarious modulation mentioned in the introductory paragraph is adequately fulfilled in the sugar code.

Fig. 2 Structure of the core pentasaccharide of N-glycans given within the frame and the additional branching yielding a penta-antennary complex-type sugar structure (left)

In view of the assumed importance of maintaining diversity, a multicellular organism that lacks one of the mentioned pathways makes it possible to ask whether this deficit is accompanied by any remodeling of the overall glycosylation system. Assisted by genome sequencing, it can indeed be proposed that the absence of sialylation in the nematode Caenorhabditis elegans might be compensated by elaboration of another part of the enzymatic machinery. The discovery of 18 different genes for putative fucosyltransferases in the genome of this nematode argues in favor of this notion (Oriol et al., 1999). In these authors' own words, "for some unknown reasons, these nematodes have favored through evolution fucosylation instead of sialylation of their terminal nonreducing oligosaccharide epitopes or glycotopes and since sialic acid and fucose are usually in competition for the same acceptors, the lack of all forms of sialic acid in C. elegans fits well with a large expression of different fucosyltransferase genes, making this animal an ideal model for evolutionary studies of fucosyltransferases" (Oriol et al., 1999). Together, these glycosylation reactions result in a typical pattern of glycan chains at the level of cells and organs, which is as characteristic as a fingerprint or a signature. Yeast cells, for example, produce mannose-rich surface glycans, while multicellular organisms prominently display complex-type glycans rich in histo-blood group epitopes. Enzymes for these extensions at the end of antennae (Fig. 2) typically reside in the medial- and trans-Golgi regions. Since the number of activities operating upon these sections has expanded especially in the animal kingdom, it is rather unreasonable to assume that these refinements have survived fortuitously. The driving force of this evolutionary process can be attributed to functions of the glycans that range from purely physical aspects, such as solubility or protection of the surface against proteolytic attack, to any involvement in recognition (Drickamer and Taylor, 1998; Gagneux and Varki, 1999; Reuter and Gabius, 1999; Sharon and Lis, 1997; Varki, 1996).

A principal comment is warranted on the surmised evolutionary mechanisms by which the letters for the alphabet of this code system were selected. As insightfully discussed by Hirabayashi (1996), elementary hexose synthesis under prebiotic conditions was most probably facilitated by the following cascade. It started with formol condensation, yielding the basic trioses known from glycolysis. The next steps were the aldol condensation to 3,4-trans-ketoses and the conversion of D-fructose to D-glucose and D-mannose via an enediol intermediate and keto-enol tautomerism (Lobry de Bruyn rearrangement). Notably, D-glucose harbors no 1,3-diaxial interactions involving a hydroxyl group (Fig. 1), and the favored "tridymite" water structure is maintained in the presence of equatorial hydroxyl groups (Uedaira and Uedaira, 1985). In mannose, as in galactose (a biochemical derivative obtained by the NAD+-dependent epimerization of glucose), only one hydroxyl group is axial, which keeps unfavorable 1,3-diaxial interactions and perturbation of the solvent structure minimal. In contrast to the 2'- and 4'-epimers, the 3'-epimer has two 1,3-diaxial interactions. Origin by synthesis under prebiotic conditions and these energetic consequences thus account for the organization of the initial hardware of the sugar code. From these sugars, further letters of the alphabet, comprising the N-acetyl derivatives of the 2'-amines of glucose and galactose, L-fucose, D-xylose, and N-acetylneuraminic acid, are produced biosynthetically. Interestingly, the core section of N-glycans (Fig. 2) is composed of basic units of presumably prebiotic origin. This fact invites speculation on a relationship between evolutionary pathways at the level of eukaryotic organisms and at the level of glycan complexity. Setting this aspect aside in this context (it is discussed further elsewhere: Drickamer and Taylor, 1998; Gagneux and Varki, 1999; Hirabayashi, 1996; Oriol et al., 1999), it can at least be reliably concluded at this stage that oligosaccharides deserve attention as coding units by virtue of their inherent potential for ample sequence permutations, including variations in the anomeric position and in the linkage groups of a glycosidic bond. Remarkably, recent work extends the capacity for information storage from the two dimensions of linear and branched oligosaccharide chains to the third dimension.
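The 1,3-diaxial argument made in this paragraph (one interaction for the 2'- and 4'-epimers of glucose, two for the 3'-epimer) can be illustrated with a toy count on the six-membered pyranose ring. This is a simplified sketch under stated assumptions, not a conformational energy calculation and not a method taken from the cited work: the ring is taken as a chair in which all other substituents are equatorial, an axial hydroxyl is assumed to clash with the axial hydrogens on the two ring atoms two positions away, and the ring oxygen is assumed to carry no substituent. All names in the code are hypothetical.

```python
# Ring atoms of a hexopyranose in order around the ring; O5 is the ring oxygen.
RING = ("C1", "C2", "C3", "C4", "C5", "O5")


def syn_axial_partners(carbon: str) -> list:
    """Ring carbons whose axial substituent faces an axial group on `carbon`.

    Assumption: 1,3-related ring atoms are those two steps away in either
    direction; the ring oxygen has no axial substituent and is skipped.
    """
    index = RING.index(carbon)
    neighbours = (RING[(index - 2) % 6], RING[(index + 2) % 6])
    return [atom for atom in neighbours if atom != "O5"]


# Position of the single axial hydroxyl in each epimer of glucose discussed above.
EPIMERS = {
    "2'-epimer (D-mannose)": "C2",
    "3'-epimer (D-allose)": "C3",
    "4'-epimer (D-galactose)": "C4",
}

if __name__ == "__main__":
    for name, carbon in EPIMERS.items():
        partners = syn_axial_partners(carbon)
        print(f"{name}: {len(partners)} 1,3-diaxial interaction(s) with {partners}")
```

In this simplified picture an axial group at C2 or C4 has the ring oxygen as one of its two 1,3-related neighbours, which is why only the 3'-epimer ends up with two interactions.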
