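[Figure: phylogenetic tree annotated with gene copy number per taxon (Chicken 2, Zebrafish 4, Lamprey 1, Amphioxus 1, Sea Urchin 1) and the inferred positions of gene duplications.]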

the absence of a gene in an extant genome reflects gene loss or whether it indicates that the gene never existed in that animal lineage. For example, the first engrailed duplication in the vertebrate lineage could have occurred before lamprey diverged, with one copy subsequently being lost in the lamprey lineage. If this scenario is correct, the real timing of the engrailed duplication event probably lies deeper within the tree, at the time of large-scale genomic duplications at the base of the vertebrate lineage.
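The ambiguity can be made concrete with a minimal Python sketch. It assumes a deliberately simplified model in which each duplication adds one copy and each loss removes one; the function name `copies_after` and the event lists are hypothetical, chosen only for illustration. Under both scenarios the lamprey lineage ends with a single copy, so copy number alone cannot distinguish them.

```python
def copies_after(events, start=1):
    """Return the gene copy number after applying events in order.

    Simplifying assumption (not from the text): each duplication
    adds one copy and each loss removes one copy.
    """
    copies = start
    for event in events:
        if event == "duplication":
            copies += 1
        elif event == "loss":
            copies -= 1
        else:
            raise ValueError(f"unknown event: {event}")
    return copies


# Scenario A: the duplication happened after the lamprey lineage diverged,
# so the lamprey lineage experienced no events.
lamprey_late_duplication = copies_after([])

# Scenario B: the duplication preceded the divergence of lamprey,
# and one copy was later lost in the lamprey lineage.
lamprey_early_duplication = copies_after(["duplication", "loss"])

# Both histories predict the same observation in the extant genome.
same_observation = lamprey_late_duplication == lamprey_early_duplication
```

Telling the scenarios apart therefore requires evidence beyond copy counts, such as the placement of the gene within a tree of related sequences.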

Gene divergence

A gene duplication event may initially generate two redundant gene copies if both the coding sequences and cis-regulatory control regions are duplicated. Redundancy can allow one gene copy to be rapidly lost; indeed, gene loss may be a common result following gene duplication. Physical gene loss can occur because of chromosomal deletions, or a duplicated gene can be functionally lost because of the accumulation of deleterious mutations (becoming a pseudogene). Large-scale genomic duplications, including tetraploidization, generate large numbers of redundant genes that are not always maintained, as reflected in the loss of some genes from vertebrate Hox clusters (Fig. 4.3). Nonetheless, large gene families of paralogous genes have evolved due to the evolutionary fixation of duplication events. Why, then, are duplicated genes retained within a genome?

Many duplicated genes persist because of functional divergence between the two paralogs. Paralogs can diverge through changes in their coding sequences that lead to differences in protein function, and they may accumulate changes in their cis-regulatory elements that generate differences in the timing or pattern of gene expression during development (Fig. 4.6). Duplicated genes do not have to evolve new functions to be retained, if ancestral functions are partitioned, or subfunctionalized, between the duplicate copies. Indeed, the modular nature of cis-regulatory elements may facilitate the partitioning of ancestral expression domains. For example, one gene copy may lose an element that is retained in the duplicate, whereas the other gene copy loses a different element. In addition, new elements may be individually gained to produce novel expression patterns.

Figure 4.6

Mechanisms of gene divergence

Gene duplication can create two identical copies of a gene, including both cis-regulatory regions (blue, red, and green shapes) and the coding sequence (purple rectangle). These duplicated genes can functionally diverge over time in several ways. The coding sequences will accumulate changes (indicated in black), which may alter protein function. The ancestral function may be retained by both proteins, split between them, or retained by a single copy, freeing the other to evolve new functions. In addition, cis-regulatory regions may evolve, with separate enhancers acting as independent modules. Over time, an enhancer may be lost in one duplicate, such that the other copy retains that portion of the ancestral expression pattern (indicated by loss of the red square and the green triangle). The duplicated genes can also diverge if new enhancers evolve in one copy (indicated by the black star). In this way, evolution allows the ancestral function of a gene, including both protein function and cis-regulatory control, to be shared and even split between duplicated copies, even as new functions and expression domains continue to evolve.
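The partitioning of regulatory elements between duplicates can be sketched in Python by treating each gene copy as a set of enhancer modules. This is a toy model, not a method from the text: the element labels borrow the colors used in Figure 4.6, and the names `duplicate` and `classify` are hypothetical.

```python
# Ancestral gene with three independent enhancer modules
# (labels follow the colors in Figure 4.6; the model itself is illustrative).
ANCESTRAL = frozenset({"blue", "red", "green"})


def duplicate(gene):
    """Return two independent, initially identical copies of a gene."""
    return set(gene), set(gene)


def classify(copy_a, copy_b, ancestral=ANCESTRAL):
    """Describe how two duplicates relate to the ancestral enhancer set."""
    a_anc = copy_a & ancestral
    b_anc = copy_b & ancestral
    covered = (a_anc | b_anc) == ancestral
    labels = []
    if a_anc == ancestral and b_anc == ancestral:
        labels.append("redundant")
    elif covered and a_anc != ancestral and b_anc != ancestral:
        # Each copy lacks something the other retains.
        labels.append("subfunctionalized")
    elif covered:
        labels.append("one copy retains all ancestral elements")
    else:
        labels.append("ancestral element lost from both copies")
    if (copy_a | copy_b) - ancestral:
        labels.append("novel element gained")
    return labels


copy_a, copy_b = duplicate(ANCESTRAL)
initial = classify(copy_a, copy_b)

copy_a.discard("red")       # one copy loses the red element
copy_b.discard("green")     # the other copy loses the green element
copy_b.add("black_star")    # a new enhancer evolves in one copy
diverged = classify(copy_a, copy_b)
```

In this sketch the pair starts out as `["redundant"]` and ends as `["subfunctionalized", "novel element gained"]`: neither copy can be lost without losing part of the ancestral expression pattern, which is one reason both may be retained.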

One example of duplicated genes that have diverged primarily in their regulation involves the Drosophila gooseberry/gooseberry-neuro gene pair. These linked genes encode functionally redundant Pax family transcription factors, yet are expressed in different tissues in early development. The gooseberry gene, which is one of the Drosophila segment polarity genes, is expressed in stripes in the early embryo. By contrast, gooseberry-neuro is expressed at later stages, in the developing nervous system. The expression patterns of these related genes are controlled by different cis-regulatory elements. Changes in cis-regulation represent the primary evolutionary difference between gooseberry and gooseberry-neuro.

Of course, genes that have not experienced a recent duplication event are also susceptible to change due to mutation. Thus, orthologs in different lineages (and even within a population) also undergo diversification. Changes between orthologs are most evident in sequence differences, though the vast majority of these differences do not apparently alter protein function. The constraint of maintaining a gene's function most often results in stabilizing selection. Divergence in protein function of orthologs is possible, however, in cases of positive selection for a new function, or when interacting proteins coevolve in different lineages.

Assembly of the toolkit: the first animals

The growing list of genes shared by mice and flies (see Chapter 2 and Tables 2.1 & 2.2) and other bilaterians reveals that their common ancestor had an extensive toolkit of developmental genes. Basal animal lineages, including the diploblast phyla (Cnidaria, Ctenophora) and the Porifera (sponges), have much less developmental and morphological complexity than do bilaterians (Fig. 4.1). Does their simpler body organization reflect a smaller complement of toolkit genes? Or did the bilaterian toolkit predate the origin and radiation of animals altogether?

We can track the assembly of the toolkit for animal development by comparing the genes shared among bilaterian phyla to the genes of cnidarians, sponges, and even other eukaryotic organisms. The genome projects for the yeast S. cerevisiae and the flowering plant Arabidopsis, in addition to those for C. elegans, Drosophila, and various deuterostome species, have generated data that facilitate the direct comparison of fungal, plant, and animal genomes.

Many protein domains that are characteristic of animal transcription factors or signaling molecules are found in yeast and plants, and thus have ancient origins predating multicellular life (Table 4.1). For example, homeodomains are present in all multicellular organisms, indicating that this protein motif is older than animals themselves. Other domains found in proteins that are crucial to animal development, however, have not been identified outside the animal kingdom. For example, TGF-β and Wnt signaling molecules, in addition to several families of transcription factors, have been found only in animals (Table 4.1). Thus some toolkit genes evolved from ancient genes, whereas others appeared early in the animal lineage.

The set of genes present in the first animals may also be inferred by the complement of genes found in choanoflagellates, single-celled protozoa thought to be the closest outgroup to the metazoan clade, as well as basal animals (Cnidaria, Porifera). For example, choanoflagellates have genes involved in cell signaling and adhesion in animals, including tyrosine kinase signaling components, indicating that these genes predated the origin of animal multicellularity. Additional toolkit genes appear in the genomes of cnidarians, including components of the Wnt and TGF-β signaling pathways and many transcription factor families (Table 4.1). Cnidarians have several homeodomain-containing genes that are homologous to bilaterian toolkit genes (including even-skipped, engrailed, Distal-less, and Hox genes) and at least four members of the Pax family of transcription factors, several T-box genes (including Brachyury), a Snail homolog (zinc finger), a Twist homolog (helix-loop-helix), and a mef2 (MADS-box) homolog. Clearly, at least some members of the bilaterian toolkit were present in the earlier animal lineages.
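The logic of dating a gene family by its presence in progressively more distant lineages can be sketched in Python. The presence table below is a simplified, illustrative encoding of statements made in the text, not a data set; the lineage ordering, the dictionary names, and the function `most_distant_lineage` are assumptions introduced for this sketch. The inference also ignores gene loss, which can make a family appear younger than it is.

```python
# Lineages ordered from most distantly related to bilaterians to bilaterians
# themselves (an assumed, simplified ordering for illustration).
DIVERGENCE_ORDER = ["plants", "fungi", "choanoflagellates", "cnidarians", "bilaterians"]

# Illustrative presence/absence table paraphrasing the text; a lineage left
# out of a set means "not stated in the text", not "shown to be absent".
PRESENCE = {
    "homeodomain": {"plants", "cnidarians", "bilaterians"},
    "tyrosine kinase signaling": {"choanoflagellates", "bilaterians"},
    "Wnt": {"cnidarians", "bilaterians"},
    "TGF-beta": {"cnidarians", "bilaterians"},
}


def most_distant_lineage(domain):
    """Return the most distantly related lineage in which the domain is found.

    The domain is inferred to be at least as old as the common ancestor
    shared by that lineage and bilaterians.
    """
    for lineage in DIVERGENCE_ORDER:
        if lineage in PRESENCE[domain]:
            return lineage
    raise ValueError(f"{domain} not recorded in any lineage")


origins = {domain: most_distant_lineage(domain) for domain in PRESENCE}
```

Here the homeodomain traces back to the ancestor shared with plants, tyrosine kinase signaling to the ancestor shared with choanoflagellates, and Wnt and TGF-β only to the ancestor shared with cnidarians, matching the ordering of origins described above.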

Although cnidarian genomes contain many of the gene families that contribute to the toolkit, cnidarians appear to have a relatively small number of toolkit genes. For example, bilaterians have at least one additional Pax gene and many more Hox genes than are found in cnidarians. A comparison of the toolkit genes shared among bilaterians with the genes present in cnidarians indicates that the bilaterian toolkit expanded through gene duplication after the divergence of the cnidarian lineage but before the radiation of bilaterian phyla.

TABLE 4.1 Number of genes in shared transcription factor and signaling pathway gene families


Protein domain | Fungi (S. cerevisiae) | Cnidaria and Porifera* | C. elegans

DNA binding
