Affected child: partial trisomy chromosome 10q26.13→qter; partial monosomy chromosome 16p13.3→pter

Figure 1.28 Example of translocation leading to partial trisomy and monosomy involving the α globin gene cluster. A balanced translocation affecting a mother between chromosome 16p13.3 and chromosome 10q26.13 leads to a child with partial monosomy of 16p13.3→pter and partial trisomy 10q26.13→qter; the child had α thalassaemia trait (Buckle et al. 1988).

is lethal during development and therefore monosomy of chromosome 16 is not seen among live born infants. However, individuals with loss of all or part of an X chromosome may survive, leading to Turner syndrome (Box 3.5). Trisomy of certain chromosomes may be compatible with survival, notably chromosomes 21 (Down syndrome) (Box 3.3), 18 (Edwards syndrome), and 13 (Patau syndrome); possession of additional sex chromosomes can lead to specific syndromes such as Klinefelter syndrome (XXY) (Box 3.4).

1.4 Diversity across the genome

1.4.1 Classifying genetic variation

Our journey across the variable landscape of the globin genes has served to highlight many different classes of genetic variation and the functional consequences these may have. Research into human genetic variation has a clear historical context, which is reflected in prevailing views about the relative importance of the different forms of diversity that have been observed (Feuk et al. 2006a). Early reports involved large 'microscopic' structural genomic changes at a chromosomal level, typically at least 3 Mb in size, which changed the quantity or structure of chromosomes; such events were relatively rare but often associated with dramatic consequences for the individual concerned in terms of a clear observed phenotype (described in Chapter 3). Subsequently, DNA sequence level variation was highlighted, with nucleotide substitutions, deletions, or insertions recognized, notably those that modulated coding DNA sequences to change the structure or function of encoded proteins.

The extent of such diversity, and of repetitive elements, became clearer as more sequencing information was generated with technological advances and increasingly automated high-throughput collaborative studies, culminating in the publication of the draft human genome sequence in 2001 (Lander et al. 2001; Venter et al. 2001). Appreciation of the extent of 'simple' nucleotide variation has continued apace with sophisticated analytical approaches and comprehensive cataloguing of diversity within and among human populations through projects such as the International HapMap Project (Section 9.2.4) (Frazer et al. 2007). There has also been a growing appreciation of the importance and frequency of intermediate scale structural genomic variation greater than 1 kb in size, in particular the high frequency of copy number variation, through the development of techniques utilizing microarrays to perform comparative genomic hybridization (Section 4.2) (Redon et al. 2006).

To try to begin to make sense of the richness and nature of human genetic variation, clear definitions and nomenclature are essential together with some form of systematic approach to classification. Over the course of this chapter we have gone from single nucleotide changes through to major structural events involving chromosomal regions. This broadly follows a classification based on size described by Scherer and Lee (Scherer et al. 2007) which serves as an important framework within which to understand the different classes of variation (Fig. 1.29).
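A size-based classification of this kind can be sketched as a simple lookup. The following is a minimal illustration only, not part of the published framework: the class names follow Fig. 1.29, the 1 kb and 3 Mb boundaries are taken from the approximate sizes quoted earlier in this section, and the function name, the `whole_chromosome` flag, and the bin between 2 bp and 1 kb are assumptions made for the example.

```python
KB = 1_000        # 1 kb: lower bound quoted for intermediate scale structural variation
MB = 1_000_000    # 'microscopic' changes are described as typically at least 3 Mb


def classify_by_size(length_bp: int, whole_chromosome: bool = False) -> str:
    """Return an approximate Fig. 1.29-style size class for a variant.

    The boundaries are rough guides, not formal definitions.
    """
    if length_bp < 1:
        raise ValueError("length_bp must be a positive integer")
    if whole_chromosome:
        # Gain or loss of entire chromosomes (for example aneuploidy).
        return "whole chromosomal to whole genome"
    if length_bp == 1:
        return "single nucleotide"
    if length_bp < KB:
        # Assumed intermediate bin for small indels and short tandem repeats.
        return "2 bp to 1 kb (assumed bin)"
    if length_bp < 3 * MB:
        return "1 kb to submicroscopic"
    return "microscopic to subchromosomal"


if __name__ == "__main__":
    for size in (1, 12, 50_000, 5_000_000):
        print(size, "->", classify_by_size(size))
```

In practice the classes overlap (inversions, for instance, occur at several scales), so size alone is a first approximation rather than a complete description of a variant.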

Single nucleotide
• substitutions
• 'indels'
• VNTRs: microsatellites, minisatellites
• inversions
• di-, tri-, tetranucleotide repeats

1 kb to submicroscopic
• copy number variants
• segmental duplications
• inversions, translocations
• copy number variant regions
• microdeletions, microduplications

Microscopic to subchromosomal
• segmental aneusomy
• chromosomal deletions (losses)
• chromosomal insertions (gains)
• chromosomal inversions
• intrachromosomal translocations
• chromosomal abnormality
• heteromorphisms
• fragile sites

Whole chromosomal to whole genome
• interchromosomal translocations
• ring chromosomes, isochromosomes
• marker chromosomes
• aneuploidy

Axis labels: Sequence variation; Structural variation

Figure 1.29 Classes of genomic variation. Redrawn and reprinted by permission from Macmillan Publishers Ltd: Nature Genetics (Scherer et al. 2007), copyright 2007.

The coming chapters will serve to illustrate and describe these different classes through detailed discussion of a number of specific examples. In Chapter 3, chromosomal level variation is reviewed, including gain or loss of whole chromosomes, translocations and chromosome rearrangements, inversions, and other structural variation detectable at a microscopic (cytogenetic) level. Submicroscopic structural variation is then discussed in terms of copy number variation among healthy individuals and its role in susceptibility to disease (Chapter 4) before considering pathogenic copy number variation and genomic disorders (Chapter 5). Segmental duplications are reviewed in Chapter 6 with diverse implications for our understanding of population genetics, evolution, and disease risk. Tandemly repeated DNA is reviewed in Chapter 7, including satellite, minisatellite, and microsatellite repeats. Mobile DNA elements are discussed in Chapter 8 before a review of sequence level diversity including single nucleotide polymorphisms in Chapter 9. Further examples of these and other variants are then considered in the remaining chapters of the book, focused on evidence of selection (Chapter 10), effects on gene expression (Chapter 11), and specific genomic regions (Chapter 12) or diseases (Chapters 13 and 14).

Before concluding this chapter, this seems an appropriate point to complete a story begun earlier when DNA sequencing technologies were introduced (Box 1.10). The fruits of such work were to enable the remarkable feat of sequencing the human genome, and in so doing to provide a reference sequence for comparing and annotating genetic variation, as well as uncovering many new sequence variants which served to highlight the extent of diversity within our genomes.

1.4.2 Sequencing the human genome

In 2001 draft sequences for the human genome were published (Lander et al. 2001; Venter et al. 2001). We now have a detailed route map of our genome which is publicly available and finished to a high degree of accuracy and coverage (IHGSC 2004), although work remains ongoing to close the final gaps in the sequence (Cole et al. 2008). In terms of understanding human genetic variation, such studies have been of fundamental importance. They established a reference human genome sequence to which sequence variants can be mapped and compared, most recently updated as the February 2009 human reference sequence (GRCh37) produced by the Genome Reference Consortium (http://genome.ucsc.edu/cgi-bin/hgGateway?org=Human&db=hg19). However, this is not the sequence of a single individual but rather a composite of DNA sequence derived from different people, reflecting the hierarchical approach to mapping and sequencing used in the Human Genome Project.

The Human Genome Project was launched in 1990, with publication of the draft sequence in 2001 (Lander et al. 2001). It was the result of an international consortium involving 20 scientific laboratories from the United States, United Kingdom, Japan, France, Germany, and China; a remarkable collaborative effort that resulted in the successful sequencing of the human genome, the first vertebrate genome to be extensively sequenced. Over the course of the project sequence data were publicly available and updated daily. In the same year a draft sequence was published by Celera Genomics, a private company (Venter et al. 2001).

Both sequences were based on 'shotgun' sequencing using the Sanger dideoxy method (Box 1.10): 'shotgun' refers to the fact that the clones sequenced come from a library prepared from randomly fragmented genomic DNA. Celera adopted a whole genome shotgun sequencing strategy, while the International Human Genome Sequencing Consortium used a hierarchical shotgun sequencing approach in which clones containing large inserted fragments of genomic DNA are selected to generate an overlapping set of mapped segments of DNA (Fig. 1.30). This facilitates final assembly of the sequence and was felt to be particularly important given the number of repeats, but carried an increased financial cost. A given large insert clone was derived from a single haplotype (the combination of genetic markers or alleles found in a specific region of a single chromosome of a given individual); because libraries, and hence clones, were derived from multiple individuals, the need to sequence overlaps between clones generated significant amounts of data on sequence diversity. The introduction of large insert cloning systems such as bacterial artificial chromosomes (BACs) was important in enabling such work (Shizuya et al. 1992).
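The principle of reconstructing a sequence from overlapping shotgun reads can be illustrated with a toy example. The sketch below is a simplified greedy overlap merge written for illustration only; it is not the assembly method used by either project, and the function names, the `min_len` parameter, and the example reads are all hypothetical. It ignores sequencing errors, reverse-complement reads, and repeats, which are exactly the difficulties that motivated the mapped, hierarchical approach.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical overlapping reads drawn from the sequence ATGGCGTACGTTAGC.
reads = ["ATGGCGT", "GCGTACG", "TACGTTA", "GTTAGC"]

if __name__ == "__main__":
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

With repeated sequence, several reads can overlap equally well in more than one arrangement, and a greedy merge of this kind may join the wrong pieces; restricting assembly to the reads from one mapped large insert clone reduces that ambiguity.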

Sequencing of a complete chromosome from yeast in 1992 (Oliver et al. 1992) and an extensive sequence from the nematode worm in 1994 (Wilson et al. 1994) highlighted the feasibility of large scale sequencing. Pilot projects to assess feasibility and define approaches to sequencing the human genome were completed in 1999, together with some 15% of the human genome sequence. This led on to a 'full scale production' phase: the sequences of the two smallest chromosomes in the genome, chromosomes 22 and 21, were published in 1999 and 2000, respectively (Dunham et al. 1999; Hattori et al. 2000); the draft sequence for the entire genome was published in 2001 (Lander et al. 2001).

Figure 1.30 (steps shown in the diagram): random fragmentation of genomic DNA; cloning into a large fragment cloning vector to generate a library (in this example a BAC library); mapping and organization of large insert clone contigs; selection of an individual BAC to be sequenced; shotgun clones; shotgun sequence; assembly to reconstruct the genome sequence.
