A recurring theme over the course of this book is the relationship between genetic variation, in all its different guises from single nucleotide substitutions to large scale structural genomic variation, and the observed phenotype. At the level of the whole organism, the phenotype may be a discrete or continuous trait, which may or may not show mendelian segregation within pedigrees, and in the case of complex traits is the result of multiple genetic and environmental determinants. Over the course of Chapters 2 and 9, approaches to dissecting the genetic factors underlying 'mendelian' and 'complex' traits have been reviewed and the difficulties inherent to such analyses discussed. Even with powerful linkage-based analysis and positional cloning, or more recently the application of genome-wide association analysis to common disease, major roadblocks remain in terms of fine mapping associations and defining specific functional causative variants.
Inherent to the relationship between genetic variation and the observed phenotype is the hypothesis that the underlying genetic diversity has a functional consequence, acting at a molecular or cellular level with potentially diverse pathways and networks involved. Early attention focused on nonsynonymous and other coding variants that had direct consequences for the structure and function of the encoded proteins (Cargill et al. 1999). This was exemplified by studies investigating the genetic basis of structural variants of haemoglobin and the thalassaemias (Section 1.3), together with many other examples described over the course of this book. Increasingly sophisticated tools and techniques are available to predict and test the functional consequences of coding variants, taking into account the sequence, structure, and pathways involved (Chasman and Adams 2001; Uzun et al. 2007). Assigning functionality to such variants remains, however, a complex task, and putative regulatory effects of exonic variation, for example involving exonic splicing enhancers and regulation of splicing, need to be considered (Section 11.6.1) (Cartegni et al. 2002).
With time it has become apparent, notably with the recent associations from genome-wide analyses in common disease but also in many other contexts including variation in the globin genes, that 'regulatory' variants modulating gene expression play an equally important role in determining phenotypic traits. Regulatory variants are typically located in non-coding DNA sequences although, as noted in the preceding paragraph, the categorization into coding and regulatory variants is in some respects artificial, and structural genomic variation such as copy number variation has clear gene dosage effects on gene expression (Section 4.3). Many examples of regulatory variants are given in this book. These range from tandem repeats upstream of INS (encoding insulin) that modulate gene expression (Section 7.3.3), to single nucleotide polymorphisms (SNPs) that modulate the binding sites of specific transcription factors, as at DARC, with dramatic consequences for expression of the encoded receptor and malaria susceptibility (Section 13.2.4), or that create a new promoter, as seen in thalassaemia involving the α globin genes (Section 1.3.9).
The challenge of how to define the extent of regulatory variation and identify specific causative variants remains a major roadblock in the field with relatively few tools available. Major advances have been made relatively recently through analysis of the genetics of gene expression. Here the phenotype of interest is far closer, in terms of pathways and networks, to the underlying genetic diversity as it is the transcriptome, the transcribed RNA, which is being quantified and analysed in relation to genetic variation. Consideration of gene expression as a quantitative trait in principle removes much of the inherent noise and variability associated with analysing a phenotype at the level of the whole organism, which results from a complex interplay between multiple genetic, environmental, and other factors.
The power of genetic analysis to resolve genetic variation modulating gene expression should in principle be considerably greater than for interrogating phenotypic traits for the whole organism. However, it was not clear at the outset to what extent gene expression varies between individuals and populations, whether such traits are heritable, if multiple genetic loci would be involved, or what the effect sizes would be. Over the course of this chapter advances in genetic analysis of gene expression are discussed and illustrated with work from model organisms and human populations, with significant implications for our ability to resolve susceptibility to common disease and other traits (Dermitzakis 2008).
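To make the idea of gene expression as a quantitative trait concrete, the following minimal sketch regresses a transcript's expression level on allele dosage (0, 1, or 2 copies of one allele) at a single variant. It is an illustration only, not a method taken from the studies cited here: the genotypes and expression values are simulated, and the function names, allele frequency, baseline, effect size, and noise level are all hypothetical choices made for the example. The fitted slope estimates the per-allele effect size, and the squared correlation gives the proportion of expression variance explained by the variant under a simple additive model.

```python
import random
import statistics


def simulate_expression(genotypes, baseline=5.0, effect=0.8, noise_sd=0.5, seed=0):
    """Simulate expression values under an additive model (all parameters hypothetical)."""
    rng = random.Random(seed)
    return [baseline + effect * g + rng.gauss(0.0, noise_sd) for g in genotypes]


def additive_regression(genotypes, expression):
    """Ordinary least squares fit of expression on allele dosage.

    Returns (slope, intercept, r_squared). The slope is the estimated
    per-allele effect; r_squared is the share of variance explained.
    """
    mean_g = statistics.fmean(genotypes)
    mean_e = statistics.fmean(expression)
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Simulated cohort: 200 individuals, assumed allele frequency 0.3.
# Each genotype is the sum of two independent allele draws (dosage 0, 1, or 2).
rng = random.Random(1)
genotypes = [int(rng.random() < 0.3) + int(rng.random() < 0.3) for _ in range(200)]
expression = simulate_expression(genotypes)

slope, intercept, r_squared = additive_regression(genotypes, expression)
print(f"per-allele effect: {slope:.2f}")
print(f"baseline expression: {intercept:.2f}")
print(f"variance explained: {r_squared:.2f}")
```

In real data the same regression would be repeated across many transcripts and variants, with adjustment for covariates and multiple testing, none of which this sketch attempts.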