The genetical genomics approach has also been applied successfully to human populations, demonstrating its considerable potential for advancing our understanding of the genetic basis of common complex traits. Initial studies used lymphoblastoid cell lines (Box 11.1) and linkage-based analysis (Monks et al. 2004; Morley et al. 2004). For example, Morley and colleagues analysed gene expression in cell lines established from 14 CEPH families comprising grandparents, parents, and children (the latter on average comprising eight siblings per family) (Morley et al. 2004). The most variably expressed genes among the cell lines established from the 94 grandparents were selected (3554 of 8500 genes assayed) and genome-wide linkage analysis was performed using 2756 autosomal SNPs. Depending on the level of genome-wide significance chosen, 142 expression phenotypes (P = 0.001) or 984 phenotypes (P = 0.05) showed significant linkage (Morley et al. 2004). Again, the distinction between likely cis- and trans-acting loci was based on distance from the target gene: here, linked SNPs lying less than 5 Mb from the gene were classed as cis-acting effects. On this basis, 19% were classified as cis-acting only and 77.5% as trans-acting only. Strictly, however, this classification describes the relative proportions of local and distant regulatory variation; whether the variants are truly cis- or trans-acting remained undefined. Most eQTLs were thus distant and considered to be trans-acting, with two hotspots defined at chromosome 14q32 and chromosome 20q13, regulating seven and six gene expression traits, respectively. Genotyping additional SNPs helped confirm cis-acting loci through association and through analysis of differential allelic expression; the latter showed, for example, an eight-fold difference in expression of PSPHL between alleles of SNP rs6700 (Morley et al. 2004).
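The distance rule used to separate local from distant linkage can be sketched in a few lines of Python. This is an illustration only, not the investigators' code: the function name `classify_linkage`, its arguments, and the example coordinates are hypothetical, and only the 5 Mb cut-off is taken from the text.

```python
CIS_WINDOW_BP = 5_000_000  # 5 Mb cut-off quoted in the text


def classify_linkage(gene_chrom, gene_pos, snp_chrom, snp_pos,
                     window_bp=CIS_WINDOW_BP):
    """Label a linked SNP as 'cis' (local) or 'trans' (distant).

    A SNP counts as local only if it lies on the same chromosome as the
    target gene and less than `window_bp` base pairs away from it.
    Note that this is a positional label, not proof of mechanism.
    """
    same_chromosome = gene_chrom == snp_chrom
    if same_chromosome and abs(gene_pos - snp_pos) < window_bp:
        return "cis"
    return "trans"


# Hypothetical coordinates, for illustration only
print(classify_linkage("7", 55_000_000, "7", 56_200_000))   # same chromosome, 1.2 Mb away
print(classify_linkage("7", 55_000_000, "7", 90_000_000))   # same chromosome, 35 Mb away
print(classify_linkage("7", 55_000_000, "14", 55_100_000))  # different chromosome
```

As the comment in the function notes, a label assigned this way says where the linked marker lies relative to the gene, which is why such calls are better read as local versus distant.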
Monks and colleagues also studied lymphoblastoid cell lines established from CEPH families, performing linkage analysis for 2430 differentially expressed genes in 167 individuals from 15 families (Monks et al. 2004). They defined significant eQTLs for 33 genes at a P value of less than 0.000005, or for 22 genes at genome-wide significance using a conservative Bonferroni correction. Unlike in yeast, there was no clear evidence of linkage hotspots in which particular QTLs controlled expression of multiple genes. The investigators noted that their study was powered only to detect major eQTLs and that those identified for the 33 genes explained more than 50% of the observed variance in gene expression (Monks et al. 2004).
In 2005 Cheung and colleagues published data showing that genetic association, rather than linkage, could be successfully employed for such studies, and in particular demonstrated the power of a genome-wide association approach (Cheung et al. 2005). The study was comparatively modest in size, comprising 57 lymphoblastoid cell lines established from individuals in CEPH pedigrees that were included in the CEU panel of European ancestry in the International HapMap Project (Box 9.1). The investigators built on their earlier linkage analysis, in which they had defined gene expression phenotypes with evidence of cis-acting eQTLs (Morley et al. 2004), and this allowed the utility of the association approach to be assessed. Genetic association using dense SNP sets within 50 kb of target genes overlapped with linkage for 65 out of 374 phenotypes analysed, with association defining a narrower window than the linkage peaks (Fig. 11.4) (Cheung et al. 2005).
For genome-wide SNP analysis, the 27 phenotypes with the strongest evidence of cis-acting eQTLs from linkage analysis were analysed using 770 394 SNP markers. This allowed regression analysis of gene expression by marker genotype, and defined evidence of association at a genome-wide level of significance for 14 out of 27 phenotypes (P < 6.7 × 10⁻⁸) (Cheung et al. 2005). The same region was defined by linkage and genome-wide association for 15 of 27 phenotypes. The study illustrated the power of applying genome-wide association to genetical genomics even with this modest sample size. For a specific marker SNP in the promoter of the CHI3L2 gene (encoding chitinase 3-like 2) that showed significant association, the investigators also demonstrated that the nucleotide substitution modulated reporter gene activity, and showed allele-specific gene expression using the haploChIP approach (Section 11.5.3).
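The two calculations behind this kind of analysis, regressing expression on marker genotype and setting a genome-wide significance threshold, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of Cheung and colleagues: genotypes are assumed to be coded additively as 0, 1, or 2 copies of one allele, the function names and the toy data are hypothetical, and the text does not say which multiple-testing correction was used. The sketch simply shows that, for 770 394 tests at a 0.05 level, a Bonferroni correction gives roughly 6.5 × 10⁻⁸ and a Šidák correction roughly 6.7 × 10⁻⁸.

```python
import math


def genomewide_thresholds(alpha, n_tests):
    """Return (Bonferroni, Sidak) per-test thresholds for n_tests tests."""
    bonferroni = alpha / n_tests
    # Sidak: 1 - (1 - alpha)^(1 / n_tests), written with log1p/expm1
    # so that the tiny result keeps its floating-point precision.
    sidak = -math.expm1(math.log1p(-alpha) / n_tests)
    return bonferroni, sidak


def regress_expression_on_genotype(genotypes, expression):
    """Least-squares fit of expression on genotype (assumed coded 0/1/2).

    Returns (slope, intercept, t_statistic). Turning t into a P value
    needs a Student's t distribution with n - 2 degrees of freedom,
    which the standard library does not provide.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotypes, expression))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_statistic = slope / std_err if std_err > 0 else math.inf
    return slope, intercept, t_statistic


# Thresholds for the number of SNP markers quoted in the text
bonferroni, sidak = genomewide_thresholds(0.05, 770_394)
print(f"Bonferroni: {bonferroni:.2e}, Sidak: {sidak:.2e}")

# Toy data, invented for illustration: six individuals, one marker
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, intercept, t_stat = regress_expression_on_genotype(
    toy_genotypes, toy_expression)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, t={t_stat:.2f}")
```

In a genome-wide scan the regression would be repeated for every marker and every expression phenotype, which is why the per-test threshold has to be so stringent.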
The value of the association-based approach was shown in the same year by Stranger and colleagues, who also analysed lymphoblastoid cell lines established from unrelated individuals in the CEU HapMap panel (in this case 60 cell lines) for 630 genes from the ENCODE (Section 9.2.4) region using 73 712 common SNPs (present at greater than 5% minor allele frequency) (Stranger et al. 2005). Again, even with a modest sample size and gene set, the study was successful in defining up to 40 genes associated with significant local, likely cis-acting, eQTLs.
In 2007, Stranger and colleagues published data that built on this work by taking advantage of the denser SNP marker sets now available as part of Phase II of the HapMap Project (Section 9.2.4) (Stranger et al. 2007b). They analysed 270 lymphoblastoid cell lines, established from four populations defined by geographic ancestry (Frazer et al. 2007). The analysis of cell lines from different human populations provided an important opportunity to assess how reproducible observed associations were between populations. The study was further facilitated by the availability of whole genome expression arrays including some 47 294 probes which, combined with high-density, genome-wide marker SNP coverage, allowed a comprehensive analysis of genome-wide association. Study power was, however, limited by the number of cell lines