Transcriptome and genome sequencing uncovers functional variation in humans

2013 September 26 | Lappalainen et al.
This study reports the first uniformly processed RNA-seq data set from multiple human populations, derived from 1000 Genomes Project samples. The data include mRNA and miRNA sequencing from 462 individuals, enabling the identification of widespread genetic variation affecting gene regulation. Genetic variation in transcript structure and in expression levels is found to be equally common but largely genetically independent. The study characterizes causal regulatory variation, giving insight into the cellular mechanisms of regulatory and loss-of-function variation and allowing inference of putative causal variants for disease-associated loci. Population differences explain a small but significant proportion of total transcriptome variation. Between population pairs, 263 to 4,379 genes show differential expression and/or transcript ratios, and the continental YRI-EUR pairs show a higher contribution of genes with differing transcript usage. The study also quantifies 644 autosomal miRNAs in over 50% of individuals and shows that genetic effects on miRNA expression are more widespread than the previously identified loci suggested.
The study identifies 3,773 genes with classical eQTLs for gene expression levels and 7,825 genes with eQTLs, indicating substantial allelic heterogeneity for regulatory effects on a single gene. It also identifies 639 genes with transcript ratio QTLs (trQTLs) affecting transcript structure; the lower number relative to gene eQTLs is likely due to higher noise in model-based transcript quantifications. The causal variants for the two effect types are independent in 57% of genes, suggesting that transcriptional activity and transcript usage are usually controlled by different regulatory elements.
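To make the idea of a classical eQTL concrete, the sketch below regresses a gene's expression on the alternate-allele dosage (0, 1 or 2) at a single variant. It is an illustration only, not the study's pipeline: the function name `eqtl_slope` and the toy values are assumptions, and a real analysis would add expression normalization, covariates and multiple-testing control.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage (0, 1, 2).

    Hypothetical helper for illustration; the slope is the estimated
    change in expression per additional copy of the alternate allele.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have equal length")
    g_bar = mean(dosages)
    e_bar = mean(expression)
    # Sum of squared genotype deviations; zero means a monomorphic variant.
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("no genotype variation at this variant")
    # Genotype-expression covariance (unnormalized).
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    return sxy / sxx


# Toy data (invented): six individuals, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope = eqtl_slope(toy_dosages, toy_expression)
```

A slope far from zero, relative to its sampling error, is what marks a variant as a candidate eQTL for that gene.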
Regulatory variants are enriched in noncoding elements from the Ensembl Regulatory Build, such as transcription factor peaks, DNase1 hypersensitive sites, and chromatin states of active promoters and strong enhancers. They are also enriched in splice sites and nonsynonymous sites, pointing to putative regulatory functions of coding variants. Transcript ratio QTLs are overrepresented in splice sites and 3'UTRs, indicating the importance of these regions for transcript structure. The study also examines allelic and loss-of-function effects: expression differences between the two haplotypes of an individual allow regulatory variation to be quantified even when eQTLs cannot be detected. Predicted loss-of-function variants, specifically premature stop codon and splice-site variants, have significant effects on the transcriptome. Modeling how genetic variants affect splicing affinity across the entire splicing motif shows that nonreference alleles have lower splicing affinity. The study concludes that integrated analysis of RNA and DNA sequencing data provides a unique view of transcriptome variation and its genetic causes, moving beyond eQTL catalogs to a high-resolution view of genetic regulatory variants.
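Allelic effects of the kind summarized above are usually measured by counting RNA-seq reads that carry each allele at a heterozygous site and asking whether the split departs from the expected balance. The sketch below uses an exact two-sided binomial test for that purpose. The test choice, the function name `allelic_imbalance_p` and the read counts are assumptions made for illustration; the summary does not specify the study's statistical procedure, and real analyses also correct for reference-mapping bias.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, p_ref=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Hypothetical helper: sums the probabilities of all outcomes that are
    no more likely than the observed reference-read count, under the null
    that each read carries the reference allele with probability p_ref.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Toy counts (invented): all ten reads carry the reference allele.
p_skewed = allelic_imbalance_p(10, 0)
# Balanced counts give no evidence of imbalance.
p_balanced = allelic_imbalance_p(5, 5)
```

Because the test is applied within one individual, it can flag regulatory differences between haplotypes without requiring a population-level association.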