An integrative genomics approach to infer causal associations between gene expression and disease

2005 July | Eric E Schadt¹, John Lamb¹, Xia Yang², Jun Zhu¹, Steve Edwards¹, Debraj GuhaThakurta¹, Solveig K Sieberts¹, Stephanie Monks³, Marc Reitman⁴, Chunsheng Zhang¹, Pek Yee Lum¹, Amy Leonardson¹, Rolf Thieringer⁵, Joseph M Metzger⁶, Liming Yang⁸, John Castle¹, Haoyuan Zhu¹, Shera F Kash¹, Thomas A Drake⁸, Alan Sachs¹, and Aldons J Lusis²
This study presents an integrative genomics approach for inferring causal associations between gene expression and disease. The method combines DNA variation and gene-expression data with other complex-trait data from segregating mouse populations to identify potential key drivers of complex traits. It systematically tests whether variations in DNA that lead to variations in relative transcript abundances statistically support an independent, causative, or reactive function relative to the complex traits under consideration. The method was validated on simulated data and then applied to large-scale genotypic, gene-expression, and complex-trait data to identify genes involved in susceptibility to obesity. The study demonstrates that gene-expression data from segregating populations can be used to predict transcriptional responses to single-gene perturbation experiments, and it identifies and experimentally validates three new genes involved in obesity susceptibility.
The method uses a likelihood-based causality model selection (LCMS) test to determine which relationship among traits is best supported by the data. The LCMS procedure was validated using simulated data and an experimental dataset in which the relationship among expression traits was known. The study shows that DNA variation enhances the ability to order complex traits.
The LCMS procedure was then applied to identify causal genes for obesity in mice. It identified 113 genes as the most significant candidates for the omental fat pad mass (OFPM) trait, with Hsd11b1 among the best candidates. In support of a role for Hsd11b1 in obesity, its expression is significantly correlated with the OFPM trait in the BXD set, and its activity and mRNA levels are significantly correlated with fat mass and insulin sensitivity in humans.
The study also validated Zfp90, C3ar1, and Tgfbr2 as causal genes for obesity. These genes are significantly correlated with the OFPM trait in the BXD set, and their activity can lead to significant variation in the OFPM trait. Zfp90, which may have a previously uncharacterized role in the regulation of obesity traits, is a central node in the liver transcriptional network and falls upstream of several key genes predicted to be causally associated with the OFPM trait.
The study discusses the limitations of the LCMS procedure, including its dependence on measurement and modeling errors and its inability to discriminate between highly correlated traits, and it highlights the need for further statistical considerations in the analysis of high-dimensional data. Despite these limitations, the study concludes that partitioning genes into causal and reactive sets, and identifying the targets from the causal set that are optimally placed in the gene network associated with the complex traits of interest, offers a promising approach to understanding the complex network of gene changes associated with complex traits such as common human diseases.
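The three-way comparison that the summary attributes to the LCMS test (causative, reactive, or independent) can be illustrated with a small simulation. The code below is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a single locus `L` coded 0/1/2, one expression trait `R`, one complex trait `C`, linear relationships with Gaussian noise, a simplified independent model in which `R` and `C` are unrelated once `L` is known, and AIC as the selection criterion. The function names `gaussian_loglik`, `select_model`, and `simulate` are hypothetical.

```python
import numpy as np

MODELS = ("causal", "reactive", "independent")


def gaussian_loglik(y, x):
    """Maximized Gaussian log-likelihood of the regression y ~ a + b*x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = np.mean(resid ** 2)  # maximum-likelihood residual variance
    return -0.5 * len(y) * (np.log(2.0 * np.pi * sigma2) + 1.0)


def select_model(L, R, C):
    """Score three candidate orderings of locus L, expression R, trait C.

    causal:      L -> R -> C   scored as P(R|L) * P(C|R)
    reactive:    L -> C -> R   scored as P(C|L) * P(R|C)
    independent: L -> R, L -> C scored as P(R|L) * P(C|L)  (assumed form)
    The P(L) term is identical across models and is omitted.
    """
    loglik = {
        "causal": gaussian_loglik(R, L) + gaussian_loglik(C, R),
        "reactive": gaussian_loglik(C, L) + gaussian_loglik(R, C),
        "independent": gaussian_loglik(R, L) + gaussian_loglik(C, L),
    }
    # Each model has two regressions with intercept, slope, and variance.
    n_params = 6
    aic = {name: 2 * n_params - 2.0 * ll for name, ll in loglik.items()}
    best = min(aic, key=aic.get)  # smallest AIC is best supported
    return best, aic


def simulate(model, n=500, seed=0):
    """Draw (L, R, C) from one of the three assumed generating models."""
    rng = np.random.default_rng(seed)
    L = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2
    noise_r = rng.normal(size=n)
    noise_c = rng.normal(size=n)
    if model == "causal":
        R = L + noise_r
        C = 0.8 * R + noise_c
    elif model == "reactive":
        C = L + noise_c
        R = 0.8 * C + noise_r
    elif model == "independent":
        R = L + noise_r
        C = L + noise_c
    else:
        raise ValueError(f"unknown model: {model}")
    return L, R, C


if __name__ == "__main__":
    for true_model in MODELS:
        best, aic = select_model(*simulate(true_model))
        print(true_model, "->", best)
```

Because the locus is an anchor that can only act upstream of both traits, the three orderings imply different conditional independence patterns, which is what lets the likelihoods differ. In this simplified form all three models have the same number of parameters, so AIC reduces to a direct likelihood comparison. As the summary notes, the comparison weakens when traits are highly correlated or measured with substantial error.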