Understanding "Integrating diverse genomic data using gene sets"

This article introduces and evaluates methods for integrating multiple genomic features measured on the same biological samples using gene sets. The authors demonstrate that their approach can detect genetic effects acting through different mechanisms in different samples and discover and validate disease-related gene sets that would not be identified by analyzing each data type individually. They compare two integration approaches: one based on model-based gene-to-phenotype association scores and another using meta-analytical methods.
The results, obtained from glioblastoma multiforme (GBM) data from The Cancer Genome Atlas (TCGA), show that the integrative approach can identify metabolic processes, stress pathways, and Wnt pathways associated with survival. Independent validation using data from the NCI Repository for Molecular Brain Neoplasia Data (Rembrandt) confirms these findings. Simulations further validate the approach, showing that it outperforms single-data-type analyses and meta-analytical methods in detecting gene sets with varying signal strengths and fractions of altered genes. The authors conclude that their integrative approach provides a robust and flexible method for integrating diverse genomic data, particularly when genes within a set are altered by different mechanisms.
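
The contrast between the two integration routes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names, the rule of taking the largest absolute score per gene, the permutation test against random gene sets, the use of Fisher's method for the meta-analytic route, and the toy scores (three set genes, each altered in a different data type).

```python
import math
import random


def fisher_combine(p_values):
    """Meta-analytic route: pool per-data-type gene-set p-values (Fisher's method)."""
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # half of the chi-square statistic
    # Exact chi-square survival function for 2k degrees of freedom.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def integrate_gene_scores(scores_by_type):
    """Integrative route: collapse each gene's per-data-type scores into one score.

    scores_by_type maps data type -> {gene: association score}. Taking the
    largest absolute score (an assumed rule) lets a gene count as altered
    whichever mechanism carries its signal.
    """
    genes = set().union(*(s.keys() for s in scores_by_type.values()))
    return {g: max(abs(s.get(g, 0.0)) for s in scores_by_type.values()) for g in genes}


def gene_set_p_value(gene_scores, gene_set, n_perm=2000, seed=0):
    """Permutation test: is the set's mean score at least as high as random sets'?"""
    rng = random.Random(seed)
    genes = sorted(gene_scores)
    members = [g for g in genes if g in gene_set]
    observed = sum(gene_scores[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(members))
        if sum(gene_scores[g] for g in sample) / len(members) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 27 background genes with weak scores, plus a
# three-gene set in which each gene is altered in a different data type.
background = [f"g{i:02d}" for i in range(27)]
gene_set = {"A", "B", "C"}
expression = {g: 0.1 * (i % 5) for i, g in enumerate(background)}
copy_number = {g: 0.1 * ((i + 1) % 5) for i, g in enumerate(background)}
methylation = {g: 0.1 * ((i + 2) % 5) for i, g in enumerate(background)}
expression.update({"A": 3.0, "B": 0.0, "C": 0.0})
copy_number.update({"A": 0.0, "B": -3.0, "C": 0.0})
methylation.update({"A": 0.0, "B": 0.0, "C": 3.0})
by_type = {
    "expression": expression,
    "copy_number": copy_number,
    "methylation": methylation,
}

# Single-data-type analyses, then the two ways of integrating them.
single_type_p = {
    name: gene_set_p_value(integrate_gene_scores({name: scores}), gene_set)
    for name, scores in by_type.items()
}
p_meta = fisher_combine(list(single_type_p.values()))
p_integrated = gene_set_p_value(integrate_gene_scores(by_type), gene_set)

print("single data types:", single_type_p)
print("meta-analytic (Fisher):", p_meta)
print("integrated gene scores:", p_integrated)
```

In this toy setting each data type sees only one altered gene in the set, whereas the integrated gene scores mark all three set genes as altered. That is the situation the summary describes, where genes within a set are altered by different mechanisms.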

Integrating diverse genomic data using gene sets

2011 | Svitlana Tyekucheva, Luigi Marchionni, Rachel Karchin and Giovanni Parmigiani