Inferring tumour purity and stromal and immune cell admixture from expression data

11 Oct 2013 | Kosuke Yoshihara¹², Maria Shahmoradgoli³, Emmanuel Martinez¹⁴, Rahulsimham Vegesna¹, Hoon Kim¹, Wandaliz Torres-Garcia¹, Victor Treviño⁴, Hui Shen⁵, Peter W. Laird⁵, Douglas A. Levine⁶, Scott L. Carter⁷, Gad Getz⁷, Katherine Stemke-Hale³, Gordon B. Mills³ & Roel G.W. Verhaak¹
The article introduces ESTIMATE, a method that uses gene expression data to infer the fraction of stromal and immune cells in tumour samples. ESTIMATE is based on gene expression signatures for stromal and immune cells, and uses single-sample gene set enrichment analysis (ssGSEA) to calculate scores that reflect the presence of each cell type in a tumour sample. These scores are then used to infer tumour purity. ESTIMATE scores correlate with DNA copy number-based tumour purity across 11 tumour types, and have been validated using 3,809 transcriptional profiles. The method was validated using ovarian, breast, and lung cancer data, where its scores correlated strongly with tumour purity. ESTIMATE was also applied to 10 TCGA tumour types, and performed well across different platforms and tumour types. By quantifying stromal and immune cells in tumour samples, the method allows tumour-associated normal cells to be taken into account in genomic and transcriptomic studies, and may be useful in understanding tumour biology and developing prognostic models. The ESTIMATE algorithm is publicly available through the SourceForge software repository.
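The scoring step described above can be illustrated with a minimal sketch. It is not the ESTIMATE implementation: the gene names and expression values are hypothetical placeholders rather than the published signatures, the rank-weight exponent `alpha=0.25` is an assumed value, and adding the stromal and immune scores into one combined score is an assumption about how the two are merged. The mapping from the combined score to a purity estimate is not shown.

```python
from itertools import accumulate


def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene set.

    expression: dict mapping gene name -> expression value for ONE sample.
    gene_set:   iterable of gene names forming the signature.
    alpha:      exponent applied to gene ranks (assumed value, not from the summary).
    """
    gene_set = set(gene_set)
    # Order genes from highest to lowest expression within this sample.
    genes = sorted(expression, key=expression.get, reverse=True)
    n = len(genes)
    in_set = [gene in gene_set for gene in genes]
    n_hits = sum(in_set)
    if n_hits == 0 or n_hits == n:
        raise ValueError("gene set must cover some, but not all, measured genes")

    # The highest-expressed gene gets rank n, the lowest gets rank 1.
    hit_weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(in_set)]
    miss_weights = [0.0 if hit else 1.0 for hit in in_set]

    # Cumulative distributions of signature genes (rank-weighted) and of all other genes.
    hit_total = sum(hit_weights)
    miss_total = sum(miss_weights)
    hit_cdf = [w / hit_total for w in accumulate(hit_weights)]
    miss_cdf = [w / miss_total for w in accumulate(miss_weights)]

    # The score is the summed difference between the two distributions.
    return sum(h - m for h, m in zip(hit_cdf, miss_cdf))


# Hypothetical placeholder signatures, NOT the published ESTIMATE gene lists.
STROMAL_SET = {"STROMA_1", "STROMA_2"}
IMMUNE_SET = {"IMMUNE_1", "IMMUNE_2"}

# Toy expression profiles, invented for illustration only.
sample_admixed = {
    "STROMA_1": 9.1, "STROMA_2": 8.7, "IMMUNE_1": 8.2, "IMMUNE_2": 7.9,
    "OTHER_1": 5.0, "OTHER_2": 4.4, "TUMOUR_1": 3.0, "TUMOUR_2": 2.5,
}
sample_pure = {
    "STROMA_1": 2.1, "STROMA_2": 1.8, "IMMUNE_1": 2.6, "IMMUNE_2": 1.2,
    "OTHER_1": 5.1, "OTHER_2": 4.6, "TUMOUR_1": 9.4, "TUMOUR_2": 8.8,
}


def combined_score(expression):
    """Return (stromal, immune, combined) scores; the sum is an assumed combination."""
    stromal = ssgsea_score(expression, STROMAL_SET)
    immune = ssgsea_score(expression, IMMUNE_SET)
    return stromal, immune, stromal + immune


for name, sample in [("admixed", sample_admixed), ("pure", sample_pure)]:
    stromal, immune, combined = combined_score(sample)
    print(f"{name}: stromal={stromal:.3f} immune={immune:.3f} combined={combined:.3f}")
```

In this toy setting, the sample whose signature genes sit near the top of the expression ranking receives the higher combined score, which is the behaviour one would expect of a sample with more stromal and immune admixture and therefore lower tumour purity.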