Gene ontology analysis for RNA-seq: accounting for selection bias

2010 | Matthew D Young, Matthew J Wakefield, Gordon K Smyth and Alicia Oshlack
The paper introduces GOseq, a method for Gene Ontology (GO) analysis of RNA-seq data that corrects the selection bias affecting standard GO analysis methods. The bias arises because longer and more highly expressed transcripts are more likely to be detected as differentially expressed, so categories rich in such transcripts are over-represented in the results. The authors propose a three-step correction: identify the differentially expressed genes, quantify the likelihood of differential expression as a function of transcript length, and incorporate that likelihood into the statistical test of category significance.
The method is validated on a prostate cancer dataset, where it produces results more consistent with known biology and with previous microarray studies. The paper also discusses the impact of total read count bias and compares GOseq with other methods, finding that GOseq is better at recovering well-established microarray results. The GOseq software is freely available for GO analysis of RNA-seq data.
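The three steps described above can be illustrated with a minimal Python sketch. It is not the GOseq implementation: the binned proportion used as a length weighting, the resampling test, the function names (`binned_pwf`, `weighted_sample`, `category_p_value`), and the toy data are all assumptions made for illustration. Step 1 (calling differentially expressed genes) is taken as given in the form of a boolean list.

```python
import random


def binned_pwf(lengths, is_de, n_bins=5):
    """Step 2 (simplified): estimate P(DE) as a function of length by
    taking the fraction of DE genes within equal-sized length bins."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    weights = [0.0] * len(lengths)
    bin_size = -(-len(order) // n_bins)  # ceiling division
    for start in range(0, len(order), bin_size):
        members = order[start:start + bin_size]
        frac = sum(is_de[i] for i in members) / len(members)
        for i in members:
            weights[i] = max(frac, 1e-6)  # keep every weight positive
    return weights


def weighted_sample(weights, k, rng):
    """Draw k distinct gene indices, favouring genes with larger weights
    (exponential race: the k smallest waiting times win)."""
    keys = sorted((rng.expovariate(w), i) for i, w in enumerate(weights))
    return [i for _, i in keys[:k]]


def category_p_value(lengths, is_de, in_category, n_resamples=2000, seed=0):
    """Step 3 (simplified): compare the observed number of DE genes in a
    category with a null distribution built from length-weighted resamples."""
    rng = random.Random(seed)
    weights = binned_pwf(lengths, is_de)
    n_de = sum(is_de)
    observed = sum(1 for d, c in zip(is_de, in_category) if d and c)
    hits = 0
    for _ in range(n_resamples):
        sample = weighted_sample(weights, n_de, rng)
        if sum(in_category[i] for i in sample) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical toy data: 200 genes, long genes are called DE more often.
lengths = [500 + 10 * i for i in range(200)]
is_de = [(i % 2 == 0) if i >= 100 else (i % 10 == 0) for i in range(200)]
in_category = [i >= 100 for i in range(200)]  # a category made of long genes

p = category_p_value(lengths, is_de, in_category)
print(f"length-corrected p-value: {p:.4f}")
```

Because the null resamples favour long genes in the same way the differential expression calls do, a category composed mostly of long genes is not flagged merely for its length.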