Improving RNA-Seq expression estimates by correcting for fragment bias

2011 | Adam Roberts, Cole Trapnell, Julie Donaghey, John L Rinn, Lior Pachter
This paper presents a method for improving RNA-Seq expression estimates by correcting for fragment bias. RNA-Seq library preparation produces a non-uniform distribution of cDNA fragments within transcripts, which distorts expression estimation. The authors propose a likelihood-based approach that corrects for these biases, improving expression estimates as measured by correlation with qRT-PCR and improving the replicability of results across libraries and sequencing technologies.

The study shows that fragment biases, both positional and sequence-specific, affect expression estimates. These biases are difficult to predict from the protocol alone because of uncertainties in the biochemical steps and in RNA secondary structure, so the authors infer them indirectly from the fragment alignments in each experiment. Correction matters because bias causes some fragments to be over- or under-represented, which skews uncorrected expression estimates.

The method estimates bias parameters and expression levels simultaneously within a single likelihood framework. It is robust to a range of RNA-Seq protocols, including single- and paired-end reads, strand-specific and non-specific libraries, and different priming and fragmentation methods. The study also compares the method with previous approaches, namely Genominator and mseq, and finds that they perform worse because they learn bias parameters from raw read counts rather than jointly with expression.

The method was tested on RNA-Seq datasets from different sequencing technologies and protocols. Benchmarking against qRT-PCR data showed significantly improved correlation, particularly for low-expression transcripts. The method also improved the consistency of results across technical replicates and across different library preparations, which is especially relevant for differential expression studies. It is applicable to other sequencing technologies, including SOLiD, and was shown to improve expression estimates across platforms.

The method is implemented in the Cufflinks RNA-Seq analysis suite and is available as open-source software. The authors conclude that bias correction is essential for accurate RNA-Seq expression estimation and should be applied to correct the biases introduced during library preparation and sequencing.
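
To make the idea of jointly estimating bias and expression concrete, here is a minimal Python sketch. It is not the Cufflinks implementation and not the authors' model: it assumes a toy Poisson model in which the expected number of fragments starting at a position is the transcript abundance times a weight that depends only on the nucleotide at that position. The function names (`naive_abundance`, `fit_bias_and_abundance`), the single-nucleotide bias, and the simulated data are all hypothetical choices made for illustration; the summary above mentions positional and sequence-specific biases, and positional bias is left out here.

```python
from collections import defaultdict

BASES = "ACGT"

def naive_abundance(seqs, counts):
    """Uniform model: abundance = fragment count / transcript length."""
    return {t: sum(counts[t]) / len(seqs[t]) for t in seqs}

def fit_bias_and_abundance(seqs, counts, n_iter=200):
    """Alternating maximum-likelihood updates for a toy Poisson model.

    Assumed model (illustrative only):
        counts[t][i] ~ Poisson(rho[t] * w[seqs[t][i]])
    seqs:   dict of transcript name -> sequence string
    counts: dict of transcript name -> fragment-start counts, one per position
    Returns (rho, w), with w scaled so its mean over the four bases is 1.
    """
    w = {b: 1.0 for b in BASES}

    def abundances(weights):
        # Abundance = fragment count / bias-weighted effective length.
        return {t: sum(counts[t]) / sum(weights[b] for b in seq)
                for t, seq in seqs.items()}

    for _ in range(n_iter):
        rho = abundances(w)
        # Bias weight = observed starts / starts expected from abundance alone.
        observed = defaultdict(float)
        expected = defaultdict(float)
        for t, seq in seqs.items():
            for base, x in zip(seq, counts[t]):
                observed[base] += x
                expected[base] += rho[t]
        w = {b: observed[b] / expected[b] if expected[b] > 0 else 1.0
             for b in BASES}
        # Abundance and bias are only identifiable up to a shared scale,
        # so pin the mean bias weight to 1.
        mean_w = sum(w.values()) / len(w)
        w = {b: v / mean_w for b, v in w.items()}

    return abundances(w), w

# Simulated, noise-free example: fragments starting at 'A' are over-represented.
true_w = {"A": 2.0, "C": 0.5, "G": 1.0, "T": 0.5}
true_rho = {"t1": 10.0, "t2": 40.0}
seqs = {"t1": "AAAAGCAT", "t2": "CCTTGCTA"}
counts = {t: [true_rho[t] * true_w[b] for b in s] for t, s in seqs.items()}

print("uniform model:", naive_abundance(seqs, counts))
print("joint fit:    ", fit_bias_and_abundance(seqs, counts)[0])
```

In this simulated case the A-rich transcript `t1` attracts extra fragments, so the uniform model reports a 2:1 abundance ratio between `t2` and `t1` where the simulated truth is 4:1. The alternating updates divide each transcript's count by a bias-weighted effective length instead of its raw length, which is the sense in which bias and expression are estimated together rather than bias being read off raw counts.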