7 June 2024 | The Long-read RNA-Seq Genome Annotation Assessment Project Consortium
The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) evaluated the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from human, mouse, and manatee datasets using various protocols and sequencing platforms. The study found that longer, more accurate sequences produced more accurate transcripts, while greater read depth improved quantification accuracy. In well-annotated genomes, reference-based tools performed best, and incorporating additional data and replicate samples was advised for detecting rare and novel transcripts. The project validated many lowly expressed, single-sample transcripts, suggesting further exploration of long-read data for reference transcriptome creation. The evaluation metrics and scripts provided by LRGASP allowed for a fair and transparent benchmarking effort, highlighting the importance of comprehensive benchmarking in transcriptome analysis.