2019 | Sam Kovaka, Aleksey V. Zimin, Geo M. Pertea, Roham Razaghi, Steven L. Salzberg, Mihaela Pertea
StringTie2 is a reference-guided transcriptome assembler that works with both short and long reads, as well as with full-length super-reads assembled from short reads. It improves the accuracy and speed of transcriptome assembly by tolerating the high error rate of long reads and by using super-reads. The authors report that StringTie2 outperforms existing tools in accuracy, speed, and memory usage. Super-reads enhance sensitivity and precision, and StringTie2 assembles long reads more accurately than tools such as FLAIR while running faster and using less memory. It is also more accurate than Scallop on both short and long-read datasets, and it can identify novel transcripts from long-read data even without a reference annotation. StringTie2 is multithreaded, so it runs efficiently on multi-processor systems. It is implemented in C++ and is freely available as open-source software under the MIT license. The tool was tested on datasets from human, Arabidopsis thaliana, and Zea mays. StringTie2 is particularly useful for long-read RNA-seq data, where its handling of high error rates improves the accuracy of splice site identification and therefore of the assembled transcripts. It also reduces the need for a separate error correction step by incorporating consensus calling. The tool has no dependencies and a single command interface. StringTie2 is a significant improvement over previous versions and is recommended for use in transcriptome assembly.
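The single command interface and multithreading mentioned above can be illustrated with a minimal sketch that only builds the argument list for one run and does not execute anything. The helper name `build_stringtie_command` and the file names are hypothetical. The flags `-o` (output GTF), `-p` (threads), `-L` (long-read mode), and `-G` (reference annotation) are not stated in this summary; they are taken from the StringTie command-line help as commonly documented and should be checked against `stringtie -h` for the installed version. The input is assumed to be a coordinate-sorted BAM file of aligned reads.

```python
import shlex
from typing import List, Optional


def build_stringtie_command(
    bam_path: str,
    output_gtf: str,
    long_reads: bool = False,
    threads: int = 1,
    annotation_gtf: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one StringTie run (hypothetical helper).

    Flags are assumptions to verify with `stringtie -h`:
    -o output GTF, -p threads, -L long-read mode, -G reference annotation.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")

    cmd = ["stringtie", "-o", output_gtf, "-p", str(threads)]
    if long_reads:
        # Long-read mode for high-error reads.
        cmd.append("-L")
    if annotation_gtf is not None:
        # Optional: the annotation guides assembly but is not required.
        cmd += ["-G", annotation_gtf]
    # The aligned reads (sorted BAM) come last.
    cmd.append(bam_path)
    return cmd


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here.
    cmd = build_stringtie_command(
        "aligned_long_reads.bam", "assembly.gtf", long_reads=True, threads=8
    )
    print(shlex.join(cmd))
```

Returning a list instead of a shell string keeps the sketch safe to pass to `subprocess.run` without shell quoting issues, should one choose to execute it.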