Vol. 29 no. 1 2013, pages 15-21 | Alexander Dobin1,*, Carrie A. Davis1, Felix Schlesinger1, Jorg Drenkow1, Chris Zaleski1, Sonali Jha1, Philippe Batut1, Mark Chaisson2 and Thomas R. Gingeras1
The paper introduces STAR (Spliced Transcripts Alignment to a Reference), a novel RNA-seq alignment algorithm designed to address the challenges of aligning non-contiguous transcript sequences. STAR uses a sequential maximum mappable seed search in uncompressed suffix arrays, followed by seed clustering and stitching, and achieves significantly higher mapping speed and accuracy than other aligners. It can align large datasets, such as the ENCODE Transcriptome RNA-seq dataset, with high sensitivity and precision. STAR detects both canonical and non-canonical splices as well as chimeric transcripts, and it can align full-length RNA sequences. Experimental validation using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons confirmed the high precision of STAR's splice detection. STAR is implemented as standalone C++ code, distributed under the GPLv3 license, and is available for download.
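To make the seed search step concrete, the following is a minimal Python sketch of a sequential maximal mappable prefix search over an uncompressed suffix array. It is a toy illustration under stated assumptions, not STAR's C++ implementation: the function names (`build_suffix_array`, `maximal_mappable_prefix`, `sequential_seed_search`) are hypothetical, the suffix array is built naively, only exact matches are considered, an unmappable base is simply skipped, and the clustering, stitching, and scoring stages are omitted.

```python
def build_suffix_array(genome):
    """Naive suffix array construction; adequate only for a toy genome."""
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def _narrow(genome, sa, lo, hi, k, c):
    """Within sa[lo:hi] (suffixes sharing a k-character prefix), return the
    sub-interval of suffixes whose character at offset k equals c."""
    def char_at(idx):
        p = sa[idx] + k
        # A suffix that ends here sorts before any real character.
        return genome[p] if p < len(genome) else ""

    a, b = lo, hi
    while a < b:  # lower bound: first suffix with char >= c
        m = (a + b) // 2
        if char_at(m) < c:
            a = m + 1
        else:
            b = m
    start = a
    b = hi
    while a < b:  # upper bound: first suffix with char > c
        m = (a + b) // 2
        if char_at(m) <= c:
            a = m + 1
        else:
            b = m
    return start, a


def maximal_mappable_prefix(genome, sa, read, start=0):
    """Longest prefix of read[start:] that occurs exactly in the genome.
    Returns (length, sorted genome positions)."""
    lo, hi = 0, len(sa)
    k = 0
    while start + k < len(read):
        new_lo, new_hi = _narrow(genome, sa, lo, hi, k, read[start + k])
        if new_lo == new_hi:  # no suffix extends the match any further
            break
        lo, hi = new_lo, new_hi
        k += 1
    positions = sorted(sa[lo:hi]) if k > 0 else []
    return k, positions


def sequential_seed_search(genome, sa, read):
    """Repeat the prefix search on the unmapped remainder of the read.
    Returns a list of (read_offset, length, genome_positions) seeds."""
    seeds = []
    start = 0
    while start < len(read):
        k, positions = maximal_mappable_prefix(genome, sa, read, start)
        if k == 0:
            start += 1  # simplification: skip a base that maps nowhere
            continue
        seeds.append((start, k, positions))
        start += k
    return seeds
```

As a usage example, take the toy genome `CCGATTACAGGTAAGTTTTCAGCTGGAACTCC`, in which two 8-base exons are separated by a 12-base intron with GT/AG boundaries. The spliced read `GATTACAGCTGGAACT` yields two seeds, `(0, 8, [2])` and `(8, 8, [22])`: the first search stops at the splice donor, and the second search restarts from the first unmapped base and lands on the downstream exon. A real aligner would then cluster and stitch such seeds into a single spliced alignment.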