The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote

2013 | Yang Liao, Gordon K. Smyth and Wei Shi
The Subread aligner is a fast, accurate, and scalable read mapping method built on a seed-and-vote strategy. Overlapping short seeds, called subreads, are extracted from each read, and each subread votes for a genomic location; the location with the most votes is taken as the mapping location of the read. The strategy is efficient because the overall genomic location is determined before any detailed alignment is performed, sensitive because no individual subread is required to map exactly, and accurate because the final location must be supported by multiple subreads. It avoids much of the extensive backtracking and dynamic programming that conventional alignment methods often require, and it scales well to longer reads.

The strategy is particularly effective for detecting indels, because the presence of an indel can be inferred by comparing the mapped positions of the subreads that flank it. It is also effective for detecting exon-exon junctions: candidate junctions are identified by seed-and-vote and then validated in a two-scan procedure, in which the first scan over the reads identifies potential junctions and the second scan validates them. The result is a list of validated junction locations.

Subread is compared with other aligners, including Bowtie2, BWA, Maq, and Novoalign. It is significantly faster, running nearly four times faster than Bowtie2. It detects indels with high recall and accuracy. It also recovers spiked-in expression levels well: reads are mapped accurately to the spike-in transcripts, and the resulting fold changes are closer to the true values than those from any other aligner.
The strategy is suitable for use in genomic variation detection and whole genome expression profiling, as it is fast, accurate, and scalable.
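
The seed-and-vote idea summarized above can be illustrated with a minimal Python sketch. This is not the Subread implementation: the function names (build_index, seed_and_vote, flanking_shift), the subread length k, the step parameter, and the toy reference sequence are all assumptions made for this example, and a plain dictionary of exact k-mer matches stands in for whatever index the real aligner uses. The sketch shows only two points from the summary: a read is placed by majority vote of its subreads, and an indel shows up as a disagreement between the positions implied by flanking subreads.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, step=1):
    """Return (implied read start, vote count) for the best-supported location.

    Each subread that matches the reference exactly votes for the position
    where the whole read would start. Returns None if no subread matches.
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        subread = read[offset:offset + k]
        for pos in index.get(subread, ()):
            votes[pos - offset] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


def flanking_shift(read, index, k):
    """Compare the read starts implied by the first and last unique subreads.

    A result of 0 suggests no indel between them. A positive value suggests
    that many reference bases are missing from the read (a deletion), and a
    negative value suggests extra bases in the read (an insertion).
    Returns None if fewer than two subreads map uniquely.
    """
    implied = []
    for offset in range(len(read) - k + 1):
        hits = index.get(read[offset:offset + k], ())
        if len(hits) == 1:
            implied.append(hits[0] - offset)
    if len(implied) < 2:
        return None
    return implied[-1] - implied[0]


# Toy example (hypothetical data, chosen only for illustration).
reference = "ACGTTGCAAGCTTAGGCTCAATGCCGTATCGGATCCTAGA"
index = build_index(reference, k=6)

# A 20-base read taken from position 10, with one mismatch in the middle.
read = reference[10:20] + "C" + reference[21:30]
print(seed_and_vote(read, index, k=6))        # (10, 9)

# A read that skips 3 reference bases (positions 15 to 17).
read_with_deletion = reference[5:15] + reference[18:28]
print(flanking_shift(read_with_deletion, index, k=6))  # 3
```

In the first example, the six subreads that overlap the mismatch cast no vote for the true location, but the remaining nine still agree on position 10, which is why no single subread needs to map exactly. In the second example, the subreads on either side of the gap imply read starts that differ by 3, the length of the deletion.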