A survey of sequence alignment algorithms for next-generation sequencing

11 May 2010 | Heng Li and Nils Homer
The article provides a comprehensive review of sequence alignment algorithms for next-generation sequencing (NGS) technologies. It highlights the rapid evolution of NGS, which has produced vast amounts of data and made efficient sequence alignment methods a necessity. The review covers the development of alignment algorithms based on hash tables, suffix trees, and merge sorting, with a focus on their practical applications and improvements. Key advances include spaced seeds, q-gram filters, and multiple seed hits, which enhance sensitivity and accuracy.
The article also discusses the importance of gapped alignment and paired-end mapping in variant discovery and the role of base quality scores in improving alignment accuracy. Additionally, it explores the challenges and solutions for aligning long reads, SOLiD reads, bisulfite-treated reads, and spliced reads. The authors conclude that short-read alignment is no longer a bottleneck in data analysis, and future developments will focus on long-read alignment, de novo assembly, and multi-genome alignment. The article also touches on the potential of cloud computing to address data sharing and computational resource limitations.
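To make the hash-table and spaced-seed ideas mentioned above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of any aligner covered by the review: the function names (`spaced_key`, `build_index`, `seed_hits`), the mask `1101011`, and the toy sequences are all hypothetical. The sketch indexes every window of a reference under a seed mask, then lets each read window vote for the reference position it implies. Bases under a `0` in the mask are ignored, so a mismatch there does not prevent a seed hit.

```python
from collections import defaultdict


def spaced_key(window, mask):
    """Keep only the bases at positions where the mask has a '1'."""
    return "".join(base for base, m in zip(window, mask) if m == "1")


def build_index(reference, mask):
    """Hash every reference window (masked) to the positions where it starts."""
    index = defaultdict(list)
    width = len(mask)
    for pos in range(len(reference) - width + 1):
        index[spaced_key(reference[pos:pos + width], mask)].append(pos)
    return index


def seed_hits(read, index, mask):
    """Map each candidate read start on the reference to its number of seed hits."""
    width = len(mask)
    votes = defaultdict(int)
    for offset in range(len(read) - width + 1):
        key = spaced_key(read[offset:offset + width], mask)
        for pos in index.get(key, ()):
            # A hit at reference position `pos` from read offset `offset`
            # implies the read starts at `pos - offset`.
            votes[pos - offset] += 1
    return dict(votes)


if __name__ == "__main__":
    # Toy data (hypothetical): the read is reference[5:17] with one mismatch.
    reference = "ACGTTGCATGCCGATAGCTA"
    read = "GCATGTCGATAG"
    for mask in ("1101011", "1111111"):
        print(mask, seed_hits(read, build_index(reference, mask), mask))
```

In this toy case every 7-base window of the read contains the mismatch, so the contiguous mask `1111111` finds no hits, while the spaced mask still recovers the true start position from the windows in which the mismatch falls on an ignored position. A real aligner would follow such candidate positions with a gapped alignment step and would choose seed masks and hit thresholds far more carefully.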