Minimap2: pairwise alignment for nucleotide sequences

16 Mar 2018 | Heng Li
Minimap2 is a versatile and efficient alignment program that maps DNA or long mRNA sequences against a large reference database, including the ultra-long reads produced by sequencing technologies such as SMRT and Oxford Nanopore. It handles short reads (≥100 bp), genomic reads of ≥1 kb at a 15% error rate, full-length noisy RNA or cDNA reads, and assembly contigs or full chromosomes hundreds of megabases long. Minimap2 follows a seed-chain-align procedure: it uses minimizers for indexing and seeding, then applies dynamic programming for base-level alignment. It introduces a 2-piece affine gap cost for long insertions and deletions (INDELs) and uses heuristics to reduce spurious alignments.
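To make the two ideas named above more concrete, here is a minimal Python sketch, not Minimap2's actual implementation. It assumes that a minimizer is the smallest k-mer in each window of w consecutive k-mers, and that a 2-piece affine gap cost is the minimum of two affine functions of the gap length. The function names, the plain lexicographic ordering on a single strand, and the parameter values (q, e, q2, e2) are illustrative assumptions that do not come from the text.

```python
def minimizers(seq, k, w):
    """Pick the smallest k-mer in every window of w consecutive k-mers.

    Simplifying assumptions for illustration: k-mers are compared
    lexicographically on the forward strand only, and ties are broken
    by the leftmost position. Returns sorted (position, k-mer) pairs.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        picked.add((best, kmers[best]))
    return sorted(picked)


def two_piece_gap_cost(length, q=4, e=2, q2=24, e2=1):
    """Cost of a gap of `length` bases under a 2-piece affine model.

    The cost is the cheaper of two affine functions: one with a low
    opening penalty and steep extension (q, e), one with a high opening
    penalty and shallow extension (q2, e2). The defaults here are
    illustrative values only.
    """
    if length <= 0:
        return 0
    return min(q + length * e, q2 + length * e2)


if __name__ == "__main__":
    # Neighbouring windows often share a minimizer, so the sketch keeps
    # fewer seeds than there are k-mers.
    print(minimizers("ACGTACGT", k=3, w=2))
    # Short gaps follow the first piece, long gaps the second.
    for gap in (1, 20, 100):
        print(gap, two_piece_gap_cost(gap))
```

With these illustrative parameters the two pieces cross at a gap length of 20: shorter gaps are scored by the first piece, while longer gaps grow at the gentler rate of the second piece, which is what makes long INDELs affordable in the alignment.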
According to the paper's abstract, Minimap2 is 3 to 4 times as fast as mainstream short-read mappers at comparable accuracy, and at least 30 times faster than long-read genomic or cDNA mappers at higher accuracy. The algorithm is implemented in C and available under the MIT license, with APIs in C and Python. Evaluations on simulated and real data cover long genomic reads, spliced reads, short paired-end reads, and long-read assemblies.