MUMmer4: A fast and versatile genome alignment system

MUMmer4 is a fast and versatile genome alignment system that improves on its predecessor, MUMmer3. It removes MUMmer3's input-size limits by replacing the 32-bit suffix tree with a 48-bit suffix array, which lets it handle much larger genomes, and it adds parallel processing for a substantial speedup. It can align genomes as large as human and chimpanzee; that alignment shows the two genomes to be 98% identical across 96% of their length. MUMmer4 can also align reads to a reference genome, although it is less sensitive and accurate than dedicated read aligners. It can be called from scripting languages such as Perl, Python, and Ruby.

With the suffix array, alignment time scales linearly with read length, as in other NGS aligners. MUMmer4 is faster than aligners such as Mauve and LASTZ. It handles large reference and query sequences, with a theoretical limit of 141 Tb for the reference, and it can process large reference sequences in batches. It writes both delta and SAM output, which makes it compatible with a range of downstream tools. It is available under an open-source license and suits a wide range of genome alignment tasks.
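To make the suffix array idea concrete, the following is a minimal Python sketch of exact-match lookup against a reference. It is an illustration under stated assumptions, not MUMmer4's implementation: the function names `build_suffix_array` and `find_occurrences` are hypothetical, the construction is deliberately naive, and MUMmer4 itself searches for maximal matches and then extends them rather than looking up fixed patterns. Each lookup costs a logarithmic number of string comparisons in the reference length, and each comparison touches at most as many characters as the query has, which is one way to see why the work grows roughly linearly with query length for a fixed reference.

```python
def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction by sorting the suffixes directly; adequate only for toy inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list, pattern: str) -> list:
    """Return the sorted start positions where `pattern` occurs in `text`.

    All suffixes that begin with `pattern` form one contiguous block in the
    suffix array, so two binary searches locate the block's boundaries.
    """
    m = len(pattern)

    def prefix(k: int) -> str:
        # First m characters of the k-th smallest suffix.
        start = suffix_array[k]
        return text[start:start + m]

    # Lower bound: first suffix whose prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # toy reference, assumed for illustration
    sa = build_suffix_array(reference)
    print(find_occurrences(reference, sa, "TTA"))
```

A suffix array stores one integer per reference position, so the width of that integer bounds how long a reference can be indexed; this is why moving to wider indices raises the input-size limit.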

MUMmer4: A fast and versatile genome alignment system

January 26, 2018 | Guillaume Marçais, Arthur L. Delcher, Adam M. Phillippy, Rachel Coston, Steven L. Salzberg, Aleksey Zimin