January 26, 2018 | Guillaume Marçais, Arthur L. Delcher, Adam M. Phillippy, Rachel Coston, Steven L. Salzberg, Aleksey Zimin
MUMmer4 is a genome alignment system that addresses the limitations of its predecessor, MUMmer3, particularly in handling large genomes and very large sequence datasets. The key improvement is a switch from a 32-bit suffix tree to a 48-bit suffix array, which allows MUMmer4 to process sequences of any biologically realistic length, up to a theoretical limit of 141 trillion bases. The authors demonstrate this capacity by aligning the human and chimpanzee genomes, which showed 98% identity across 96% of their length. MUMmer4 also supports parallel processing, which significantly reduces run times for large datasets, and it can now be called from scripting languages such as Perl, Python, and Ruby. It offers improved speed and efficiency in aligning reads to reference genomes, although it may be less sensitive and accurate than specialized read aligners. The paper provides detailed comparisons with other aligners and concludes that MUMmer4 is significantly faster and more versatile, making it a valuable tool for a range of genomic applications.
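
To make the index-width argument concrete, here is a minimal Python sketch of two things: how many positions an index of a given bit width can address, and how a suffix array is used to find exact matches. It is a toy illustration, not MUMmer4's implementation. The function names (`max_addressable`, `build_suffix_array`, `find_occurrences`) and the example sequence are hypothetical, and the naive construction is chosen for readability only. The quoted limit of 141 trillion bases matches 2^47 = 140,737,488,355,328. That would be consistent with one of the 48 bits not being available for addressing, but this reading is an assumption, since the summary does not explain the figure.

```python
from __future__ import annotations


def max_addressable(bits: int) -> int:
    """Number of distinct positions an unsigned index of `bits` bits can address."""
    return 2 ** bits


def build_suffix_array(text: str) -> list[int]:
    """Naive construction: sort suffix start positions by the suffix they begin.

    Fine for a toy example; real tools use far more efficient algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions of `pattern` in `text`.

    Suffixes sharing a prefix are contiguous in the suffix array, so two
    binary searches locate the block of suffixes that start with `pattern`.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    # Index capacity at the bit widths mentioned in the text.
    for bits in (32, 47, 48):
        print(f"{bits}-bit index: {max_addressable(bits):,} positions")

    # Hypothetical toy "reference" sequence.
    reference = "GATTACAGATTACA"
    sa = build_suffix_array(reference)
    print(find_occurrences(reference, sa, "ATTA"))
```

A 32-bit index tops out at about 4.29 billion positions, so the move to a wider index is what lets the index scale to far longer inputs. The lookup itself needs only the text and an array of integer positions.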