Fast and accurate long-read assembly with wtdbg2

2020 February; 17(2): 155–158 | Jue Ruan, Heng Li
The paper introduces wtdbg2, a long-read assembler designed to reduce the computational cost of assembling mammalian-scale genomes with existing tools. wtdbg2 is reported to be 2-17 times as fast as published tools while achieving comparable contiguity and accuracy. It does so by combining a fast all-vs-all read alignment implementation with a novel layout algorithm based on the fuzzy-Bruijn graph (FBG). The FBG extends the de Bruijn graph to long, noisy reads, which reduces memory usage and improves efficiency. The authors evaluated wtdbg2 on a range of datasets and compared it with other assemblers such as Canu, Flye, and MECAT, reporting faster run times at comparable contiguity.
wtdbg2 is particularly effective for large, non-human genomes, such as the axolotl genome, and shows promise for population-scale long-read assembly. The paper also discusses the importance of polishing consensus sequences and notes that the polishing step needs further improvement. Overall, wtdbg2 is presented as a significant advance in long-read assembly.
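To make the idea of a de Bruijn-style graph over long reads more concrete, here is a toy sketch in Python. It is not wtdbg2's implementation. It assumes that each read is cut into fixed-size bins and that a graph vertex is a run of k consecutive bins, which is the general shape of the FBG as described in the wtdbg2 paper (which uses 256 bp bins). The function names, the tiny bin size, and the use of exact string equality to decide that two bins match are illustrative assumptions. The real assembler matches bins through error-tolerant alignment, and this toy only works when reads happen to start on the same bin boundaries.

```python
from collections import defaultdict


def split_into_bins(read, bin_size):
    """Cut a read into consecutive fixed-size bins, dropping a short tail."""
    return [read[i:i + bin_size]
            for i in range(0, len(read) - bin_size + 1, bin_size)]


def build_bin_graph(reads, bin_size=4, k=2):
    """Build a toy de Bruijn-style graph whose vertices are k-bin tuples.

    Assumption: two bins are "the same" only if their strings are identical.
    A fuzzy-Bruijn graph would instead tolerate mismatches and gaps.
    """
    edges = defaultdict(int)
    for read in reads:
        bins = split_into_bins(read, bin_size)
        # Each vertex is a tuple of k consecutive bins from the same read.
        kbins = [tuple(bins[i:i + k]) for i in range(len(bins) - k + 1)]
        # Adjacent k-bins on a read are joined by an edge; count the support.
        for left, right in zip(kbins, kbins[1:]):
            edges[(left, right)] += 1
    return dict(edges)


if __name__ == "__main__":
    toy_reads = ["AAAACCCCGGGGTTTT", "AAAACCCCGGGG"]
    for (left, right), support in build_bin_graph(toy_reads).items():
        print(left, "->", right, "supported by", support, "read(s)")
```

Working on bins rather than single bases is what keeps the graph small for long reads: a 16-base toy read above contributes three vertices instead of one vertex per k-mer position.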