The MaSuRCA genome assembler

August 29, 2013 | Aleksey V. Zimin, Guillaume Marçais, Daniela Puiu, Michael Roberts, Steven L. Salzberg and James A. Yorke
The MaSuRCA genome assembler is a hybrid method that combines the computational efficiency of de Bruijn graph approaches with the flexibility of overlap-based assembly strategies. It accepts reads of variable length and tolerates sequencing errors by transforming paired-end reads into longer 'super-reads'. Super-reads are built from error-corrected reads and contain all the sequence information of the original reads, so the number of reads is reduced with minimal data loss. This enables the assembler to handle mixtures of Illumina, 454, and Sanger reads. Contigging and scaffolding are performed by a modified version of the CABOG assembler, and a final gap-filling step uses super-reads to fill gaps in the scaffolds.

MaSuRCA was evaluated against Allpaths-LG and SOAPdenovo2 on two genomes with known high-quality assemblies: Rhodobacter sphaeroides and mouse chromosome 16. It performed on par with or better than Allpaths-LG and significantly better than SOAPdenovo2. Adding long reads to the data significantly improved the MaSuRCA assemblies.

MaSuRCA is available as open-source software and has been used to assemble a variety of genomes, including large ones such as the 22 Gbp loblolly pine genome, genomes with mixed read data from different sequencing technologies, and some that were previously not publicly available. It has also been shown to be effective on genomes with high GC content, which is challenging for other assemblers.
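
The summary above says only that reads are transformed into longer super-reads. As a rough illustration of how such a transformation can work, the sketch below extends each read one base at a time for as long as the k-mer table of the read set offers exactly one possible next base, so that reads extending to the same sequence collapse into a single super-read. This is a simplified toy under stated assumptions, not MaSuRCA's implementation: the function names (`kmer_counts`, `extend_right`, `extend_left`, `super_reads`) and the example reads are hypothetical, the reads are assumed to be already error-corrected and on a single strand, and reverse complements, mate pairs, and quality values are ignored.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer that occurs in the reads (a toy k-mer table)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def extend_right(seq, counts, k):
    """Append bases while exactly one k-mer in the table continues seq."""
    seen = set()  # (k-1)-mers already extended from; stops endless loops in repeats
    while True:
        suffix = seq[-(k - 1):]
        if suffix in seen:
            break
        seen.add(suffix)
        candidates = [b for b in "ACGT" if (suffix + b) in counts]
        if len(candidates) != 1:  # dead end or branch: extension is not unique
            break
        seq += candidates[0]
    return seq


def extend_left(seq, counts, k):
    """Prepend bases while exactly one k-mer in the table precedes seq."""
    seen = set()
    while True:
        prefix = seq[:k - 1]
        if prefix in seen:
            break
        seen.add(prefix)
        candidates = [b for b in "ACGT" if (b + prefix) in counts]
        if len(candidates) != 1:
            break
        seq = candidates[0] + seq
    return seq


def super_read(read, counts, k):
    """Extend one read in both directions as far as the extension is unique."""
    return extend_left(extend_right(read, counts, k), counts, k)


def super_reads(reads, k):
    """Return the distinct super-reads for a read set (assumes k >= 2)."""
    counts = kmer_counts(reads, k)
    return sorted({super_read(r, counts, k) for r in reads if len(r) >= k})


# Hypothetical error-free reads tiling the toy sequence ATGCGTACCTGA.
reads = ["ATGCGT", "GCGTAC", "GTACCT", "ACCTGA"]
print(super_reads(reads, k=4))  # ['ATGCGTACCTGA']
```

In this toy case four reads reduce to one super-read that still contains every base of the originals, which is the kind of data reduction the summary describes. Where the k-mer table offers two possible next bases, for example at a repeat boundary or a variant site, the extension stops and the read is left for the later overlap-based stage.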