[slides] Assembly of long, error-prone reads using repeat graphs

The paper introduces Flye, an algorithm that constructs an assembly graph from long, error-prone reads using an alignment-based approach, in contrast to the traditional de Bruijn graph and overlap-layout-consensus (OLC) methods. Unlike OLC assemblers, Flye does not try to produce accurate contigs at the outset. It first generates arbitrary paths in the unknown assembly graph (the paper calls these disjointigs) and then uses them to build a more accurate assembly graph. This approach resolves unbridged repeats, that is, repeats not spanned by any single read, and yields a less tangled assembly graph. The authors benchmark Flye against several state-of-the-art single-molecule sequencing (SMS) assemblers and show that it produces better or comparable assemblies on bacterial, yeast, worm, and human datasets. Flye's ability to handle complex assembly graphs and resolve unbridged repeats makes it a promising tool for SMS genome assembly.

Assembly of Long Error-Prone Reads Using Repeat Graphs

January 12, 2018 | Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin, and Pavel A. Pevzner