ABySS: A parallel assembler for short read sequence data

2009 | Jared T. Simpson, Kim Wong, Shaun D. Jackman, Jacqueline E. Schein, Steven J.M. Jones, and İnanç Birol
ABySS (Assembly By Short Sequences) is a parallel sequence assembler designed for the very large data sets produced by next-generation sequencing technologies. Its primary innovation is a distributed representation of a de Bruijn graph, which allows the assembly algorithm to run in parallel across a network of commodity computers. This makes it practical to assemble the billions of short reads generated by a human resequencing project. The authors demonstrate this by assembling 3.5 billion paired-end reads from the genome of an African male, producing approximately 2.76 million contigs ≥100 base pairs (bp) in length with an N50 size of 1499 bp, representing 68% of the reference human genome.
Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and other primate genomes. The results indicate that ABySS can assemble large-scale short read data accurately and efficiently, making it a useful tool for de novo assembly.
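
To make the two technical ideas in the summary concrete, the following is a minimal Python sketch, not the ABySS implementation. It assumes that each k-mer is assigned to a node by hashing a canonical form shared with its reverse complement, and that neighbours are found by asking the owning node about each of the four possible one-base extensions. The function names, the use of `zlib.crc32` as the hash, and the in-memory list of sets standing in for separate machines are all illustrative assumptions. The `n50` helper uses the common definition: the contig length at which contigs of that length or longer contain at least half of all assembled bases.

```python
import zlib

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def owner(kmer, num_nodes):
    """Hypothetical placement rule: hash the canonical form, modulo node count."""
    return zlib.crc32(canonical(kmer).encode("ascii")) % num_nodes


def distribute_kmers(reads, k, num_nodes):
    """Split the k-mers of all reads into one set per node.

    On a real cluster each set would live on a different machine; here
    they are plain Python sets held in one list.
    """
    shards = [set() for _ in range(num_nodes)]
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            shards[owner(kmer, num_nodes)].add(canonical(kmer))
    return shards


def successors(kmer, shards):
    """List the one-base extensions of kmer that exist somewhere in the graph.

    Each candidate is looked up only on the node that would own it, so no
    node needs a copy of the whole graph.
    """
    found = []
    for base in "ACGT":
        candidate = kmer[1:] + base
        if canonical(candidate) in shards[owner(candidate, len(shards))]:
            found.append(candidate)
    return found


def n50(lengths):
    """Contig length at which the running total reaches half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0
```

For example, `distribute_kmers(["ACGTACGA"], 4, 3)` spreads the read's k-mers over three sets, and `successors("ACGT", shards)` then finds the extension `CGTA` by querying only the set that owns it. Because placement depends only on the k-mer sequence, any node can compute where a neighbour lives without a central index.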