Nanopore sequencing and assembly of a human genome with ultra-long reads

29 January 2018 | Miten Jain, Sergey Koren, Karen H Miga, Josh Quick, Arthur C Rand, Thomas A Sasani, John R Tyson, Andrew D Beggs, Alexander T Dilthey, Ian T Fiddes, Sunir Malla, Hannah Marriott, Tom Nieto, Justin O'Grady, Hugh E Olsen, Brent S Pedersen, Arang Rhie, Hollian Richardson, Aaron R Quinlan, Terrance P Snutch, Louise Tee, Benedict Paten, Adam M Phillippy, Jared T Simpson, Nicholas J Loman, Matthew Loose
The authors report the sequencing and assembly of a reference human genome using the MinION nanopore sequencer. They generated 91.2 Gb of sequence data, representing ~30× theoretical coverage, from the GM12878 Utah/CEPH cell line. Reference-based alignment enabled the detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contig NG50 of ~3 Mb. The authors developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb); incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity, to an NG50 of ~6.4 Mb. The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled the assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
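The summary statistics quoted above (theoretical coverage, read N50) follow standard definitions that can be sketched in a few lines. The Python below is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, the toy lengths are made up, and the 3.1 Gb genome size is an assumed round figure used only to show how 91.2 Gb of sequence works out to roughly 30× coverage.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def ng50(lengths, genome_size):
    """Like N50, but the halfway point is half the genome size, not half the assembly."""
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= genome_size / 2:
            return length
    return None  # the sequences span less than half of the genome


def theoretical_coverage(total_bases, genome_size):
    """Average depth if the sequenced bases were spread evenly over the genome."""
    return total_bases / genome_size


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # made-up lengths, for illustration only
    print("N50:", n50(toy_lengths))
    print("NG50 (genome size 30):", ng50(toy_lengths, 30))

    # Assumption: a human genome size of about 3.1 Gb.
    depth = theoretical_coverage(91.2e9, 3.1e9)
    print(f"Theoretical coverage: {depth:.1f}x")
```

Because NG50 is computed against the genome size rather than the assembly size, it allows assemblies of different total length to be compared on the same scale.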