APRIL 2018 | Miten Jain, Sergey Koren, Karen H Miga, Josh Quick, Arthur C Rand, Thomas A Sasani, John R Tyson, Andrew D Beggs, Alexander T Dilthey, Ian T Fiddes, Sunir Malla, Hannah Marriott, Tom Nieto, Justin O'Grady, Hugh E Olsen, Brent S Pedersen, Arang Rhie, Hollian Richardson, Aaron R Quinlan, Terrance P Snutch, Louise Tee, Benedict Paten, Adam M Phillippy, Jared T Simpson, Nicholas J Loman & Matthew Loose
Researchers have successfully sequenced and assembled a human genome using the MinION nanopore sequencer, achieving a high level of accuracy and contiguity. The study involved sequencing 91.2 Gb of data, representing approximately 30× theoretical coverage, and generating ultra-long reads (N50 > 100 kb, up to 882 kb). Incorporating additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity, resulting in an NG50 of 6.4 Mb. The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled the assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
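The contiguity and coverage figures above follow standard definitions. The Python below is a minimal sketch of how N50 and NG50 are conventionally computed and how theoretical coverage is estimated; it is not the authors' pipeline. The function names and toy contig lengths are hypothetical, and the ~3.1 Gb genome size used in the coverage check is an assumption not stated in this summary.

```python
def n50(lengths, total=None):
    """Return the length L such that sequences of length >= L together
    cover at least half of `total` (default: the sum of `lengths`).
    Returns 0 if the sequences never reach half of `total`."""
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def ng50(lengths, genome_size):
    """NG50: same as N50, but measured against an assumed genome size
    instead of the total assembly length."""
    return n50(lengths, total=genome_size)


# Hypothetical toy contig lengths (arbitrary units), not data from the study.
toy_contigs = [50, 40, 30, 20, 10]

print(n50(toy_contigs))        # 40: 50 + 40 = 90 covers half of 150
print(ng50(toy_contigs, 200))  # 30: 50 + 40 + 30 = 120 covers half of 200

# Theoretical coverage = sequenced bases / genome size.
# ASSUMPTION: a genome size of about 3.1 Gb.
coverage = 91.2e9 / 3.1e9
print(f"{coverage:.1f}x")      # 29.4x, i.e. roughly 30x
```

Because NG50 is measured against the genome size rather than the assembly size, it allows assemblies of different total lengths to be compared on the same footing.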
The human genome is used as a benchmark for DNA sequencing instruments. Despite advances in sequencing technology, assembling human genomes with high accuracy and completeness remains challenging because of the genome's size, heterozygosity, GC% bias, repetitive sequences, and segmental duplications. Single-molecule sequencers like PacBio can produce longer reads, but they have higher error rates. Improvements in the protein pore, library preparation, sequencing speed, and control software have now made it possible to sequence a whole human genome using only a MinION nanopore sequencer.
The study used MinION R9.4 1D chemistry to sequence and assemble a reference human genome for GM12878. The sequencing data included ultra-long reads up to 882 kb in length. Assembling reads base-called by Metrichor produced a whole-genome assembly with an NG50 of 3 Mb. Incorporating complementary short-read sequencing data improved the assembly accuracy to 99.88%. The study also demonstrated the ability to phase the MHC locus and resolve complex repeat regions.
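To put the 99.88% accuracy figure in more familiar terms, the sketch below converts an identity fraction into a Phred-scaled quality value and an expected error count, using the standard relation Q = -10 log10(error rate). The helper names are hypothetical, and the calculation illustrates the arithmetic only; it is not how the study measured accuracy.

```python
import math


def phred_from_identity(identity):
    """Convert an identity fraction (e.g. 0.9988) to a Phred-scaled Q value."""
    if not 0.0 <= identity < 1.0:
        raise ValueError("identity must be in [0, 1)")
    return -10.0 * math.log10(1.0 - identity)


def expected_errors(identity, span_bp):
    """Expected number of erroneous bases across a span of `span_bp` bases."""
    return (1.0 - identity) * span_bp


identity = 0.9988  # polished assembly accuracy reported in the text
print(f"Q{phred_from_identity(identity):.1f}")                  # Q29.2
print(f"{expected_errors(identity, 100_000):.0f} per 100 kb")   # 120 per 100 kb
```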
Ultra-long reads were also shown to close gaps in the human reference genome, including 12 gaps each over 50 kb in length, and the study identified 83,980 bp of previously unknown euchromatic sequence. Additionally, the study measured telomere repeat lengths, finding evidence for telomeric arrays spanning 2–11 kb within 14 subtelomeric regions for GM12878.
The study also evaluated the accuracy of nanopore data for calling genotypes at known single-nucleotide polymorphisms (SNPs) and structural variants (SVs). The results showed that nanopore data could accurately genotype SNPs and SVs, with high concordance with Illumina and PacBio data. The study also demonstrated the ability to detect