2018 | Wouter De Coster, Svenn D'Hert, Darrin T. Schultz, Marc Cruts and Christine Van Broeckhoven
NanoPack is a set of Python tools for visualizing and processing long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. It addresses the limitations of existing quality control (QC) tools, which are optimized for short-read sequencing technologies like Illumina. NanoPack provides flexible and customizable statistical evaluation and visualization tools, as well as data processing scripts for filtering and trimming reads, and removing contaminant DNA.
NanoPlot and NanoComp are the main visualization tools in NanoPack. NanoPlot generates statistical summaries and QC graphs, including read length histograms, cumulative yield plots, and bivariate plots comparing read lengths with quality scores. NanoComp compares read length and quality distributions across different datasets, such as barcodes or experiments. Both tools can write plots in several formats, including PNG, JPG, PDF, and SVG. For large datasets, the bivariate plots can be drawn with 2D kernel density estimation or hexagonal bins instead of individual points.
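As an illustration of the kind of summary statistics such a report contains, the following minimal sketch computes a read length N50, a mean quality score, and a cumulative yield series. It is not NanoPlot's code: the function names are hypothetical, and averaging quality through error probabilities (rather than averaging Phred scores directly) is an assumption made for this example.

```python
import math


def read_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no reads


def mean_quality(phred_scores):
    """Average Phred scores by converting to error probabilities and back (assumed approach)."""
    if not phred_scores:
        return 0.0
    mean_error = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_error)


def cumulative_yield(lengths):
    """Return running base totals in the order the reads are given (e.g. sorted by time)."""
    totals = []
    running = 0
    for length in lengths:
        running += length
        totals.append(running)
    return totals
```

Averaging error probabilities weights low-quality values more heavily than a plain arithmetic mean of Phred scores would, so a single poor value lowers the result noticeably.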
NanoFilt and NanoLyse are used for processing reads in streaming applications. NanoFilt filters reads based on quality, length, and GC content, while NanoLyse removes contaminant DNA using the Minimap2 aligner. These tools are designed to integrate into existing pipelines before alignment.
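The streaming filter idea can be sketched as a generator that reads FASTQ records one at a time and yields only those passing the thresholds. This is a simplified illustration, not NanoFilt's implementation: it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and default thresholds are hypothetical.

```python
import io
import math


def fastq_records(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of stream (a blank header line also stops parsing)
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def mean_phred(quality, offset=33):
    """Mean quality of one read, averaged via error probabilities (Phred+33 assumed)."""
    errors = [10 ** (-(ord(char) - offset) / 10) for char in quality]
    return -10 * math.log10(sum(errors) / len(errors))


def gc_fraction(sequence):
    """Fraction of G and C bases in the sequence."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def filter_reads(handle, min_quality=10.0, min_length=500, min_gc=0.0, max_gc=1.0):
    """Yield only the records that pass the length, quality, and GC thresholds."""
    for header, sequence, quality in fastq_records(handle):
        if not sequence or len(sequence) < min_length:
            continue
        if mean_phred(quality) < min_quality:
            continue
        if not min_gc <= gc_fraction(sequence) <= max_gc:
            continue
        yield header, sequence, quality


# Small in-memory demonstration with made-up reads.
demo = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"  # passes: long enough, high quality
    "@read2\nACG\n+\nIII\n"            # fails: shorter than min_length
    "@read3\nATATATAT\n+\n++++++++\n"  # fails: mean quality below threshold
)
kept = [header for header, _, _ in filter_reads(demo, min_quality=12, min_length=5)]
```

Because the filter is a generator over a file-like object, it can read from standard input and write to standard output, which is what lets this kind of tool sit between decompression and alignment in a pipeline without holding the dataset in memory.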
NanoPack is available on major operating systems and can be installed via PyPI and bioconda. It provides a graphical user interface, a web service, and command-line tools. The software is designed to be user-friendly and accessible, with comprehensive reports that include all summary statistics and plots. NanoPack is a valuable tool for researchers working with long-read sequencing data, offering a flexible and efficient solution for data visualization and processing.