**fastp: An Ultra-Fast All-In-One FASTQ Preprocessor**
Shifu Chen, Yanqing Zhou, Yaru Chen, and Jia Gu
**Motivation:**
Quality control and preprocessing of FASTQ files are crucial for downstream analysis. Traditional tools often require multiple steps and are inefficient due to repeated data loading and I/O operations. fastp is designed to address these issues by providing a single tool for quality control, adapter trimming, quality filtering, and other operations.
**Results:**
fastp is developed in C++ with multi-threading support, making it significantly faster than other tools like Trimmomatic or Cutadapt. It can perform various operations with a single scan of FASTQ data, including quality control, adapter trimming, quality filtering, and per-read quality pruning. fastp also offers supplementary features such as unique molecular identifier (UMI) preprocessing, per-read polyG tail trimming, and output splitting.
**Availability and Implementation:**
The open-source code and instructions are available at https://github.com/OpenGene/fastp.
**Contact:**
chen@haplox.com
**Introduction:**
Quality control and preprocessing of sequencing data are essential for obtaining high-quality results. fastp integrates multiple functions into one tool, making it more efficient and user-friendly than existing tools such as FastQC, Cutadapt, Trimmomatic, and AfterQC. It supports both single-end and paired-end data and provides additional features such as UMI preprocessing and polyG tail trimming.
**Materials and Methods:**
fastp supports multi-threading parallel processing, allowing for efficient data handling. It includes features such as adapter trimming, base correction, sliding window quality pruning, polyG and polyX tail trimming, UMI preprocessing, output splitting, duplication evaluation, and overrepresented sequence analysis. fastp also provides comprehensive quality control reports in HTML and JSON formats.
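To make one of the listed operations concrete, the following is a minimal Python sketch of sliding window quality pruning from the 5' end toward the 3' end. It is an illustration under stated assumptions, not fastp's C++ implementation: the function names `phred_scores` and `cut_right`, the Phred+33 encoding, and the default window size and quality threshold are all chosen here for the example.

```python
from typing import List, Tuple


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def cut_right(seq: str, quality: str,
              window: int = 4, min_mean_q: float = 20.0) -> Tuple[str, str]:
    """Hypothetical helper: slide a fixed-size window from 5' to 3'.

    At the first window whose mean quality falls below min_mean_q,
    drop that window and everything to its right.
    """
    scores = phred_scores(quality)
    if len(scores) < window:
        # Read is shorter than one window: leave it unchanged in this sketch.
        return seq, quality

    window_sum = sum(scores[:window])
    for start in range(len(scores) - window + 1):
        if start > 0:
            # Update the running sum: add the new base, drop the oldest one.
            window_sum += scores[start + window - 1] - scores[start - 1]
        if window_sum / window < min_mean_q:
            return seq[:start], quality[:start]
    return seq, quality


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(cut_right("ACGTACGTAC", "IIIIII####"))  # ('ACGTA', 'IIIII')
```

In this sketch the whole failing window is discarded, so a good base at the window's left edge can be lost along with the low-quality tail. A production tool may refine the cut position further; the example only shows the windowed-mean idea.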
**Results:**
Experiments comparing fastp with other tools showed that fastp is much faster while producing output of similar or better quality. This combination of speed and quality has made it a popular choice among community users.
**Discussion:**
fastp is a versatile tool for FASTQ file preprocessing that combines a wide range of features with high speed, which has led to its wide adoption in the community.
**Acknowledgements:**
The authors thank the fastp community for feature requests and bug reports.
**Funding:**
This work was supported by Special Funds for Future Industries of Shenzhen and the National Science Foundation of China.
**Conflict of Interest:**
None declared.