Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads

2014 | Jiang, Hongshan; Lei, Rong; Ding, Shou-Wei; Zhu, Shuifang
Skewer is a fast and accurate adapter trimmer for next-generation sequencing (NGS) paired-end reads. It uses a novel bit-masked k-difference matching algorithm with expected time complexity O(kn) and space complexity O(m), where k is the maximum number of differences allowed, n is the read length, and m is the adapter length. The algorithm makes it practical to enumerate every candidate match that meets a specified threshold, such as an error ratio. Skewer also applies a statistical scoring scheme to evaluate candidates during pattern matching and uses paired-end/mate-pair information when it is available. These features were implemented in an industry-standard Linux program, also called Skewer.

Experiments on simulated and real data showed that Skewer outperformed similar tools in both accuracy and speed. Compared with tools of comparable accuracy, it was reported to be one time faster for single-end sequencing, more than 12 times faster for paired-end sequencing, and 49% faster for long mate-pair (LMP) sequencing. Its trimming accuracy was unmatched among the tools compared, while its running time stayed low. Skewer was tested on several NGS applications, including small RNA (sRNA) sequencing, paired-end RNA sequencing, and Nextera LMP sequencing, and showed high sensitivity and specificity in each. It was also efficient in parallel computing, achieving the highest speedup among the adapter trimmers tested.

Performance was evaluated using positive predictive value (PPV), sensitivity (Sen), and specificity (Spec). Skewer outperformed other adapter trimmers on these metrics, particularly under high-stringency conditions. Its use of paired-end information reduced false positives and improved overall accuracy. Skewer was also found to be more efficient than other tools in memory usage and processing speed.
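The three evaluation metrics named above follow the standard confusion-matrix definitions. The sketch below only illustrates those formulas. The function name and the counts are hypothetical and are not taken from the paper, and how a trimmed read is judged a true or false positive depends on the benchmark definition, which this summary does not give.

```python
def trimming_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics (hypothetical helper, not Skewer code).

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives.
    """
    ppv = tp / (tp + fp)    # positive predictive value (precision)
    sen = tp / (tp + fn)    # sensitivity (recall)
    spec = tn / (tn + fp)   # specificity
    return {"PPV": ppv, "Sen": sen, "Spec": spec}


# Made-up counts for illustration only.
metrics = trimming_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A trimmer that reports fewer false positives raises both PPV and Spec, which matches the summary's point that using paired-end information improves accuracy.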
Skewer handled large datasets efficiently and was suitable for a range of NGS applications. Its handling of base-call quality values and paired-end information made it a versatile and accurate adapter trimmer for NGS data.
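To make the k-difference matching problem concrete, here is a minimal sketch that locates an adapter in a read while allowing a bounded number of mismatches, insertions, and deletions. It is not Skewer's bit-masked algorithm: it uses the classic Sellers dynamic programming recurrence, which needs O(mn) time and O(m) space. The function names, the `max_error_ratio` parameter, and the example sequences are assumptions made for illustration. The sketch also ignores partial adapters at the read end, base-call qualities, and paired-end information.

```python
def k_difference_ends(pattern, text, k):
    """Return (end, distance) for each 1-based end position in `text` where
    `pattern` matches some substring with at most `k` differences.

    Classic Sellers DP, keeping one column of the table (O(m) space).
    """
    m = len(pattern)
    col = list(range(m + 1))  # distances against the empty text prefix
    hits = []
    for j, ch in enumerate(text, start=1):
        diag = col[0]  # D[i-1][j-1]; col[0] stays 0 so a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            best = min(diag + cost,     # match or substitution
                       col[i] + 1,      # extra character in the text
                       col[i - 1] + 1)  # missing character in the text
            diag = col[i]
            col[i] = best
        if col[m] <= k:
            hits.append((j, col[m]))
    return hits


def trim_adapter(read, adapter, max_error_ratio=0.1):
    """Cut `read` where a full-length adapter occurrence begins (hypothetical helper)."""
    k = int(max_error_ratio * len(adapter))
    # Matching the reversed strings turns end positions into start positions.
    hits = k_difference_ends(adapter[::-1], read[::-1], k)
    if not hits:
        return read
    # Prefer the fewest differences, then the earliest start in the read.
    end_in_reversed, _ = min(hits, key=lambda h: (h[1], -h[0]))
    return read[: len(read) - end_in_reversed]


print(trim_adapter("ACGTACGTAGATCGGAAGAGC", "AGATCGGAAGAGC"))
```

Deriving k from an error ratio mirrors the thresholding idea mentioned in the summary: with a 13-base adapter and a ratio of 0.1, one difference is tolerated.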