20 July 2007 | Susan M Huse, Julie A Huber, Hilary G Morrison, Mitchell L Sogin and David Mark Welch
The article by Huse et al. (2007) evaluates the accuracy and quality of massively parallel DNA pyrosequencing on the Roche GS20 system. The study uses the V6 hypervariable region of microbial ribosomal DNA (rDNA) to measure the error rate and to identify factors that can improve data quality. The authors found a 99.5% accuracy rate (an error rate of about 0.5%) in unassembled sequences, which improves to 99.75% or better once low-quality reads are removed. They identified several diagnostic features that correlate with the presence of errors: average quality score, read length, and the presence of ambiguous base calls. A small proportion of low-quality reads, likely from multi-templated beads, is responsible for the majority of sequencing errors. Removing reads that contain ambiguous bases reduces the overall error rate from about 0.5% to 0.25%. The authors conclude that their strategy for detecting low-quality reads can improve the accuracy of pyrosequencing data in molecular ecology studies, surpassing the accuracy of traditional capillary methods.
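The read-screening idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Read` class, the function names, and every threshold are hypothetical placeholders chosen for illustration. It only shows how the three indicators named in the summary (ambiguous base calls, read length, average quality score) might be combined into a filter.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Read:
    name: str
    sequence: str
    qualities: List[int]  # per-base quality scores


def is_low_quality(read, min_len=50, max_len=130, min_mean_q=25.0):
    """Flag a read using the three indicators named in the summary.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    if "N" in read.sequence.upper():  # ambiguous base call
        return True
    if not (min_len <= len(read.sequence) <= max_len):  # atypical length
        return True
    if not read.qualities or mean(read.qualities) < min_mean_q:  # low average quality
        return True
    return False


def filter_reads(reads, **thresholds):
    """Return only the reads that pass every check."""
    return [r for r in reads if not is_low_quality(r, **thresholds)]


# Toy data with deliberately short reads, so the length bounds are lowered.
reads = [
    Read("clean", "ACGTAC", [30] * 6),
    Read("ambiguous", "ACNTAC", [30] * 6),
    Read("short", "AC", [30] * 2),
    Read("low_q", "ACGTAC", [10] * 6),
]
kept = filter_reads(reads, min_len=4, max_len=10, min_mean_q=25.0)
```

In this toy run only the read named `clean` survives; each of the other three fails exactly one check. Real thresholds would need to be chosen from the length and quality distributions of the data set at hand.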