BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs

2015 | Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva and Evgeny M. Zdobnov
The article introduces BUSCO (Benchmarking Universal Single-Copy Orthologs), a tool for assessing the completeness of genome assemblies and annotations. BUSCO uses single-copy orthologs to evaluate gene content, providing a quantitative measure of assembly and annotation completeness. The authors developed BUSCO sets for six major phylogenetic clades, including vertebrates, arthropods, metazoans, fungi, and eukaryotes, which contain 3023, 2675, 843, 1438, and 429 genes, respectively. These sets are used to assess genome assemblies, annotated gene sets, and transcriptomes. The BUSCO assessment workflow classifies genes as 'complete,' 'duplicated,' 'fragmented,' or 'missing,' and reports the 'number of genes used,' which indicates the resolution and confidence of the assessment. The tool is implemented in Python and is available for download from http://busco.ezlab.org. The authors report that BUSCO assessments provide high-resolution quantifications that can be used to compare newly sequenced draft genome assemblies to gold-standard models or to track iterative improvements in assemblies or annotations.
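To make the classification step concrete, here is a minimal sketch of how per-gene labels could be turned into a completeness summary. It is not BUSCO's own code: the function name `summarize`, the example labels, and the simplification that every gene carries exactly one of the four labels are assumptions made for illustration.

```python
from collections import Counter

# The four categories named in the summary above.
CATEGORIES = ("complete", "duplicated", "fragmented", "missing")


def summarize(labels):
    """Return (percentages, n) for a sequence of per-gene category labels.

    Assumption for this sketch: each gene has exactly one label, so the
    four percentages sum to 100. `n` is the number of genes used.
    """
    labels = list(labels)
    n = len(labels)
    if n == 0:
        raise ValueError("no genes to summarize")
    unknown = set(labels) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(labels)
    percentages = {c: 100.0 * counts[c] / n for c in CATEGORIES}
    return percentages, n


# Hypothetical labels for a set of 100 genes (made-up data, not a real result).
example_labels = (
    ["complete"] * 80
    + ["duplicated"] * 5
    + ["fragmented"] * 10
    + ["missing"] * 5
)
percentages, n = summarize(example_labels)
for category in CATEGORIES:
    print(f"{category}: {percentages[category]:.1f}%")
print(f"genes used: {n}")
```

A larger `n` gives finer steps between possible percentages, which is one way to read the statement that the number of genes used reflects the resolution of the assessment.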