Robert M. Waterhouse, Mathieu Seppey, Felipe A. Simão, Mosè Manni, Panagiotis Ioannidis, Guennadi Klioutchnikov, Evgenia V. Kriventseva & Evgeny M. Zdobnov
The article discusses the applications and updates of BUSCO (Benchmarking Universal Single-Copy Orthologs), a tool designed to assess the completeness of genome assemblies, annotated gene sets, and transcriptomes. BUSCO uses single-copy orthologs from OrthoDB to provide quantitative measures of gene content completeness, which are crucial for genomics data quality control. The tool has been widely adopted by the genomics community for various purposes, including building robust training sets for gene predictors, selecting high-quality reference strains or species for comparative analyses, and identifying reliable markers for large-scale phylogenomics studies.
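To make the idea of a quantitative completeness measure concrete, the following is a minimal sketch rather than BUSCO's own code. It assumes each expected ortholog has already been classified as complete single-copy, complete duplicated, fragmented, or missing, and it turns those labels into percentages. The function names `summarize` and `format_summary`, the status labels, and the input list are illustrative assumptions; the one-line string is modelled on the compact notation that BUSCO reports.

```python
from collections import Counter

# Assumed status labels for this sketch (one label per expected ortholog).
STATUSES = ("Complete", "Duplicated", "Fragmented", "Missing")


def summarize(statuses):
    """Return the percentage of orthologs in each category, plus the total n."""
    counts = Counter(statuses)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected status labels: {sorted(unknown)}")
    n = len(statuses)
    if n == 0:
        raise ValueError("no orthologs to summarize")
    pct = {s: 100.0 * counts[s] / n for s in STATUSES}
    # Total completeness counts both single-copy and duplicated orthologs.
    pct["CompleteTotal"] = pct["Complete"] + pct["Duplicated"]
    pct["n"] = n
    return pct


def format_summary(pct):
    """Render the percentages in a compact one-line notation."""
    return (
        f"C:{pct['CompleteTotal']:.1f}%"
        f"[S:{pct['Complete']:.1f}%,D:{pct['Duplicated']:.1f}%],"
        f"F:{pct['Fragmented']:.1f}%,M:{pct['Missing']:.1f}%,n:{pct['n']}"
    )


if __name__ == "__main__":
    # Hypothetical classification of ten orthologs.
    example = ["Complete"] * 7 + ["Duplicated", "Fragmented", "Missing"]
    print(format_summary(summarize(example)))
```

A higher share of complete orthologs and a lower share of fragmented or missing ones suggests a more complete assembly, gene set, or transcriptome.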
Key updates to BUSCO include enhanced features and extended datasets that make it more flexible and user-friendly. The tool now supports multiple lineages with larger, lineage-specific datasets, which improves the resolution of assessments. BUSCO v2 and v3 also improve the underlying analysis software, so assessments run faster and report more information on the progress of each analysis.
The article also highlights several applications of BUSCO:
1. **Genomics Data Quality Control**: BUSCO assessments help guide iterative genome reassemblies and annotations, as demonstrated by improvements in the Postman butterfly (*Heliconius melpomene*) and Atlantic cod (*Gadus morhua*).
2. **Gene Predictor Training Sets**: BUSCO-trained parameters significantly improve the quality of gene model annotations, as shown in comparisons with pre-trained parameters.
3. **Comparative Genomic Analyses**: BUSCO completeness metrics help select the most complete genomic resources for comparative studies, such as in the analysis of *Streptomyces* and *Lactobacillus* genomes.
4. **Reliable Phylogenomics Markers**: BUSCO identifies reliable single-copy markers from different types of genomic data, facilitating large-scale phylogenomic studies, as illustrated by the analysis of rodent genomes and transcriptomes.
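Applications 3 and 4 above can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the workflow used in the article: it assumes completeness percentages and per-species sets of complete single-copy ortholog identifiers have already been collected from earlier assessments. The function names, species names, and identifiers are all hypothetical.

```python
from collections import Counter


def most_complete(completeness):
    """Return the resource name with the highest completeness percentage."""
    if not completeness:
        raise ValueError("no resources to compare")
    return max(completeness, key=completeness.get)


def shared_single_copy_markers(single_copy_by_species, min_fraction=1.0):
    """Return ortholog IDs that are single-copy in at least min_fraction of species."""
    n_species = len(single_copy_by_species)
    if n_species == 0:
        return []
    counts = Counter()
    for ids in single_copy_by_species.values():
        counts.update(set(ids))  # count each ortholog once per species
    return sorted(b for b, c in counts.items() if c / n_species >= min_fraction)


if __name__ == "__main__":
    # Hypothetical completeness scores for three strains of one species.
    scores = {"strain_a": 91.5, "strain_b": 98.2, "strain_c": 95.0}
    print(most_complete(scores))

    # Hypothetical single-copy ortholog IDs recovered per species.
    markers = {
        "species_1": {"og_001", "og_002", "og_003"},
        "species_2": {"og_001", "og_003"},
        "species_3": {"og_001", "og_002"},
    }
    print(shared_single_copy_markers(markers))       # present in every species
    print(shared_single_copy_markers(markers, 0.5))  # present in at least half
```

Relaxing `min_fraction` trades a larger marker set for more missing data per marker, which is a common consideration when mixing genomes with less complete transcriptomes.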
Overall, BUSCO remains an essential resource for genomics research, offering a comprehensive and flexible approach to assessing and improving the quality of genomic data.