GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database

2020 | Pierre-Alain Chaumeil*, Aaron J. Mussig, Philip Hugenholtz and Donovan H. Parks*
GTDB-Tk is a computational tool for classifying bacterial and archaeal genomes using the Genome Taxonomy Database (GTDB). It places each genome in a domain-specific reference tree and assigns taxonomic ranks using relative evolutionary divergence (RED) and average nucleotide identity (ANI). The tool was tested on 10,156 metagenome-assembled genomes (MAGs), and 89.5% of its classifications matched manual curation. Compared with GTDB, GTDB-Tk classifications are more resolved in some cases and less resolved in others. The tool is implemented in Python and available online.
GTDB-Tk can classify thousands of genomes in parallel and is suitable for genomes with at least 50% completeness and at most 10% contamination. It is recommended for studies focused on evolutionary relationships or taxonomic reclassification. GTDB-Tk serves as a basis for future GTDB releases and is also available as an online resource through KBase. It gives the research community a practical way to classify microbial genomes recovered from metagenomic datasets.
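The suitability criteria quoted above (at least 50% completeness, at most 10% contamination) can be applied as a simple pre-filter before classification. The following is a minimal sketch, not part of GTDB-Tk itself: the `GenomeQuality` record, its field names, and the helper functions are hypothetical, and the completeness and contamination estimates are assumed to come from a separate quality-assessment step.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds taken from the summary text.
MIN_COMPLETENESS = 50.0   # percent
MAX_CONTAMINATION = 10.0  # percent


@dataclass(frozen=True)
class GenomeQuality:
    """Hypothetical record; field names are illustrative, not a GTDB-Tk format."""
    name: str
    completeness: float   # estimated completeness, percent
    contamination: float  # estimated contamination, percent


def is_suitable(genome: GenomeQuality) -> bool:
    """Return True if the genome meets both quality thresholds (inclusive)."""
    return (genome.completeness >= MIN_COMPLETENESS
            and genome.contamination <= MAX_CONTAMINATION)


def select_suitable(genomes: Iterable[GenomeQuality]) -> List[GenomeQuality]:
    """Keep only genomes that pass the filter, preserving input order."""
    return [g for g in genomes if is_suitable(g)]


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    mags = [
        GenomeQuality("mag_001", completeness=92.3, contamination=1.8),
        GenomeQuality("mag_002", completeness=47.5, contamination=0.9),
        GenomeQuality("mag_003", completeness=71.0, contamination=12.4),
    ]
    for genome in select_suitable(mags):
        print(genome.name)
```

Both bounds are treated as inclusive here, so a genome at exactly 50% completeness and 10% contamination passes. That matches the "at least" and "at most" wording above.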