High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries

2018 | Chirag Jain, Luis M. Rodriguez-R, Adam M. Phillippy, Konstantinos T. Konstantinidis & Srinivas Aluru
This study introduces FastANI, a method for estimating Average Nucleotide Identity (ANI), and applies it to roughly 90,000 prokaryotic genomes to test whether species have discrete genetic boundaries. FastANI is accurate for both complete and draft genomes and is significantly faster than alignment-based methods, which made it practical to compute pairwise ANI values for all prokaryotic genomes in the NCBI database.

The pairwise values show a strongly bimodal distribution with a wide gap between the two peaks: 99.8% of genome pairs have an ANI either above 95% (within species) or below 83% (between species). This discontinuity was consistent across different time periods and robust to additions to the database. The 95% ANI threshold also proved to be a valid classifier for species boundaries, identifying species with high accuracy.

The results suggest that prokaryotic species are distinct units with clear boundaries and that ANI is a reliable basis for defining them. They support using ANI as a standard measure of relatedness in place of DNA-DNA hybridization. The study notes that accurate species definition matters for microbial ecology and evolution, as well as for practical applications such as disease diagnosis and regulation of organism transport, and it emphasizes the need for improved bioinformatics tools to handle the growing volume of genomic data.
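To make the thresholds concrete, the following is a minimal sketch of how pairwise ANI values could be sorted into the three regions the summary describes: within species (95% and above), between species (below 83%), and the sparsely populated gap in between. It is not the authors' code. The three-column, tab-separated input (query, reference, ANI), the function names, and the example values are all assumptions made for illustration, as is the choice to count a value of exactly 95% as within species.

```python
import csv
import io

SPECIES_ANI = 95.0  # within-species cutoff quoted in the summary
GAP_FLOOR = 83.0    # between-species pairs are reported to fall below this


def classify_pair(ani):
    """Label one ANI value (in percent) relative to the two cutoffs."""
    if ani >= SPECIES_ANI:  # assumption: exactly 95% counts as same species
        return "same_species"
    if ani < GAP_FLOOR:
        return "different_species"
    return "gap"


def summarize(tsv_text):
    """Count labels over tab-separated rows of: query, reference, ANI."""
    counts = {"same_species": 0, "different_species": 0, "gap": 0}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        counts[classify_pair(float(row[2]))] += 1
    return counts


# Made-up values for illustration only; these are not results from the study.
example = "\n".join([
    "genomeA.fna\tgenomeB.fna\t98.7",
    "genomeA.fna\tgenomeC.fna\t79.2",
    "genomeB.fna\tgenomeD.fna\t88.4",
])

print(summarize(example))
# → {'same_species': 1, 'different_species': 1, 'gap': 1}
```

On a real dataset, the share of pairs labelled `gap` is the quantity the study reports as small: only about 0.2% of pairs fall outside the two main regions.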