High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries

Chirag Jain, Luis M. Rodriguez-R, Adam M. Phillippy, Konstantinos T. Konstantinidis & Srinivas Aluru | Nature Communications (2018) 9:5114 | DOI: 10.1038/s41467-018-07641-9 | www.nature.com/naturecommunications
The study by Jain and colleagues presents FastANI, a method for estimating Average Nucleotide Identity (ANI) using alignment-free approximate sequence mapping. FastANI matches the accuracy of traditional alignment-based ANI methods on both finished and draft genomes while running up to three orders of magnitude faster, which makes it suitable for large-scale analysis of prokaryotic genomes. The researchers used FastANI to compute pairwise ANI values among the roughly 90,000 prokaryotic genomes available in the NCBI database and found a clear genetic discontinuity: 99.8% of the analyzed genome pairs show either >95% ANI (intra-species) or <83% ANI (inter-species), with very few pairs falling in between. This discontinuity is consistent across different datasets and robust to historical additions to the genome databases.

The study also shows that the 95% ANI threshold effectively demarcates species boundaries, reproducing named species with a recall of 98.5% and a precision of 93.1%. FastANI is expected to be useful for clinical and environmental microbial genome analysis, supporting more accurate species definition and clearer communication about prokaryotic species.
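To make the recall and precision figures concrete, the sketch below scores a 95% ANI cutoff against named species on a handful of made-up genome pairs. It is only an illustration: the function name, the genome IDs, the ANI values, and the exact pairwise definitions of precision and recall are assumptions for this example, not the authors' evaluation code or data.

```python
from typing import Dict, Iterable, Tuple

ANI_THRESHOLD = 95.0  # species cutoff discussed in the summary above


def species_boundary_metrics(
    pairs: Iterable[Tuple[str, str, float]],
    species: Dict[str, str],
    threshold: float = ANI_THRESHOLD,
) -> Tuple[float, float]:
    """Return (precision, recall) of 'ANI >= threshold' as a same-species test.

    Assumed definitions (hypothetical, for illustration only):
      precision = pairs at or above threshold that share a species name
                  / all pairs at or above threshold
      recall    = pairs at or above threshold that share a species name
                  / all pairs that share a species name
    """
    tp = fp = fn = 0
    for genome_a, genome_b, ani in pairs:
        same_named = species[genome_a] == species[genome_b]
        same_predicted = ani >= threshold
        if same_predicted and same_named:
            tp += 1
        elif same_predicted:
            fp += 1
        elif same_named:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy input: invented genome IDs, species labels, and ANI percentages.
toy_species = {"g1": "A", "g2": "A", "g3": "A", "g4": "B", "g5": "B"}
toy_pairs = [
    ("g1", "g2", 98.7),  # same species, above cutoff
    ("g1", "g3", 94.2),  # same species, below cutoff (missed)
    ("g2", "g3", 96.1),  # same species, above cutoff
    ("g4", "g5", 99.0),  # same species, above cutoff
    ("g1", "g4", 81.5),  # different species, below cutoff
    ("g3", "g4", 95.4),  # different species, above cutoff (false match)
]

precision, recall = species_boundary_metrics(toy_pairs, toy_species)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On real data, the pairs would come from an ANI tool's output and the labels from a taxonomy source; pairs for which no ANI value is reported would need to be handled explicitly, for instance by treating them as below the cutoff.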