15 February 2017 | Seok-Hwan Yoon · Sung-min Ha · Jeongmin Lim · Soonjae Kwon · Jongsik Chun
This study evaluates four algorithms for calculating average nucleotide identity (ANI), a method used to define species boundaries in Archaea and Bacteria. The four algorithms are ANIb (using BLAST), ANIm (using MUMmer), OrthoANIb (using BLAST), and OrthoANIu (using USEARCH). The evaluation was performed on over 100,000 genome pairs with varying sizes. The results showed that OrthoANIb and OrthoANIu had good correlation with the standard ANIb method across the entire range of ANI values. ANIm showed poor correlation for ANI values below 90%. ANIm and OrthoANIu were significantly faster than ANIb, with run-times reduced by 53- and 22-fold, respectively, for genomes larger than 7 Mbp.
The OrthoANIu method can greatly speed up ANI calculations without losing accuracy. A web service for calculating OrthoANIu is available at http://www.ezbiocloud.net/tools/ani. A standalone JAVA program for large-scale calculations is available at http://www.ezbiocloud.net/tools/orthoaniu. The study highlights the importance of evaluating ANI calculation tools for accuracy and computational efficiency, especially as the number of genome sequences continues to grow. The results suggest that OrthoANIu is a suitable method for large-scale ANI calculations in bioinformatics pipelines.
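
The two quantities the evaluation rests on, agreement with the reference ANIb values and fold reduction in run time, can be illustrated with a minimal Python sketch. It assumes that agreement is measured with a Pearson correlation coefficient, which the summary above does not specify. The function names `pearson` and `fold_speedup` are hypothetical, and the numbers are placeholders chosen for illustration, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def fold_speedup(reference_seconds, candidate_seconds):
    """How many times faster the candidate method is than the reference."""
    if candidate_seconds <= 0:
        raise ValueError("candidate run time must be positive")
    return reference_seconds / candidate_seconds


# Placeholder ANI values (%) for the same genome pairs under two methods.
# These are illustrative only and are NOT results from the study.
anib_values = [78.1, 84.5, 91.2, 95.0, 98.7]
candidate_values = [78.4, 84.1, 91.5, 95.2, 98.6]

r = pearson(anib_values, candidate_values)

# Placeholder run times in seconds, also illustrative only.
speedup = fold_speedup(reference_seconds=220.0, candidate_seconds=10.0)

print(f"correlation with reference: {r:.4f}")
print(f"fold speed-up: {speedup:.1f}")
```

To check behaviour in a particular range, such as ANI values below 90%, the same `pearson` function can be applied to only those genome pairs whose reference value falls in that range, provided enough pairs remain for the coefficient to be meaningful.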