Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?

2015 | Dávid Bajusz, Anita Rácz, Károly Héberger
This study compares eight commonly used similarity metrics for molecular fingerprint-based similarity calculations, including the Tanimoto index, Dice index, Cosine coefficient, and Soergel distance. The comparison uses a large dataset of molecular fingerprints and is evaluated with the Sum of Ranking Differences (SRD) method together with ANOVA. The effects of molecular size, selection methods, and data pretreatment methods on the metrics' performance are also assessed. The Tanimoto index, Dice index, Cosine coefficient, and Soergel distance emerge as the best (and in some cases equivalent) metrics, because they produce rankings closest to the average ranking of all eight metrics. The Euclidean and Manhattan distances are less optimal but may still be useful for data fusion because of their variability and diversity. The study concludes that the four best-performing metrics provide reliable and consistent rankings, making them suitable for a range of chemoinformatic applications.
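To make the metrics named above concrete, here is a minimal Python sketch of their standard forms for binary fingerprints. It is an illustration only, not the study's code: the formulas are the usual textbook definitions rather than anything stated in this summary, fingerprints are assumed to be sets of "on" bit positions, the function names are hypothetical, and returning 0.0 for empty fingerprints is an arbitrary convention chosen here.

```python
import math

def _counts(fp_a, fp_b):
    """Return (a, b, c): on-bits in A, on-bits in B, on-bits shared by both."""
    return len(fp_a), len(fp_b), len(fp_a & fp_b)

def tanimoto(fp_a, fp_b):
    a, b, c = _counts(fp_a, fp_b)
    union = a + b - c
    return c / union if union else 0.0  # assumed convention for empty input

def dice(fp_a, fp_b):
    a, b, c = _counts(fp_a, fp_b)
    return 2 * c / (a + b) if (a + b) else 0.0

def cosine(fp_a, fp_b):
    a, b, c = _counts(fp_a, fp_b)
    return c / math.sqrt(a * b) if a and b else 0.0

def soergel(fp_a, fp_b):
    # A distance, not a similarity: differing bits over the union of on-bits.
    a, b, c = _counts(fp_a, fp_b)
    union = a + b - c
    return (a + b - 2 * c) / union if union else 0.0

def manhattan(fp_a, fp_b):
    # For binary vectors this is the number of differing bit positions.
    a, b, c = _counts(fp_a, fp_b)
    return a + b - 2 * c

def euclidean(fp_a, fp_b):
    return math.sqrt(manhattan(fp_a, fp_b))

# Two toy fingerprints with 4 on-bits each, 2 of them shared.
fp_a = {1, 2, 3, 4}
fp_b = {3, 4, 5, 6}
for metric in (tanimoto, dice, cosine, soergel, manhattan, euclidean):
    print(f"{metric.__name__:>10}: {metric(fp_a, fp_b):.3f}")
```

For binary fingerprints the Soergel distance defined this way is exactly 1 minus the Tanimoto index, and the Dice index is a monotonic function of Tanimoto, which is one way to see why these metrics can yield equivalent rankings. The Euclidean and Manhattan distances, by contrast, are not normalized by the number of on-bits.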