Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?

2015 | Dávid Bajusz¹, Anita Rácz²³ and Károly Héberger²*
This study compares eight similarity/distance metrics for molecular fingerprint-based similarity calculations, using sum of ranking differences (SRD) and analysis of variance (ANOVA). The goal is to determine which metrics produce rankings closest to the average of all metrics. The Tanimoto index, Dice index, Cosine coefficient, and Soergel distance were identified as the best metrics by this criterion. The similarity metrics derived from Euclidean and Manhattan distances are not recommended on their own, although they may be useful in certain cases such as data fusion. The study also evaluates how molecular size, selection method, and data pretreatment affect the ranking behavior of the metrics.
The results show that the Tanimoto index is a reliable and effective choice for similarity calculations, especially when no prior knowledge about the compounds is available. It performs consistently across the scenarios examined, which makes it a suitable general-purpose choice for molecular similarity calculations.
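For readers unfamiliar with the four metrics named above, the following is a minimal sketch of their standard forms for binary fingerprints, not the authors' own code. It assumes each fingerprint is represented as a Python set of on-bit indices; the function names and the two toy fingerprints are hypothetical and chosen only for illustration. Here a and b are the numbers of on-bits in each fingerprint and c is the number of on-bits they share.

```python
from math import sqrt


def bit_counts(fp_a, fp_b):
    """Return (a, b, c): on-bits in A, on-bits in B, on-bits common to both."""
    return len(fp_a), len(fp_b), len(fp_a & fp_b)


def tanimoto(fp_a, fp_b):
    """Tanimoto index: c / (a + b - c), i.e. shared bits over the union."""
    a, b, c = bit_counts(fp_a, fp_b)
    union = a + b - c
    # Convention assumed here: two empty fingerprints score 0.0.
    return c / union if union else 0.0


def dice(fp_a, fp_b):
    """Dice index: 2c / (a + b)."""
    a, b, c = bit_counts(fp_a, fp_b)
    return 2 * c / (a + b) if (a + b) else 0.0


def cosine(fp_a, fp_b):
    """Cosine coefficient: c / sqrt(a * b)."""
    a, b, c = bit_counts(fp_a, fp_b)
    return c / sqrt(a * b) if (a and b) else 0.0


def soergel(fp_a, fp_b):
    """Soergel distance; for binary fingerprints it equals 1 - Tanimoto."""
    return 1.0 - tanimoto(fp_a, fp_b)


# Hypothetical toy fingerprints (sets of on-bit positions), not real molecules.
fp_query = {1, 4, 7, 9}
fp_ref = {1, 4, 8, 9, 12}

print(f"Tanimoto: {tanimoto(fp_query, fp_ref):.3f}")  # 0.500
print(f"Dice:     {dice(fp_query, fp_ref):.3f}")      # 0.667
print(f"Cosine:   {cosine(fp_query, fp_ref):.3f}")    # 0.671
print(f"Soergel:  {soergel(fp_query, fp_ref):.3f}")   # 0.500
```

In this binary form all four metrics are built from the same three counts, and the Soergel distance is simply the complement of the Tanimoto index. In practice a cheminformatics toolkit would normally supply both the fingerprints and the similarity functions.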