Using Information Content to Evaluate Semantic Similarity in a Taxonomy

29 Nov 1995 | Philip Resnik
This paper introduces a new measure of semantic similarity in an IS-A taxonomy, based on the notion of information content. Evaluated against human similarity judgments, the measure reaches a correlation of \( r = 0.79 \), compared to \( r = 0.66 \) for the traditional edge-counting approach. Because it does not depend on path length, the method avoids the problem that links in a taxonomy do not represent uniform distances, and because it draws on probability estimates, it offers a way to adapt a static knowledge structure to multiple contexts. The evaluation uses WordNet's taxonomy and the human ratings from Miller and Charles (1991). The information content of a concept is defined as the negative log likelihood of encountering an instance of that concept, and the similarity of two concepts is the information content of the most informative class that subsumes both, which quantifies the information they share. The paper also discusses related work and potential improvements, such as considering all concepts that dominate both words rather than just the most informative class.
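The definition above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy taxonomy, the concept names, and the corpus counts are all hypothetical, chosen only to show how concept probabilities, information content, and the "most informative common subsumer" fit together. The sketch assumes each concept has a single parent, whereas a real taxonomy such as WordNet allows multiple inheritance and words with several senses.

```python
import math

# Hypothetical toy IS-A taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}

# Hypothetical corpus counts of direct occurrences of each concept.
COUNTS = {"entity": 0, "animal": 10, "artifact": 10,
          "dog": 30, "cat": 30, "car": 20}


def subsumers(concept):
    """Return the concept itself plus every ancestor up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def concept_probabilities():
    """Estimate p(c); an occurrence of a concept also counts for its ancestors."""
    freq = {c: 0 for c in PARENT}
    for concept, n in COUNTS.items():
        for ancestor in subsumers(concept):
            freq[ancestor] += n
    total = freq["entity"]  # the root accumulates every occurrence
    return {c: f / total for c, f in freq.items()}


def information_content(p):
    """Information content as the negative log likelihood of a concept."""
    return -math.log(p)


def similarity(c1, c2, probs):
    """Information content of the most informative class subsuming both."""
    common = set(subsumers(c1)) & set(subsumers(c2))
    return max(information_content(probs[c]) for c in common)


probs = concept_probabilities()
print(similarity("dog", "cat", probs))  # shared class "animal", p = 0.7
print(similarity("dog", "car", probs))  # only the root is shared, p = 1.0
```

With these made-up counts, "dog" and "cat" share the class "animal" and receive a positive score, while "dog" and "car" share only the root, whose probability is 1 and whose information content is therefore zero. The number of links between the concepts never enters the calculation, which is how the measure sidesteps the varying link distances mentioned above.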