Understanding SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation

SimLex-999 is a gold standard resource designed to evaluate distributional semantic models by quantifying similarity rather than association. Unlike existing resources such as WordSim-353 and MEN, SimLex-999 explicitly measures the similarity of pairs of entities, distinguishing between concepts that are associated but not similar (e.g., "Freud" and "psychology") and those that are truly similar (e.g., "cup" and "mug"). This focus on similarity incentivizes the development of models with a broader range of applications. SimLex-999 includes a diverse set of concrete and abstract adjective, noun, and verb pairs, along with independent ratings for concreteness and association strength. This diversity enables fine-grained analyses of model performance on different types of concepts. State-of-the-art models perform below the inter-annotator agreement ceiling on SimLex-999, indicating significant room for improvement. The paper also explores how distributional models can be improved in similarity estimation, particularly through dependency parsing and larger context windows. Overall, SimLex-999 provides a valuable tool for advancing the field of distributional semantics by facilitating better-defined evaluations and more nuanced analyses.
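
To make the evaluation idea concrete, the following is a minimal sketch of how a model might be scored against human similarity ratings. It assumes the customary setup for word-similarity benchmarks: cosine similarity between word vectors, compared with the human ratings using Spearman's rank correlation. The summary above does not spell out the metric, so treat that choice as an assumption. The vectors and ratings are invented for illustration and are not taken from SimLex-999; the names `toy_vectors`, `toy_ratings`, and `evaluate` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(vectors, rated_pairs):
    """Correlate model cosine similarities with human ratings."""
    model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in rated_pairs]
    human_scores = [rating for _, _, rating in rated_pairs]
    return spearman(model_scores, human_scores)


# Invented 3-dimensional vectors standing in for a trained model (assumption).
toy_vectors = {
    "cup": [0.9, 0.1, 0.0],
    "mug": [0.8, 0.2, 0.1],
    "freud": [0.1, 0.9, 0.3],
    "psychology": [0.2, 0.6, 0.7],
}

# Invented ratings on a 0-10 scale (assumption, not SimLex-999 values).
# Associated-but-dissimilar pairs get low scores; similar pairs get high ones.
toy_ratings = [
    ("cup", "mug", 9.0),
    ("freud", "psychology", 2.0),
    ("cup", "psychology", 0.5),
    ("mug", "freud", 0.3),
]

rho = evaluate(toy_vectors, toy_ratings)
print(f"Spearman correlation with the toy ratings: {rho:.2f}")
```

Because only the ranking of pairs matters to Spearman's rho, a model is not penalized for using a different numeric scale from the annotators, only for ordering pairs differently from them. In this toy case the vectors give "Freud" and "psychology" a high cosine similarity even though the invented rating is low, which is the kind of association-versus-similarity gap the resource is designed to expose.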

SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation

15 Aug 2014 | Felix Hill, Roi Reichart, Anna Korhonen