A SICK cure for the evaluation of compositional distributional semantic models

M. Marelli, S. Menini, M. Baroni, L. Bentivogli, R. Bernardi, R. Zamparelli
The paper introduces SICK (Sentences Involving Compositional Knowledge), a large-scale English benchmark dataset designed to evaluate compositional distributional semantic models (CDSMs). SICK consists of approximately 10,000 English sentence pairs that exhibit a range of lexical, syntactic, and semantic phenomena relevant to CDSMs while avoiding aspects such as idiomatic expressions, named entities, and telegraphic language. Each pair is annotated for two semantic tasks: relatedness in meaning (on a 5-point scale) and entailment relation (three labels: entailment, contradiction, and neutral). The dataset was built in a three-step process of sentence normalization, expansion, and pairing, and was annotated through crowdsourcing on the CrowdFlower platform. The paper reviews existing datasets and explains the rationale behind SICK's design, highlighting its suitability for evaluating CDSMs. The dataset was used in SemEval-2014 Task 1 and is freely available for research purposes.
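
The two annotation layers described above can be pictured as a small record type. The following is only a minimal sketch: the class and field names are hypothetical and do not reflect the format of the released files, and the relatedness range of 1.0 to 5.0 (stored as a float) is an assumption about how the 5-point scale is encoded.

```python
from dataclasses import dataclass
from enum import Enum


class Entailment(Enum):
    # The three entailment labels named in the summary.
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class SentencePair:
    # Hypothetical field names, chosen for illustration only.
    sentence_a: str
    sentence_b: str
    relatedness: float  # assumed to lie on a 1.0 to 5.0 scale
    entailment: Entailment

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 5-point range.
        if not 1.0 <= self.relatedness <= 5.0:
            raise ValueError(f"relatedness out of range: {self.relatedness}")


def label_counts(pairs):
    """Count how many pairs carry each entailment label."""
    counts = {label: 0 for label in Entailment}
    for pair in pairs:
        counts[pair.entailment] += 1
    return counts


if __name__ == "__main__":
    # Invented example pairs and scores, not taken from the dataset.
    pairs = [
        SentencePair("A man is playing a guitar",
                     "A man is playing an instrument",
                     4.5, Entailment.ENTAILMENT),
        SentencePair("A dog is sleeping",
                     "A dog is running",
                     3.0, Entailment.CONTRADICTION),
    ]
    for label, count in label_counts(pairs).items():
        print(label.value, count)
```

Keeping the relatedness score and the entailment label on the same record reflects the fact that every pair carries both annotations, so a single pass over the data can serve either task.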