A Model-Theoretic Coreference Scoring Scheme

This paper introduces a model-theoretic scoring scheme for the MUC-6 coreference task. It improves on the original approach in three ways: it grounds the scoring in a model, it produces more intuitive recall and precision scores, and it avoids explicit computation of the transitive closure of coreference. The scheme restricts attention to identity links and compares the equivalence classes that those links define in the key and in the response. Scores are determined by the minimal perturbations required to align the response's equivalence classes with those of the key. The paper discusses a problematic case in which the response induces two equivalence classes; the scheme assigns it a recall score of 2/2 and a precision score of 2/3, which aligns with intuitive expectations.

The model-theoretic approach is also computationally effective, because recall and precision reduce to a simple counting scheme. To score recall, one forms the equivalence sets generated by the key and determines how the response partitions each set. The number of missing links for a set is one less than the number of parts it is split into, and the minimal number of correct links needed to form a set of n elements is n - 1. Recall is the ratio of the correct links found (the minimal number of links minus the missing links) to the minimal number of links, with both counts summed over all key sets. For precision, the roles of key and response are reversed: the equivalence sets generated by the response are partitioned by the key.

The paper also works through more complex examples, including cases with multiple key and response classes, and shows that the model-theoretic scores align with intuitive expectations. Computational considerations are briefly explored, highlighting the efficiency of the model-theoretic approach compared to the original syntactic scoring procedure.
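The counting scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names (`partition_count`, `link_score`, `coref_scores`) and the sample key and response are hypothetical, and the sketch assumes that mentions absent from every class on the other side count as singleton parts. The sample data is made up to reproduce the 2/2 and 2/3 figures quoted in the summary; it is not claimed to be the paper's own example.

```python
from fractions import Fraction


def partition_count(cls, other_classes):
    """Count the parts that `cls` is split into by `other_classes`.

    Assumption: `other_classes` are disjoint, and any element of `cls`
    that appears in none of them forms a singleton part.
    """
    parts = 0
    covered = set()
    for other in other_classes:
        overlap = cls & other
        if overlap:
            parts += 1
            covered |= overlap
    return parts + len(cls - covered)


def link_score(reference, system):
    """Correct links over minimal links, summed over the reference classes."""
    correct = 0
    minimal = 0
    for cls in reference:
        # A class of n elements needs n - 1 links; being split into p parts
        # means p - 1 links are missing, so n - p links are correct.
        correct += len(cls) - partition_count(cls, system)
        minimal += len(cls) - 1
    return Fraction(correct, minimal) if minimal else None


def coref_scores(key, response):
    """Return (recall, precision); precision swaps the roles of key and response."""
    return link_score(key, response), link_score(response, key)


# Hypothetical data: the response wrongly attaches C to the class {A, B}.
key = [frozenset("AB"), frozenset("DE")]
response = [frozenset("ABC"), frozenset("DE")]
recall, precision = coref_scores(key, response)
print(recall, precision)  # recall is 1 (that is, 2/2), precision is 2/3
```

Only set intersections and sizes are needed here, so no explicit transitive closure over individual links is ever computed. Using `Fraction` keeps the scores exact, which makes them easy to compare with link counts worked out by hand.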


Marc Vilain, John Burger, John Aberdeen, Dennis Connolly, Lynette Hirschman