The paper introduces a new benchmark, WinoBias, designed to evaluate gender bias in coreference resolution systems. The benchmark consists of Winograd-schema style sentences with entities referred to by their occupations, focusing on gender stereotypes. The authors demonstrate that rule-based, feature-rich, and neural coreference systems all exhibit significant bias, linking gendered pronouns to pro-stereotypical entities more accurately than to anti-stereotypical ones, with an average F1 score difference of 21.1. They propose a data-augmentation approach that, combined with existing word-embedding debiasing techniques, removes this bias without significantly affecting performance on existing coreference datasets. The dataset and code are available at http://winobias.org.
The paper also analyzes the training corpus, OntoNotes 5.0, and finds that female entities are significantly underrepresented in it, which contributes to the bias. The authors generate an auxiliary training dataset by swapping male and female entities; combined with debiasing techniques, this effectively eliminates bias on WinoBias while maintaining coreference accuracy.
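The core of the gender-swapping augmentation can be sketched as a rule-based substitution over a dictionary of gendered terms. The word list and function below are illustrative assumptions, not the authors' actual implementation (which also handles named entities and a far larger vocabulary):

```python
import re

# Small illustrative map of gendered terms; a real augmentation pipeline
# uses a much larger curated dictionary.
SWAP = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",
    "his": "her",  # note: "her" is ambiguous (possessive vs. object) without POS info
    "himself": "herself", "herself": "himself",
    "man": "woman", "woman": "man",
}

# One regex over all keys; \b prevents matches inside longer words
# (e.g. "man" inside "woman").
_PATTERN = re.compile(r"\b(" + "|".join(SWAP) + r")\b", re.IGNORECASE)


def gender_swap(sentence: str) -> str:
    """Replace each gendered token with its counterpart, preserving
    sentence-initial capitalization."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAP[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return _PATTERN.sub(repl, sentence)
```

For example, `gender_swap("He paid him.")` yields `"She paid her."`. As the comment notes, a dictionary lookup alone cannot disambiguate possessive "her" from object "her", which is one reason a production version would rely on part-of-speech information.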