On Faithfulness and Factuality in Abstractive Summarization

2 May 2020 | Joshua Maynez*, Shashi Narayan*, Bernd Bohnet, Ryan McDonald
This paper investigates the limitations of neural text generation models in abstractive document summarization, particularly their tendency to hallucinate content that is unfaithful to the input document. Through a large-scale human evaluation, the authors found that all model-generated summaries contain substantial amounts of hallucinated content. However, they also observed that pre-trained models perform better in generating faithful and factual summaries, both in terms of raw metrics (ROUGE) and human judgments. The study further shows that textual entailment measures correlate better with faithfulness than standard metrics, suggesting potential for improved automatic evaluation metrics and training criteria. The paper concludes by discussing the challenges and future directions in achieving more faithful and factual abstractive summaries.
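To give a concrete sense of the entailment-based evaluation the paper argues for, the sketch below scores a summary against its source document with an off-the-shelf natural language inference model: a summary entailed by the document is treated as more faithful. This is a minimal illustration under assumptions, not the authors' experimental setup; the model name (roberta-large-mnli), its [contradiction, neutral, entailment] label order, and the example texts are placeholders.

```python
# Minimal sketch: score faithfulness as the probability that the source
# document entails the summary, using a generic MNLI-finetuned model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-large-mnli"  # assumption: any MNLI-style NLI model could be substituted

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def entailment_score(document: str, summary: str) -> float:
    """Probability that the document (premise) entails the summary (hypothesis)."""
    inputs = tokenizer(document, summary, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    # For roberta-large-mnli the label order is [contradiction, neutral, entailment].
    return probs[2].item()

# Illustrative example (not from the paper's data):
doc = "The committee approved the budget on Tuesday after a short debate."
faithful = "The committee approved the budget."
hallucinated = "The committee rejected the budget after a lengthy debate."
print(entailment_score(doc, faithful))      # expected to be high
print(entailment_score(doc, hallucinated))  # expected to be low
```

In this view, a higher entailment probability suggests the summary is supported by the document, which is the property the paper finds better aligned with human faithfulness judgments than ROUGE-style overlap metrics.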