Dense Passage Retrieval for Open-Domain Question Answering

EMNLP 2020 (November 16–20, 2020) | Vladimir Karpukhin, Barlas Oğuz, Sewon Min†, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen‡, Wen-tau Yih
This paper addresses the challenge of open-domain question answering (QA), focusing on improving the efficiency and effectiveness of passage retrieval. Traditional sparse vector space models such as TF-IDF and BM25 are widely used for this task, but the authors propose an approach based on dense representations instead. They develop the Dense Passage Retriever (DPR), which uses a dual-encoder framework to learn dense embeddings of questions and passages from a relatively small amount of training data. DPR outperforms a strong BM25 baseline by 9%–19% absolute in top-20 passage retrieval accuracy and substantially improves end-to-end QA performance on multiple benchmarks. The key contributions are demonstrating that dense retrieval can outperform sparse methods without additional pretraining, and showing that higher retrieval precision translates into better end-to-end QA accuracy.

The paper also covers the training setup, model architecture, and experimental results, highlighting the effectiveness of DPR in narrowing the search space for answer extraction and improving overall QA performance.
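At retrieval time, the dual-encoder framework scores a candidate passage by the dot product between the question embedding and the passage embedding, then returns the top-k passages. The sketch below illustrates that scoring step only; the one-hot toy vectors and the `top_k_passages` helper are stand-ins for the paper's BERT-based question and passage encoders, not the actual implementation.

```python
import numpy as np

def top_k_passages(q_emb, passage_embs, k=2):
    # DPR ranks passages by the dot product of dense embeddings:
    # sim(q, p) = E_Q(q) . E_P(p), where E_Q and E_P are the two encoders.
    scores = passage_embs @ q_emb
    # Highest-scoring passage indices first.
    return np.argsort(-scores)[:k].tolist()

# Toy one-hot "embeddings" stand in for the passage encoder's output;
# the question vector overlaps most with passage 2, then passage 3.
passage_embs = np.eye(5)
q_emb = np.array([0.1, 0.0, 0.9, 0.2, 0.0])
print(top_k_passages(q_emb, passage_embs))  # → [2, 3]
```

In the paper, this maximum-inner-product search is run over millions of precomputed passage embeddings using an index (e.g., FAISS) rather than a dense matrix product, but the scoring function is the same.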