21 Jun 2018 | Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
This paper introduces Bi-Directional Attention Flow (BiDAF), a multi-stage hierarchical model for machine comprehension (MC) that uses bi-directional attention to obtain a query-aware context representation without early summarization. The model consists of six layers: character embedding, word embedding, contextual embedding, attention flow, modeling, and output. The attention flow layer computes attention in both directions, context-to-query and query-to-context, and lets the attended vectors flow through to the subsequent modeling layer rather than summarizing them into fixed-size vectors, reducing information loss. BiDAF achieves state-of-the-art results on the Stanford Question Answering Dataset (SQuAD) and on the CNN/DailyMail cloze-style reading comprehension test, outperforming previous approaches. An ablation study demonstrates that each component contributes to performance, and error analysis and visualizations illustrate how the bi-directional attention mechanism helps the model answer complex questions by attending to the relevant parts of the context. The paper also situates the architecture within related work on machine comprehension and visual question answering.
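To make the attention flow layer concrete, here is a minimal sketch of its computation in PyTorch, following the description above: a trainable similarity vector scores every (context word, query word) pair, context-to-query and query-to-context attention are both derived from that similarity matrix, and the attended vectors are concatenated with the contextual embeddings rather than summarized away. The function name `attention_flow`, the unbatched tensors, and the parameter names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def attention_flow(H, U, w_sim):
    """Bi-directional attention flow (illustrative sketch, not the authors' code).

    H: (T, d) contextual embeddings of the context words.
    U: (J, d) contextual embeddings of the query words.
    w_sim: (3 * d,) trainable weight vector for the similarity function.
    Returns G: (T, 4 * d) query-aware representation of each context word.
    """
    T, d = H.shape
    J, _ = U.shape

    # Similarity matrix: S[t, j] = w_sim . [h_t ; u_j ; h_t * u_j]
    h = H.unsqueeze(1).expand(T, J, d)            # (T, J, d)
    u = U.unsqueeze(0).expand(T, J, d)            # (T, J, d)
    S = torch.cat([h, u, h * u], dim=-1) @ w_sim  # (T, J)

    # Context-to-query: each context word attends over all query words.
    a = F.softmax(S, dim=1)                       # (T, J), rows sum to 1
    U_tilde = a @ U                               # (T, d) attended query vectors

    # Query-to-context: weight context words by their best match to any query word.
    b = F.softmax(S.max(dim=1).values, dim=0)     # (T,)
    h_hat = (b.unsqueeze(0) @ H).expand(T, d)     # (T, d), tiled across time steps

    # Attended vectors flow through alongside H; no early summarization.
    return torch.cat([H, U_tilde, H * U_tilde, H * h_hat], dim=-1)
```

As a usage example under these assumptions, with T = 5 context words, J = 3 query words, and d = 8, `attention_flow(torch.randn(5, 8), torch.randn(3, 8), torch.randn(24))` returns a (5, 32) tensor, which the modeling layer would then process per context word.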