ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs


25 Jun 2018 | Wenpeng Yin, Hinrich Schütze, Bing Xiang, Bowen Zhou
This paper introduces ABCNN, an attention-based convolutional neural network for modeling sentence pairs. ABCNN is designed to address the challenge of modeling pairs of sentences in tasks such as answer selection (AS), paraphrase identification (PI), and textual entailment (TE). Its key contributions are: (1) applicability to a wide range of sentence pair modeling tasks; (2) the integration of mutual influence between sentences into CNNs through three attention schemes; and (3) state-of-the-art performance on AS, PI, and TE tasks.

ABCNN builds upon the BCNN (Basic Bi-CNN) architecture, which uses two weight-sharing CNNs to process each sentence and a final layer to solve the sentence pair task. ABCNN adds attention mechanisms that let the model focus on the relevant parts of each sentence, improving its representation by taking the counterpart sentence into account. Three variants are proposed: ABCNN-1 applies attention to the input of a convolution layer, ABCNN-2 applies attention to the output of a convolution layer (before pooling), and ABCNN-3 combines both.

Experiments on AS, PI, and TE show that ABCNN outperforms previous methods, with ABCNN-2 and ABCNN-3 achieving the best results; ABCNN-3, which combines attention at both the input and output levels, achieves the highest performance. The results demonstrate that attention mechanisms significantly improve the model's ability to capture relationships between sentences, leading to better performance on all three tasks. ABCNN also remains effective when combined with linguistic features, indicating its general effectiveness in sentence pair modeling.
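To make the attention idea concrete, the sketch below shows one plausible reading of the ABCNN-2-style scheme: an attention matrix `A` scores every pair of word positions across the two sentences, and each sentence unit is then reweighted by the total attention it receives before pooling. The `1/(1 + Euclidean distance)` match score and the function names here are assumptions for illustration, not the paper's definitive implementation.

```python
import numpy as np

def match_score(x, y):
    # Assumed match score: 1 / (1 + Euclidean distance) between
    # two per-position feature vectors (column vectors of a feature map).
    return 1.0 / (1.0 + np.linalg.norm(x - y))

def attention_matrix(F0, F1):
    # F0: (d, s0) and F1: (d, s1) feature maps, one column per word.
    # A[i, j] scores how well position i of sentence 0 matches
    # position j of sentence 1.
    s0, s1 = F0.shape[1], F1.shape[1]
    A = np.empty((s0, s1))
    for i in range(s0):
        for j in range(s1):
            A[i, j] = match_score(F0[:, i], F1[:, j])
    return A

def attention_pool(F0, F1):
    # ABCNN-2-style pooling sketch: each column is weighted by the
    # summed attention it receives from the other sentence, then the
    # columns are averaged into a single sentence vector.
    A = attention_matrix(F0, F1)
    w0 = A.sum(axis=1)  # attention mass per column of F0
    w1 = A.sum(axis=0)  # attention mass per column of F1
    v0 = (F0 * w0).sum(axis=1) / w0.sum()
    v1 = (F1 * w1).sum(axis=1) / w1.sum()
    return v0, v1
```

The two pooled vectors can then be fed to a final comparison layer (e.g. cosine similarity or a logistic regression), which is the role the paper's final layer plays in solving the sentence pair task.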