A Structured Self-Attentive Sentence Embedding

9 Mar 2017 | Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou & Yoshua Bengio
This paper proposes a new model for extracting an interpretable sentence embedding using self-attention. Instead of a single vector, the model represents the embedding as a 2D matrix, where each row attends to a different part of the sentence. A self-attention mechanism and a special regularization term are introduced, and the embedding comes with an easy way to visualize which parts of the sentence are encoded into it.

The model runs a bidirectional LSTM over the sentence and applies self-attention to the hidden states to produce the embedding. Because the attention is performed in multiple hops, the model can focus on different aspects of the sentence, yielding multiple vector representations that together form the embedding matrix.
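Concretely, for a sentence whose biLSTM hidden states form an n-by-2u matrix H, the paper computes the annotation matrix as A = softmax(W_s2 tanh(W_s1 H^T)) and the embedding as M = AH. Below is a minimal PyTorch sketch of this attention layer; the class name is ours, the defaults d_a = 350 and r = 30 follow the hyperparameters reported in the paper, and the biLSTM producing H is assumed to exist elsewhere:

```python
import torch
import torch.nn as nn

class StructuredSelfAttention(nn.Module):
    """r-hop self-attention: A = softmax(W_s2 tanh(W_s1 H^T)), M = A H."""

    def __init__(self, hidden_dim: int, d_a: int = 350, r: int = 30):
        super().__init__()
        self.w_s1 = nn.Linear(hidden_dim, d_a, bias=False)  # W_s1 (d_a x 2u)
        self.w_s2 = nn.Linear(d_a, r, bias=False)           # W_s2 (r x d_a)

    def forward(self, H: torch.Tensor):
        # H: (batch, n, 2u) biLSTM hidden states for an n-token sentence.
        scores = self.w_s2(torch.tanh(self.w_s1(H)))  # (batch, n, r)
        A = torch.softmax(scores, dim=1)              # softmax over the n tokens
        A = A.transpose(1, 2)                         # (batch, r, n)
        M = torch.bmm(A, H)                           # (batch, r, 2u) embedding matrix
        return M, A
```

Each of the r rows of M is a weighted sum of the hidden states, so the matrix captures r different aspects of the sentence.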
To keep the hops from attending to the same words, a penalization term encourages diversity in the attention weights across hops, reducing redundancy in the embedding matrix and improving interpretability. The term is also computationally cheap: compared with a KL-divergence penalty, it requires only about one third of the computation (a short sketch of this term appears at the end of this summary).

The model is evaluated on three tasks: author profiling (the Age dataset), sentiment classification (the Yelp dataset), and textual entailment (the SNLI corpus), and it shows significant accuracy gains over other sentence embedding methods on all three. Visualizing the attention weights highlights which parts of the sentence contribute to the embedding, and the structure naturally handles variable-length sentences and scales to longer texts. Together, these properties make the model a valuable tool for understanding and analyzing text data.
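The penalization term itself is P = ||A A^T − I||_F^2, where A is the annotation matrix and I the identity. The identity target makes each row's attention distribution concentrate (diagonal entries near 1) while forcing different hops apart (off-diagonal entries near 0). A minimal sketch of how it could be added to the training loss; the function name and the coefficient in the usage comment are illustrative, not from the paper's code:

```python
import torch

def attention_penalty(A: torch.Tensor) -> torch.Tensor:
    """P = ||A A^T - I||_F^2, averaged over the batch.

    A: (batch, r, n) annotation matrix; the penalty pushes the
    r hops to attend to different parts of the sentence.
    """
    AAT = torch.bmm(A, A.transpose(1, 2))                     # (batch, r, r)
    I = torch.eye(A.size(1), device=A.device).expand_as(AAT)  # batched identity
    return ((AAT - I) ** 2).sum(dim=(1, 2)).mean()

# During training, the penalty is scaled by a tuned coefficient and
# added to the task loss, e.g.:
# loss = task_loss + coef * attention_penalty(A)
```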