1 Feb 2024 | Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu
The paper "On the Affinity, Rationality, and Diversity of Hierarchical Topic Modeling" addresses the challenges of producing high-quality topic hierarchies in hierarchical topic modeling. It proposes a novel neural model called Transport Plan and Context-aware Hierarchical Topic Model (TraCo) to improve the affinity, rationality, and diversity of topic hierarchies. The key contributions of the paper are:
1. **Transport Plan Dependency (TPD)**: This method models dependencies between topics as optimal transport plans, ensuring sparsity and balance in the dependencies. It regularizes the building of topic hierarchies, improving the affinity between child and parent topics and the diversity of sibling topics.
2. **Context-aware Disentangled Decoder (CDD)**: This decoder decodes input documents using topics at each level individually, distributing different semantic granularities to topics at different levels. It enhances the rationality of hierarchies by ensuring that topics at each level cover different semantics from their contextual levels.
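To make the transport-plan idea behind TPD more concrete, here is a minimal sketch, not the paper's implementation. It assumes that topics are represented by embedding vectors, that the transport cost is the squared Euclidean distance between child and parent embeddings, and that both marginals are uniform; none of these details are given in this summary. The names `sinkhorn_plan`, `child_emb`, and `parent_emb` are hypothetical. The entropic-regularized plan is computed with standard Sinkhorn iterations.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, epsilon=0.2, n_iters=1000):
    """Entropic-regularized optimal transport plan via Sinkhorn iterations.

    cost: (n_child, n_parent) transport cost matrix
    a, b: marginal weights over child and parent topics (each sums to 1)
    """
    K = np.exp(-cost / epsilon)      # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match the parent marginal
        u = a / (K @ v)              # rescale to match the child marginal
    return u[:, None] * K * v[None, :]

# Hypothetical topic embeddings: 6 child topics, 3 parent topics, dimension 4.
rng = np.random.default_rng(0)
child_emb = rng.normal(size=(6, 4))
parent_emb = rng.normal(size=(3, 4))

# Assumed cost: squared Euclidean distance, scaled to [0, 1] for stability.
cost = ((child_emb[:, None, :] - parent_emb[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(6, 1 / 6)                # uniform weight on each child topic
b = np.full(3, 1 / 3)                # uniform weight on each parent topic
plan = sinkhorn_plan(cost, a, b)

# Row-normalizing the plan gives each child's dependency on the parents.
dependency = plan / plan.sum(axis=1, keepdims=True)
```

In this sketch the marginal constraints play the role of the balance described above: every parent must receive the same total mass, so no parent is left without children. A smaller `epsilon` concentrates each child's mass on fewer parents, which is one way to read the sparsity property.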
The paper evaluates TraCo on several benchmark datasets and demonstrates its effectiveness through extensive experiments. TraCo outperforms state-of-the-art baselines in terms of topic coherence, diversity, and rationality, showing better performance on downstream tasks such as text classification and clustering. The results highlight the improved quality of topic hierarchies generated by TraCo, making it a significant advancement in hierarchical topic modeling.
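As a rough illustration of what a diversity score can measure, the sketch below computes the fraction of unique words among the top words of a set of sibling topics. This is a commonly used definition of topic diversity; the summary does not state which exact metric the paper uses, so treat it as an assumption. The function name `topic_diversity` and the example words are hypothetical.

```python
def topic_diversity(topics):
    """Fraction of unique words among the top words of all given topics.

    topics: list of topics, each a list of its top words.
    Returns a value in (0, 1]; 1.0 means no word is shared between topics.
    """
    all_words = [word for topic in topics for word in topic]
    return len(set(all_words)) / len(all_words)

# Hypothetical sibling topics; the first two share the word "game".
siblings = [
    ["game", "team", "season"],
    ["game", "player", "coach"],
    ["market", "stock", "trade"],
]
score = topic_diversity(siblings)  # 8 unique words out of 9
```

A score close to 1 indicates that sibling topics cover distinct vocabulary, while repeated top words pull the score down.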