TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts

2024 | Hyunwook Lee & Sungahn Ko*
TESTAM is a time-enhanced spatio-temporal attention model with a mixture-of-experts (MoE) architecture designed for accurate traffic forecasting. The model combines three experts: one for temporal modeling, one for spatio-temporal modeling with a static graph, and one for spatio-temporal dependency modeling with a dynamic graph. By routing each input to the appropriate expert, TESTAM captures traffic patterns under various conditions, including spatially isolated roads, highly interconnected roads, and recurring and non-recurring events. To train the gating network effectively, the model reformulates the gating problem as a classification task with pseudo labels.

Experimental results on three public traffic datasets (METR-LA, PEMS-BAY, and EXPY-TKY) show that TESTAM outperforms 13 existing methods in accuracy, which the authors attribute to its better modeling of recurring and non-recurring traffic patterns. The stated contributions are a novel MoE model for traffic forecasting with diverse graph architectures, the reformulation of gating as a classification task to better contextualize traffic situations, and superior performance compared to existing methods. TESTAM achieves state-of-the-art results in traffic forecasting, particularly on large-scale graph structures, and the authors provide qualitative results that visualize when and where specific graph structures are used. The model is also efficient in computational cost during both training and inference.
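The idea of treating gating as classification with pseudo labels can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that the pseudo label for each sample is the index of the expert with the lowest absolute error, and that the gate is trained with plain cross-entropy against those labels. The function names (`pseudo_labels`, `gating_cross_entropy`, `route`) and the toy numbers are hypothetical.

```python
import numpy as np

def pseudo_labels(expert_preds, target):
    """Assumed labeling rule: pick the expert with the smallest absolute error.

    expert_preds: array of shape (E, N), one row of predictions per expert.
    target:       array of shape (N,), the ground-truth values.
    Returns an integer array of shape (N,) with the best expert per sample.
    """
    errors = np.abs(expert_preds - target[None, :])  # (E, N)
    return errors.argmin(axis=0)

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def gating_cross_entropy(gate_logits, labels):
    """Mean cross-entropy between gate scores (N, E) and pseudo labels (N,)."""
    probs = softmax(gate_logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def route(gate_logits, expert_preds):
    """Hard routing: return the prediction of the top-scoring expert per sample."""
    choice = gate_logits.argmax(axis=-1)  # (N,)
    return expert_preds[choice, np.arange(expert_preds.shape[1])]

# Toy example: 3 experts (temporal, static graph, dynamic graph), 3 samples.
expert_preds = np.array([
    [59.5, 30.0, 45.0],   # expert 0
    [57.0, 41.5, 20.0],   # expert 1
    [50.0, 38.0, 44.0],   # expert 2
])
target = np.array([60.0, 41.0, 44.0])

labels = pseudo_labels(expert_preds, target)
uninformed_gate = np.zeros((3, 3))    # gate with no preference
informed_gate = 5.0 * np.eye(3)       # gate that agrees with the labels

loss_uninformed = gating_cross_entropy(uninformed_gate, labels)
loss_informed = gating_cross_entropy(informed_gate, labels)
routed = route(informed_gate, expert_preds)
```

In this toy setup the gate that agrees with the pseudo labels has a lower classification loss than the uninformed gate, and hard routing with it returns the best expert's prediction for every sample. A real model would produce the gate scores from the input traffic data and would likely use a more elaborate labeling rule than a per-sample argmin.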