CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning

23 May 2024 | Peiyuan Liu, Hang Guo, Tao Dai, Naiqi Li, Jigang Bao, Xudong Ren, Yong Jiang, Shu-tao Xia
The paper "CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning" addresses the challenge of aligning textual and temporal input tokens in large language models (LLMs) for multivariate time series forecasting (MTSF). The authors propose a novel framework called Cross-Modal LLM Fine-Tuning (CALF) to reduce the distribution discrepancy between textual and temporal data. CALF consists of two branches: a temporal target branch for processing time series information and a textual source branch for extracting and adapting information from pre-trained LLMs using aligned textual input.
To bridge the modality gap, CALF employs three techniques: the Cross-Modal Match Module, Feature Regularization Loss, and Output Consistency Loss. These techniques ensure efficient alignment of input distributions, better gradient guidance, and consistent semantic context, respectively. Extensive experiments on various real-world datasets demonstrate that CALF achieves state-of-the-art performance in both long-term and short-term forecasting tasks, with favorable few/zero-shot generalization capabilities and low computational complexity. The paper also includes a probabilistic analysis to provide a theoretical foundation for the cross-modal fine-tuning techniques and discusses the limitations and future directions of the proposed method.
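The combination of a supervised forecasting loss with the two alignment terms described above can be sketched as a single training objective. This is a minimal illustrative sketch, not the paper's implementation: the function names, loss weights (`lam_feat`, `lam_out`), and the use of plain mean squared error for every term are assumptions made for clarity.

```python
# Hedged sketch of a CALF-style combined objective: supervised forecast
# loss plus feature regularization and output consistency terms.
# Exact distance metrics and weights are assumptions, not from the paper.

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def calf_style_loss(forecast, target,
                    temporal_feats, textual_feats,
                    temporal_out, textual_out,
                    lam_feat=0.1, lam_out=0.1):
    """Combine three terms:
    - supervised: temporal-branch forecast vs. ground truth,
    - feature regularization: layer-wise alignment of intermediate
      features between the temporal and textual branches,
    - output consistency: alignment of the two branches' final outputs.
    lam_feat / lam_out are illustrative weights.
    """
    supervised = mse(forecast, target)
    feat_reg = sum(mse(t, s) for t, s in
                   zip(temporal_feats, textual_feats)) / len(temporal_feats)
    out_cons = mse(temporal_out, textual_out)
    return supervised + lam_feat * feat_reg + lam_out * out_cons
```

In an actual training loop, `temporal_feats` and `textual_feats` would be the per-layer hidden states of the two branches, and only the temporal target branch would be used at inference time.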