2024-10-26 | Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen
Are Language Models Actually Useful for Time Series Forecasting?
This paper investigates whether large language models (LLMs) are beneficial for time series forecasting. Through a series of ablation studies on three recent LLM-based forecasting methods, the authors find that removing the LLM component or replacing it with a basic attention layer does not degrade forecasting performance—in most cases, results even improve. Pretrained LLMs do not outperform models trained from scratch, do not represent sequential dependencies in time series, and do not assist in few-shot settings. The authors also explore time series encoders and find that patching and attention structures perform similarly to LLM-based forecasters.
The study evaluates three state-of-the-art methods for time series forecasting: OneFitsAll, Time-LLM, and CALF. The results show that the ablation methods, which remove or replace the LLM component, perform as well as or better than the original LLM-based methods. These simpler methods also significantly reduce training and inference time. Additionally, the study finds that LLMs do not transfer sequence modeling abilities from text to time series and do not help in few-shot settings.
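The ablation idea described above can be sketched in code: keep a patch embedding and a forecasting head, but swap the LLM backbone for either a single self-attention layer or nothing at all. The sketch below is illustrative only — class and parameter names are hypothetical, not taken from the paper's released code.

```python
import torch
import torch.nn as nn

class PatchedForecaster(nn.Module):
    """Hypothetical sketch of the ablation setup: a patch embedding feeds an
    interchangeable backbone (a basic attention layer, or nothing, in place of
    a pretrained LLM), followed by a linear forecasting head."""

    def __init__(self, patch_len=16, d_model=64, horizon=96, backbone="attention"):
        super().__init__()
        self.patch_len = patch_len
        # "Patching": embed each length-patch_len window of the series
        self.embed = nn.Linear(patch_len, d_model)
        if backbone == "attention":
            # Ablation: replace the LLM with one basic self-attention layer
            self.backbone = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        else:
            # Ablation: remove the LLM entirely (identity pass-through)
            self.backbone = None
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):
        # x: (batch, seq_len), with seq_len divisible by patch_len
        b, t = x.shape
        patches = x.view(b, t // self.patch_len, self.patch_len)
        z = self.embed(patches)               # (batch, n_patches, d_model)
        if self.backbone is not None:
            z, _ = self.backbone(z, z, z)     # single attention layer
        return self.head(z.mean(dim=1))       # (batch, horizon)

model = PatchedForecaster(backbone="attention")
x = torch.randn(8, 96)   # batch of 8 series, 96-step input window
y_hat = model(x)         # forecast shape: (8, 96)
```

Because the backbone is the only moving part, the LLM, attention, and no-backbone variants can be trained under identical conditions, which is what makes the paper's like-for-like runtime and accuracy comparisons possible.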
The paper concludes that while LLMs are popular in time series forecasting, they do not provide significant benefits for the task. Instead, simpler models with patching and attention structures can achieve similar performance. The findings suggest that time series methods using large language models are better suited for multimodal applications that require textual reasoning. The study highlights the need for further research into the potential of LLMs in time series forecasting and the importance of evaluating their effectiveness in real-world applications.