Language Models Still Struggle to Zero-shot Reason about Time Series

17 Apr 2024 | Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff
Time series data are crucial for decision-making in various fields, but it remains unclear whether language models can reason about these data beyond simple forecasting. This study introduces a novel evaluation framework for time series reasoning, including formal tasks and a dataset of multi-scale time series paired with text captions across ten domains. The framework assesses three forms of reasoning: etiological reasoning (identifying the likely causes of a time series), question answering (answering factual questions about time series), and context-aided forecasting (improving forecasts using relevant textual context). Despite the advanced capabilities of current language models, they perform marginally above random on etiological and question answering tasks, and show modest success in using context to enhance forecasting. These findings highlight the need for further research to develop models that can deeply reason about time series data. The datasets and code used in this study are made public to support future research in this direction.
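The etiological and question-answering tasks are scored as multiple-choice problems, where the relevant comparison is model accuracy against the random-guessing baseline. The snippet below is a minimal sketch of that scoring logic, not the paper's released code; the four-choice setup and the example labels are assumptions for illustration.

```python
def score_multiple_choice(predictions, answers, n_choices=4):
    """Compare multiple-choice accuracy against the random-guessing baseline.

    predictions, answers: parallel lists of chosen/correct option labels.
    n_choices: options per question (assumed to be 4 here for illustration).
    Returns (accuracy, random_baseline).
    """
    if len(predictions) != len(answers) or not answers:
        raise ValueError("predictions and answers must be equal-length, non-empty lists")
    correct = sum(p == a for p, a in zip(predictions, answers))
    accuracy = correct / len(answers)
    random_baseline = 1 / n_choices
    return accuracy, random_baseline


# Hypothetical example: a model that is right on 2 of 3 etiological questions.
acc, baseline = score_multiple_choice(
    predictions=["storm", "sale", "storm"],
    answers=["storm", "storm", "storm"],
)
print(f"accuracy={acc:.3f} vs random baseline={baseline:.3f}")
```

A result "marginally above random," as the paper reports, corresponds to `accuracy` only slightly exceeding `random_baseline` over the full benchmark.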