Chronos: Learning the Language of Time Series


2 May 2024 | Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang
**Abstract:** Chronos is a framework for pretraining probabilistic time series models on top of transformer-based language model architectures. It tokenizes time series values into a fixed vocabulary through scaling and quantization, and trains the models on the resulting token sequences with the cross-entropy loss. Chronos models, based on the T5 family (ranging from 20M to 710M parameters), were pretrained on a large collection of public datasets together with synthetic data generated via Gaussian processes. In a comprehensive benchmark of 42 datasets, Chronos models outperform other methods on datasets seen during training and achieve competitive or superior zero-shot performance on new datasets. This demonstrates that Chronos models can leverage diverse time series data to improve zero-shot accuracy on unseen forecasting tasks, making them a viable tool for simplifying forecasting pipelines.

**Introduction:** Time series forecasting is crucial across domains, from retail to healthcare. Traditional statistical methods such as ARIMA and ETS have been complemented by deep learning techniques as large, diverse datasets have become available. However, deep forecasters typically operate in a standard regime: training and predicting on the same dataset. Chronos addresses this by adapting language models to time series forecasting. It tokenizes time series values into discrete bins through simple scaling and quantization, allowing off-the-shelf language models to be trained on this "language of time series." The approach is effective and efficient, showing promise for addressing a broad range of time series problems with minimal modifications.
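The scaling-and-quantization idea can be sketched as follows. This is a minimal illustration, not the paper's exact configuration: the bin count, quantization range, and the use of mean absolute scaling with uniform bins are assumptions made for the example.

```python
import numpy as np

def tokenize(series, n_bins=100, low=-5.0, high=5.0):
    """Map a real-valued series to discrete tokens via mean scaling
    and uniform quantization (illustrative parameters)."""
    scale = float(np.mean(np.abs(series))) or 1.0  # avoid division by zero
    scaled = series / scale
    # Uniform bin edges over a fixed range; out-of-range values are clipped.
    edges = np.linspace(low, high, n_bins + 1)
    tokens = np.clip(np.digitize(scaled, edges) - 1, 0, n_bins - 1)
    return tokens, scale

def detokenize(tokens, scale, n_bins=100, low=-5.0, high=5.0):
    """Invert tokenization by mapping each token to its bin center."""
    centers = low + (np.arange(n_bins) + 0.5) * (high - low) / n_bins
    return centers[tokens] * scale

series = np.array([1.0, 2.0, 3.0, 4.0])
tokens, s = tokenize(series)
approx = detokenize(tokens, s)  # recovers the series up to quantization error
```

Because the vocabulary is fixed, any pretrained language model stack can consume these tokens without architectural changes, which is the point the section above makes.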
**Background and Related Work:** The paper reviews time series forecasting and language modeling, covering classical and deep learning methods, large language models (LLMs), and LLM-based forecasters. It also discusses zero-shot forecasting and other time series tasks, highlighting the limitations of existing methods and the potential of Chronos.

**Chronos: A Language Modeling Framework for Time Series:** Chronos tokenizes time series values into a fixed vocabulary through scaling and quantization, then trains language models on the tokenized sequences with the cross-entropy loss. The framework is deliberately minimal, requiring no time-series-specific modifications to the model architecture or training procedure. Chronos models are probabilistic: they generate multiple realizations of the future by autoregressively sampling from the predicted token distribution.

**Data Augmentation:** To address the scarcity of public time series datasets, Chronos employs data augmentation techniques, namely TSMixup and KernelSynth.
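The autoregressive sampling step can be sketched as below. The next-token model here is a hypothetical stand-in for a trained Chronos model; only the sampling loop and the sample-based quantile summary reflect the mechanism described above.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BINS = 100  # illustrative vocabulary size

def dummy_next_token_probs(context):
    """Stand-in for a trained model: returns a categorical distribution
    over the token vocabulary, biased toward persisting the last token."""
    probs = np.ones(N_BINS)
    probs[context[-1]] += N_BINS
    return probs / probs.sum()

def sample_forecasts(context, horizon=12, n_samples=20):
    """Draw multiple future trajectories by autoregressive sampling,
    yielding an empirical predictive distribution per step."""
    paths = []
    for _ in range(n_samples):
        seq = list(context)
        for _ in range(horizon):
            p = dummy_next_token_probs(seq)
            seq.append(int(rng.choice(N_BINS, p=p)))
        paths.append(seq[len(context):])
    return np.array(paths)  # shape (n_samples, horizon), in token space

paths = sample_forecasts([50, 51, 52])
median = np.median(paths, axis=0)  # point forecast from sample quantiles
```

In the real pipeline the sampled tokens would be mapped back to real values through the inverse of the quantization step before computing quantiles.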
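The two augmentation ideas can be sketched as follows: a convex combination of existing series (in the spirit of TSMixup) and synthetic series sampled from Gaussian processes with combined kernels (in the spirit of KernelSynth). Kernel choices, Dirichlet weighting, and all parameters are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(42)

def tsmixup(series_list, k=3, alpha=1.5):
    """Mix k randomly chosen series of equal length with Dirichlet
    weights (sketch of the TSMixup idea; parameters illustrative)."""
    idx = rng.choice(len(series_list), size=k, replace=False)
    weights = rng.dirichlet(alpha * np.ones(k))
    return sum(w * series_list[i] for w, i in zip(weights, idx))

def rbf_kernel(x, length_scale=10.0):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def periodic_kernel(x, period=24.0, length_scale=1.0):
    d = np.abs(x[:, None] - x[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale**2)

def kernelsynth_like(length=128):
    """Sample a synthetic series from a GP whose kernel combines base
    kernels (toy version of the KernelSynth idea)."""
    x = np.arange(length, dtype=float)
    k = rbf_kernel(x) + periodic_kernel(x)  # combine base kernels
    k += 1e-6 * np.eye(length)              # jitter for numerical stability
    return rng.multivariate_normal(np.zeros(length), k)

mixed = tsmixup([rng.standard_normal(64) for _ in range(5)])
synthetic = kernelsynth_like()
```

Both techniques enlarge the pretraining corpus: TSMixup interpolates between real series, while the GP sampler manufactures series with trend- and seasonality-like structure from scratch.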