TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods

2024 | Xiangfei Qiu, Jilin Hu, Lekui Zhou, Xingjian Wu, Junyang Du, Buang Zhang, Chenjuan Guo, Aoying Zhou, Christian S. Jensen, Zhenli Sheng, Bin Yang
The paper introduces TFB (Time Series Forecasting Benchmark), an automated benchmark designed to address the limitations of existing time series forecasting (TSF) evaluation frameworks. TFB aims to provide a comprehensive and fair evaluation of TSF methods by addressing three key issues: insufficient data domain coverage, stereotype bias against traditional methods, and inconsistent and inflexible evaluation pipelines.

1. **Insufficient Data Domain Coverage**: TFB includes datasets from 10 different domains (traffic, electricity, energy, environment, nature, economics, stock markets, banking, health, and web) to ensure a wide range of characteristics and settings.
2. **Stereotype Bias Against Traditional Methods**: TFB covers a diverse range of methods, including statistical learning, machine learning, and deep learning, to remove biases and enable more comprehensive evaluations.
3. **Inconsistent and Inflexible Pipelines**: TFB features a flexible and scalable pipeline that standardizes evaluation procedures, ensuring fair and accurate comparisons.

TFB evaluates 21 univariate TSF methods on 8,068 univariate time series and 14 multivariate TSF methods on 25 datasets. The results highlight the strengths and limitations of different methods, providing insights for method selection based on specific datasets and settings. Key findings include:

- Statistical methods like VAR and LinearRegression outperform recent state-of-the-art (SOTA) methods on some datasets.
- Linear-based methods perform well on datasets with increasing trends or significant shifts.
- Transformer-based methods outperform linear-based methods on datasets with marked seasonality, nonlinear patterns, and strong internal similarities.
- Methods that consider dependencies between channels can significantly enhance performance on datasets with strong correlations.

Overall, TFB provides a robust and user-friendly benchmark for researchers to evaluate and compare TSF methods, promoting the development of new and improved forecasting techniques.
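
To make the idea of a standardized evaluation pipeline concrete, here is a minimal sketch in plain Python. It is not TFB's code or API: the function names (`evaluate`, `naive`, `seasonal_naive`, `linear_trend`), the synthetic series, the 12-step horizon, and the choice of MAE are all assumptions made for illustration. The only point it demonstrates is that every method is scored on the same chronological split with the same metric, so differences in the scores come from the methods rather than from the evaluation setup.

```python
import math
from typing import Callable, Dict, List

# A forecaster maps (history, horizon) to a list of `horizon` predictions.
Forecaster = Callable[[List[float], int], List[float]]


def naive(history: List[float], horizon: int) -> List[float]:
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive(history: List[float], horizon: int, period: int = 12) -> List[float]:
    """Repeat the last full season (period is an assumed, fixed value)."""
    last_season = history[-period:]
    return [last_season[i % period] for i in range(horizon)]


def linear_trend(history: List[float], horizon: int) -> List[float]:
    """Fit an ordinary least-squares line over the time index and extrapolate."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def mae(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def evaluate(series: List[float], methods: Dict[str, Forecaster], horizon: int) -> Dict[str, float]:
    """Score every method on the same hold-out split with the same metric."""
    train, test = series[:-horizon], series[-horizon:]
    return {name: mae(test, forecast(train, horizon)) for name, forecast in methods.items()}


# Two toy series (hypothetical data): a pure upward trend and a pure seasonal pattern.
trend_series = [0.5 * t for t in range(120)]
seasonal_series = [10 * math.sin(2 * math.pi * t / 12) for t in range(120)]

methods: Dict[str, Forecaster] = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "linear_trend": linear_trend,
}

trend_scores = evaluate(trend_series, methods, horizon=12)
seasonal_scores = evaluate(seasonal_series, methods, horizon=12)

for label, scores in [("trend", trend_scores), ("seasonal", seasonal_scores)]:
    best = min(scores, key=scores.get)
    print(label, "-> best method:", best)
```

On these toy inputs the best-scoring method changes with the characteristics of the series, which is the kind of dataset-dependent comparison the summary describes. The baselines here are deliberately simple stand-ins and say nothing about how the methods evaluated in the paper actually rank.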