August 12-16, 2012, Beijing, China | Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh
This paper presents a novel approach to efficiently searching and mining massive time series data under Dynamic Time Warping (DTW). The authors demonstrate that their method, called the UCR suite, significantly outperforms existing techniques, even on extremely large datasets. They show that their DTW-based similarity search can be faster than existing Euclidean distance-based search methods, which is counterintuitive given the common belief that DTW is computationally expensive. Experiments on the largest time series datasets ever attempted show that the UCR suite can handle datasets larger than the combined size of all time series datasets previously considered in the data mining literature. The authors also highlight the implications for real-time monitoring of data streams, where the speedup allows faster processing and the use of lower-powered devices. The paper discusses various optimizations, including early abandoning of distance calculations, lower bounding techniques, and the use of multiple cores; the UCR suite itself contributes four original optimizations that significantly improve the efficiency of time series similarity search. The authors also show that their methods apply to a wide range of time series data mining problems, including motif discovery and clustering, and that the UCR suite processes massive datasets in a fraction of the time required by existing methods, even for very long queries. The paper concludes that the UCR suite is a powerful tool for mining time series data at massive scale.
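Two of the optimization families named above, early abandoning and lower bounding, can be illustrated with a minimal Python sketch. This is not the UCR suite's code: the function names are hypothetical, the series are assumed to have equal length, normalization is omitted, and the lower bound shown is the standard LB_Keogh envelope bound under a Sakoe-Chiba band of half-width `r`. All distances are squared.

```python
import math

def early_abandon_sq_euclidean(q, c, best_so_far):
    """Squared Euclidean distance that stops as soon as the running
    sum exceeds best_so_far (returns math.inf in that case)."""
    total = 0.0
    for qi, ci in zip(q, c):
        total += (qi - ci) ** 2
        if total > best_so_far:
            return math.inf  # cannot beat the best match found so far
    return total

def lb_keogh_sq(q, c, r):
    """Squared LB_Keogh: a cheap lower bound on banded DTW(q, c),
    computed from the min/max envelope of q within +/- r positions."""
    n = len(q)
    total = 0.0
    for i, ci in enumerate(c):
        window = q[max(0, i - r): min(n, i + r + 1)]
        upper, lower = max(window), min(window)
        if ci > upper:
            total += (ci - upper) ** 2
        elif ci < lower:
            total += (ci - lower) ** 2
    return total

def dtw_sq(q, c, r):
    """Squared DTW restricted to a Sakoe-Chiba band of half-width r."""
    n = len(q)
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (n + 1)
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[n]

def nearest_neighbor(query, candidates, r):
    """Index and squared DTW distance of the closest candidate,
    skipping the full DTW whenever the lower bound already rules it out."""
    best_dist, best_idx = math.inf, -1
    for idx, cand in enumerate(candidates):
        if lb_keogh_sq(query, cand, r) >= best_dist:
            continue  # pruned: the true DTW distance can only be larger
        d = dtw_sq(query, cand, r)
        if d < best_dist:
            best_dist, best_idx = d, idx
    return best_idx, best_dist

query = [0, 1, 2, 1, 0]
candidates = [[5, 5, 5, 5, 5], [0, 0, 1, 2, 1], [0, 1, 2, 1, 0]]
best_idx, best_dist = nearest_neighbor(query, candidates, r=1)
```

The pruning is safe because the lower bound never exceeds the banded DTW distance, so a candidate whose bound is no better than the best match so far cannot become the new best match.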