T.R. 93-86 | Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos
This paper presents an efficient indexing method for subsequence matching in time-series databases. The method maps each data sequence into a small set of boxes (rectangles) in feature space, which can then be indexed using spatial access methods like the R*-tree. The approach involves sliding a window over the data sequence to extract features, dividing the resulting trail into sub-trails, and representing each sub-trail with a Minimum Bounding Rectangle (MBR). This allows for efficient querying by searching for MBRs that intersect with the query region in feature space. The method is dynamic, supporting insertions, deletions, and appending new measurements. It handles queries of varying lengths and ensures no false dismissals by leveraging the properties of the feature extraction function. Experiments on synthetic and real data show that the method significantly outperforms sequential scanning, achieving up to 100 times faster search times. The method is generalizable and can be applied to various data types, including 2D images and n-dimensional vector fields.This paper presents an efficient indexing method for subsequence matching in time-series databases. The method maps each data sequence into a small set of boxes (rectangles) in feature space, which can then be indexed using spatial access methods like the R*-tree. The approach involves sliding a window over the data sequence to extract features, dividing the resulting trail into sub-trails, and representing each sub-trail with a Minimum Bounding Rectangle (MBR). This allows for efficient querying by searching for MBRs that intersect with the query region in feature space. The method is dynamic, supporting insertions, deletions, and appending new measurements. It handles queries of varying lengths and ensures no false dismissals by leveraging the properties of the feature extraction function. Experiments on synthetic and real data show that the method significantly outperforms sequential scanning, achieving up to 100 times faster search times. The method is generalizable and can be applied to various data types, including 2D images and n-dimensional vector fields.