Abstract
Analyzing numerous or long time series is difficult in practice due to the high storage costs and computational requirements. Therefore, techniques have been proposed to generate compact similarity-preserving representations of time series, enabling real-time similarity search on large in-memory data collections. However, the existing techniques are not ideally suited for assessing similarity when sequences are locally out of phase. In this paper, we propose the use of product quantization for efficient similarity-based comparison of time series under time warping. The idea is to first compress the data by partitioning the time series into equal length sub-sequences which are represented by a short code. The distance between two time series can then be efficiently approximated by pre-computed elastic distances between their codes. The partitioning into sub-sequences forces unwanted alignments, which we address with a pre-alignment step using the maximal overlap discrete wavelet transform (MODWT). To demonstrate the efficiency and accuracy of our method, we perform an extensive experimental evaluation on benchmark datasets in nearest neighbors classification and clustering applications. Overall, the proposed solution emerges as a highly efficient (both in terms of memory usage and computation time) replacement for elastic measures in time series applications.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Intel Core i7-2600 CPU @ 3.40 GHz; 15 Gb of memory; Ubuntu GNU/Linux 18.04.
- 2.
- 3.
We use tslearn v0.5.0.5. See https://tslearn.readthedocs.io.
- 4.
Only the datasets available since 2018 [4] were used to keep the runtime of the experiments manageable, while achieving a maximal overlap with existing research.
References
Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference Knowledge Discovery and Data Mining (2019)
Chan, F.P., Fu, A.C., Yu, C.: Haar wavelets for efficient similarity search of time-series: with and without time warping. IEEE Trans. Knowl. Data Eng. 15(3), 686–705 (2003)
Chan, K.P., Fu, A.W.C.: Efficient time series matching by wavelets. In: Proceedings 15th International Conference on Data Engineering. ICDE 99, p. 126. IEEE Computer Society, USA (1999)
Dau, H.A., et al.: The UCR time series classification archive (2018). https://www.cs.ucr.edu/~eamonn/time_series_data_2018/
Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., Keogh, E.: Querying and mining of time series data: experimental comparison of representations and distance measures. Proc. VLDB Endowment 1(2), 1542–1552 (2008)
Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time-series databases. In: Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data. SIGMOD 94, pp. 419–429. ACM Press, New York (1994)
Hong, J.Y., Park, S.H., Baek, J.G.: SSDTW: shape segment dynamic time warping. Expert Syst. Appl. 150, 113291 (2020)
Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 117–128 (2010)
Keogh, E., Chakrabarti, K., Pazzani, M., Mehrotra, S.: Dimensionality reduction for fast similarity search in large time series databases. Knowl. Inf. Syst. 3(3), 263–286 (2001)
Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)
Kim, S.W., Park, S., Chu, W.W.: An index-based approach for similarity search supporting time warping in large sequence databases. In: Proceedings 17th International Conference on Data Engineering, pp. 607–614. IEEE (2001)
Lemire, D.: Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recognit. 42(9), 2169–2180 (2009)
Lin, J., Keogh, E., Wei, L., Lonardi, S.: Experiencing sax: a novel symbolic representation of time series. Data Min. Knowl. Disc. 15(2), 107–144 (2007)
Meert, W., Hendrickx, K., Van Craenendonck, T., Robberechts, P.: DTAIDistance (2022). https://doi.org/10.5281/zenodo.3981067https://github.com/wannesm/dtaidistance
Mueen, A., Keogh, E.: Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD 2016, pp. 2129–2130. ACM Press, New York (2016)
Nguyen, T.L., Gsponer, S., Ifrim, G.: Time series classification by sequence learning in all-subsequence space. In: 2017 IEEE 33rd International Conference on Data Engineering (ICDE). ICDE 1, pp. 947–958(2017)
Paparrizos, J., Gravano, L.: k-Shape: efficient and accurate clustering of time series. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 1855–1870 (2015)
Paparrizos, J., Liu, C., Elmore, A.J., Franklin, M.J.: Debunking four long-standing misconceptions of time-series distance measures. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. SIGMOD 20, pp. 1887–1905. ACM Press, New York (2020). https://doi.org/10.1145/3318464.3389760
Petitjean, F., Ketterlin, A., Gançarski, P.: A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit. 44(3), 678–693 (2011)
Rakthanmanon, T., et al.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM Press, New York (2012)
Rand, W.M.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846–850 (1971). https://www.jstor.org/stable/2284239
Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Signal Process. 26(1), 43–49 (1978)
Salvador, S., Chan, P.: Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 11(5), 561–580 (2007)
Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the 2018 SIAM International Conference on Data Mining. SIAM Publications, pp. 234–242 (2018)
Silva, D.F., Batista, G.E.A.P.A.: Speeding up all-pairwise dynamic time warping matrix calculation. In: Proceedings of the 2016 SIAM International Conference on Data Mining, pp. 837–845. SIAM Publications (2016)
Spiegel, S., Jain, B.J., Albayrak, S.: Fast time series classification under lucky time warping distance. In: Proceedings of the 29th Annual ACM Symposium on Applied Computing, pp. 71–78 (2014)
Tan, C.W., Petitjean, F., Webb, G.I.: Elastic bands across the path: a new framework and method to lower bound DTW. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522–530. SIAM (2019)
Wu, R., Keogh, E.J.: FastDTW is approximate and generally slower than the algorithm it approximates. IEEE Trans. Knowl. Data Eng. (2020)
Zhang, H., Dong, Y., Li, J., Xu, D.: Dynamic time warping under product quantization, with applications to time series data similarity search. IEEE IoT-J, 1 (2021). https://doi.org/10.1109/JIOT.2021.3132017
Acknowledgements
This work was partially supported by iBOF/21/075, the KU Leuven Research Fund (C14/17/070), VLAIO ICON-AI Conscious, and the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” program.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Robberechts, P., Meert, W., Davis, J. (2022). Elastic Product Quantization for Time Series. In: Pascal, P., Ienco, D. (eds) Discovery Science. DS 2022. Lecture Notes in Computer Science(), vol 13601. Springer, Cham. https://doi.org/10.1007/978-3-031-18840-4_12
Download citation
DOI: https://doi.org/10.1007/978-3-031-18840-4_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18839-8
Online ISBN: 978-3-031-18840-4
eBook Packages: Computer ScienceComputer Science (R0)