Abstract
Nearest neighbor search under elastic distances is a key tool for time series analysis, supporting many applications. However, straightforward implementations of distances require \(O(n^2)\) space and time complexities, preventing these applications from scaling to long series. Much work has been devoted to speeding up the NN search process, mostly with the development of lower bounds, allowing to avoid costly distance computations when a given threshold is exceeded. This threshold, provided by the similarity search process, also allows to early abandon the computation of a distance itself. Another approach, is to prune parts of the computation. All these techniques are orthogonal to each other. In this work, we develop a new generic strategy, “EAPruned”, that tightly integrates pruning with early abandoning. We apply it to six elastic distance measures: DTW, CDTW, WDTW, ERP, MSM and TWE, showing substantial speedup in NN search applications. Pruning alone also shows substantial speedup for some distances, benefiting applications beyond the scope of NN search (e.g. requiring all pairwise distances), and hence where early abandoning is not applicable. We release our implementation as part of a new C++ library for time series classification, along with easy to use Python/Numpy bindings.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
In ERP, the gap cost is given by \({{\,\mathrm{cost}\,}}(s_i, g)\). If S is translated, the gap cost also changes while the canonical alignment cost remains unchanged, making ERP translation-sensitive.
Similar to discard points, instructing the next rows to skip their column.
References
Bagnall A, Flynn M, Large J, Lines J, Middlehurst M (2020) A tale of two toolkits, report the third: On the usage and performance of HIVE-COTE v1.0. arXiv:2004.06069
Benkabou SE, Benabdeslem K, Canitia B (2018) Unsupervised outlier detection for time series by entropy and dynamic time warping. Knowl Inf Syst 54(2):463–486
Chen L, Ng R (2004) On the marriage of lp-norms and edit distance. In: Proceedings 2004 VLDB Conference, pp 792 – 803
Dau HA, Bagnall A, Kamgar K, Yeh CCM, Zhu Y, Gharghabi S, Ratanamahatana CA, Keogh E (2019) The UCR Time Series Archive. arXiv:1810.07758
Herrmann M (2021a) Experimentation source code and ressources. https://github.com/HerrmannM/paper-2021-EAPElasticDist
Herrmann M (2021b) Tempo, the Monash time series classification library. https://github.com/MonashTS/tempo
Itakura F (1975) Minimum prediction residual principle applied to speech recognition. IEEE Trans Acoust Speech Signal Process 23(1):67–72
Jeong YS, Jeong MK, Omitaomu OA (2011) Weighted dynamic time warping for time series classification. Pattern Recogn 44(9):2231–2240. https://doi.org/10.1016/j.patcog.2010.09.022
Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7(3):358–386. https://doi.org/10.1007/s10115-004-0154-9
Keogh EJ, Pazzani MJ (2001) Derivative Dynamic Time Warping. In: Proceedings of the 2001 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, pp 1–11, https://doi.org/10.1137/1.9781611972719.1
Lemire D (2009) Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recogn 42(9):2169–2180. https://doi.org/10.1016/j.patcog.2008.11.030
Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Disc 29(3):565–592. https://doi.org/10.1007/s10618-014-0361-2
Lines J, Taylor S, Bagnall A (2018) Time Series Classification with HIVE-COTE: The Hierarchical Vote Collective of Transformation-Based Ensembles. ACM Trans Knowl Disc Data 12(5):1–35. https://doi.org/10.1145/3182382
Lucas B, Shifaz A, Pelletier C, O’Neill L, Zaidi N, Goethals B, Petitjean F, Webb GI (2019) Proximity Forest: An effective and scalable distance-based classifier for time series. Data Min Knowl Disc 33(3):607–635. https://doi.org/10.1007/s10618-019-00617-3
Marteau PF (2009) Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching. IEEE Trans Pattern Anal Mach Intell 31(2):306–318. https://doi.org/10.1109/TPAMI.2008.76
Mueen A, Keogh E (2016) Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, ACM Press, pp 2129–2130, https://doi.org/10.1145/2939672.2945383
Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recogn 44(3):678–693. https://doi.org/10.1016/j.patcog.2010.09.013
Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’12, ACM Press, Beijing, China, p 262, https://doi.org/10.1145/2339530.2339576
Sakoe H, Chiba S (1971) A dynamic programming approach to continuous speech recognition. Proceedings of the Seventh International Congress on Acoustics, Budapest, Akadémiai Kiadó, Budapest 3:65–69
Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49. https://doi.org/10.1109/TASSP.1978.1163055
Salvador S, Chan P (2007) Toward accurate dynamic time warping in linear time and space. Intell Data Analysis 11(5):561–580. https://doi.org/10.3233/IDA-2007-11508
Sang-Wook Kim, Sanghyun Park, Chu W (2001) An index-based approach for similarity search supporting time warping in large sequence databases. In: Proceedings 17th International Conference on Data Engineering, IEEE Comput. Soc, pp 607–614, https://doi.org/10.1109/ICDE.2001.914875, http://ieeexplore.ieee.org/document/914875/
Shifaz A, Pelletier C, Petitjean F, Webb GI (2020) TS-CHIEF: A Scalable and Accurate Forest Algorithm for Time Series Classification. arXiv:190610329 [cs, stat]
Silva DF, Batista GEAPA (2016) Speeding Up All-Pairwise Dynamic Time Warping Matrix Calculation. In: Proceedings of the 2016 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, pp 837–845, https://doi.org/10.1137/1.9781611974348.94
Silva DF, Giusti R, Keogh E, Batista GEAPA (2018) Speeding up similarity search under dynamic time warping by pruning unpromising alignments. Data Min Knowl Disc 32(4):988–1016. https://doi.org/10.1007/s10618-018-0557-y
Stefan A, Athitsos V, Das G (2013) The Move-Split-Merge Metric for Time Series. IEEE Trans Knowl Data Eng 25(6):1425–1438. https://doi.org/10.1109/TKDE.2012.88
Tan CW, Petitjean F, Webb GI (2020) FastEE: Fast ensembles of elastic distances for time series classification. Data Min Knowl Disc 34(1):231–272. https://doi.org/10.1007/s10618-019-00663-x
Tan CW, Bergmeir C, Petitjean F, Webb GI (in press) Time series extrinsic regression. Data Mining and Knowledge Discovery https://doi.org/10.1007/s10618-021-00745-9
Webb GI, Petitjean F (2021) Tight lower bounds for dynamic time warping. Pattern Recogn. https://doi.org/10.1016/j.patcog.2021.107895
Wu R, Keogh EJ (2020) FastDTW is approximate and generally slower than the algorithm it approximates. IEEE Transactions on Knowledge and Data Engineering https://doi.org/10.1109/TKDE.2020.3033752
Zhu Q, Batista G, Rakthanmanon T, Keogh E (2012) A Novel Approximation to Dynamic Time Warping allows Anytime Clustering of Massive Time Series Datasets. In: Proceedings of the 2012 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, pp 999–1010, https://doi.org/10.1137/1.9781611972825.86
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible editor: Annalisa Appice, Sergio Escalera, Jose A. Gamez, Heike Trautman.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This research has been supported by Australian Research Council Grant DP210100072.
Rights and permissions
About this article
Cite this article
Herrmann, M., Webb, G.I. Early abandoning and pruning for elastic distances including dynamic time warping. Data Min Knowl Disc 35, 2577–2601 (2021). https://doi.org/10.1007/s10618-021-00782-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-021-00782-4