Abstract
In time series classification, one of the most popular models is Bag-Of-Patterns (BOP). Most BOP methods run in super-linear time. A recent work proposed a linear time BOP model, yet it has limited accuracy. In this work, we present Hybrid Bag-Of-Patterns (HBOP), which can greatly enhance accuracy while maintaining linear complexity. Concretely, we first propose a novel time series discretization method called SLA, which can retain more information than the classic SAX. We use a hybrid of SLA and SAX to expressively and compactly represent subsequences, which is our most important design feature. Moreover, we develop an efficient time series transformation method that is key to achieving linear complexity. We also propose a novel X-means clustering subroutine to handle subclasses. Extensive experiments on over 100 datasets demonstrate the effectiveness and efficiency of our method.
This work is funded by NSFC Grant 61672161 and Dongguan Innovative Research Team Program 2018607201008. We sincerely thank Dr Hoang Anh Dau from University of California, Riverside for responding to our inquiries on the UCR Archive [5].
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Here we have ignored the Fungi dataset on which we have obtained abnormally short classification time for BOPF. See [1] for more on this.
References
HBOP code. https://github.com/sliang11/Hybrid-Bag-Of-Patterns
Bagnall, A., Lines, J., Hills, J., Bostrom, A.: Time-series classification with COTE: the collective of transformation-based ensembles. IEEE Trans. Knowl. Data Eng. 27(9), 2522–2535 (2015)
Calbimonte, J., Yan, Z., Jeung, H., Corcho, Ó., Aberer, K.: Deriving semantic sensor metadata from raw measurements. In: SSN 2012, pp. 33–48 (2012)
Dau, H.A., et al.: The UCR Time Series Archive. CoRR abs/1810.07758 (2018)
Dau, H.A., et al.: Hexagon-ML: The UCR Time Series Classification Archive, October 2018. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/
Demsar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Grabocka, J., Schilling, N., Wistuba, M., Schmidt-Thieme, L.: Learning time-series shapelets. In: KDD 2014, pp. 392–401 (2014)
Grabocka, J., Wistuba, M., Schmidt-Thieme, L.: Scalable classification of repetitive time series through frequencies of local polynomials. IEEE Trans. Knowl. Data Eng. 27(6), 1683–1695 (2015)
Hills, J., Lines, J., Baranauskas, E., Mapp, J., Bagnall, A.: Classification of time series by shapelet transformation. Data Min. Knowl. Disc. 28(4), 851–881 (2014). https://doi.org/10.1007/s10618-013-0322-1
Jessica, L., Rohan, K., Yuan, L.: Rotation-invariant similarity in time series using bag-of-patterns representation. J. Intell. Inf. Syst. 39(2), 287–315 (2012)
Keogh, E., Chakrabarti, K., Pazzani, M., Mehrotra, S.: Dimensionality reduction for fast similarity search in large time series databases. Knowl. Inf. Syst. 3(3), 263–286 (2001)
Li, X., Lin, J.: Linear time complexity time series classification with bag-of-pattern-features. In: ICDM 2017, pp. 277–286 (2017)
Lin, J., Keogh, E., Li, W., Lonardi, S.: Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Disc. 15(2), 107 (2007). https://doi.org/10.1007/s10618-007-0064-z
Malinowski, S., Guyet, T., Quiniou, R., Tavenard, R.: 1d-SAX: a novel symbolic representation for time series. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds.) IDA 2013. LNCS, vol. 8207, pp. 273–284. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41398-8_24
Mueen, A., Keogh, E., Young, N.: Logical-shapelets: an expressive primitive for time series classification. In: KDD 2011, pp. 1154–1162 (2011)
Pelleg, D., Moore, A.: X-means: extending K-means with efficient estimation of the number of clusters. In: ICML 2000, pp. 727–734 (2000)
Rakthanmanon, T., Keogh, E.: Fast shapelets: a scalable algorithm for discovering time series shapelets. In: SDM 2013, pp. 668–676 (2013)
Schäfer, P.: The BOSS is concerned with time series classification in the presence of noise. Data Min. Knowl. Disc. 29(6), 1505–1530 (2015). https://doi.org/10.1007/s10618-014-0377-7
Schäfer, P.: Scalable time series classification. Data Min. Knowl. Disc. 30(5), 1273–1298 (2016). https://doi.org/10.1007/s10618-015-0441-y
Schäfer, P., Högqvist, M.: SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets. In: EDBT 2012, pp. 516–527 (2012)
Schäfer, P., Leser, U.: Fast and accurate time series classification with WEASEL. In: CIKM 2017, pp. 637–646 (2017)
Senin, P., Malinchik, S.: SAX-VSM: interpretable time series classification using SAX and vector space model. In: ICDM 2013, pp. 1175–1180 (2013)
Wang, X., Mueen, A., Ding, H., Trajcevski, G., Scheuermann, P., Keogh, E.: Experimental comparison of representation methods and distance measures for time series data. Data Min. Knowl. Disc. 26(2), 275–309 (2013). https://doi.org/10.1007/s10618-012-0250-5
Ye, L., Keogh, E.: Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min. Knowl. Disc. 22(1–2), 149–182 (2011). https://doi.org/10.1007/s10618-010-0179-5
Yuan, J., Lin, Q., Zhang, W., Wang, Z.: Locally slope-based dynamic time warping for time series classification. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, pp. 1713–1722 (2019)
Zhao, J., Itti, L.: shapeDTW: shape dynamic time warping. Pattern Recogn. 74, 171–184 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Liang, S., Zhang, Y., Ma, J. (2020). Enhancing Linear Time Complexity Time Series Classification with Hybrid Bag-Of-Patterns. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science(), vol 12112. Springer, Cham. https://doi.org/10.1007/978-3-030-59410-7_50
Download citation
DOI: https://doi.org/10.1007/978-3-030-59410-7_50
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59409-1
Online ISBN: 978-3-030-59410-7
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)