research-article
Open access

Time Series Classification with HIVE-COTE: The Hierarchical Vote Collective of Transformation-Based Ensembles

Published: 05 July 2018

Abstract

A recent experimental evaluation assessed 19 time series classification (TSC) algorithms and found that one was significantly more accurate than all others: the Flat Collective of Transformation-based Ensembles (Flat-COTE). Flat-COTE is an ensemble that combines 35 classifiers over four data representations. However, while comprehensive, the evaluation did not consider deep learning approaches. Convolutional neural networks (CNNs) have seen a surge in popularity and are now state of the art in many fields, which raises the question of whether CNNs could be equally transformative for TSC.
We implement a benchmark CNN for TSC using a common structure and use results from a TSC-specific CNN from the literature. We compare both to Flat-COTE and find that the collective is significantly more accurate than both CNNs. These results are impressive, but Flat-COTE is not without deficiencies. We significantly improve the collective by proposing a new hierarchical structure with probabilistic voting, defining and including two novel ensemble classifiers built in existing feature spaces, and adding further modules to represent two additional transformation domains. The resulting classifier, the Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE), encapsulates classifiers built on five data representations. We demonstrate that HIVE-COTE is significantly more accurate than Flat-COTE (and all other TSC algorithms that we are aware of) over 100 resamples of 85 TSC problems and is the new state of the art for TSC. Further analysis is included through the introduction and evaluation of 3 new case studies and extensive experimentation on 1,000 simulated datasets of 5 different types.
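The hierarchical structure with probabilistic voting can be illustrated with a small sketch. The abstract does not spell out the combination rule, so the following assumes one plausible scheme: each module (one per data representation) returns a class-probability vector, each vector is weighted by an estimate of that module's accuracy on the training data, and the weighted sums are normalised. The function names, the module names, and the weights are hypothetical placeholders, not taken from the article.

```python
def combine_modules(module_probs, module_weights):
    """Combine per-module class probabilities into one distribution.

    module_probs:   dict mapping a module name to its list of class probabilities.
    module_weights: dict mapping a module name to a non-negative weight
                    (assumed here to be a training-accuracy estimate).
    """
    n_classes = len(next(iter(module_probs.values())))
    combined = [0.0] * n_classes
    for name, probs in module_probs.items():
        weight = module_weights[name]
        for c, p in enumerate(probs):
            combined[c] += weight * p
    total = sum(combined)
    if total == 0:
        raise ValueError("all weighted votes are zero; cannot normalise")
    return [v / total for v in combined]


def predict(module_probs, module_weights):
    """Return the index of the class with the highest combined probability."""
    combined = combine_modules(module_probs, module_weights)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical two-class example with two placeholder modules.
probs = {"module_a": [0.6, 0.4], "module_b": [0.2, 0.8]}
weights = {"module_a": 0.9, "module_b": 0.5}
print(combine_modules(probs, weights))
print(predict(probs, weights))
```

In this scheme each module casts a single weighted vote regardless of how many base classifiers it contains internally. That is one way to read "hierarchical", in contrast to a flat ensemble in which every base classifier votes directly.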

      Published In

      ACM Transactions on Knowledge Discovery from Data  Volume 12, Issue 5
      October 2018
      354 pages
      ISSN:1556-4681
      EISSN:1556-472X
      DOI:10.1145/3234931
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 05 July 2018
      Accepted: 01 January 2018
      Revised: 01 December 2017
      Received: 01 April 2017
      Published in TKDD Volume 12, Issue 5

      Author Tags

      1. Time series classification
      2. deep learning
      3. heterogeneous ensembles
      4. meta ensembles

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • UK Engineering and Physical Sciences Research Council (EPSRC)
      • Research and Specialist Computing Support service at the University of East Anglia

      Article Metrics

      • Downloads (Last 12 months): 575
      • Downloads (Last 6 weeks): 66
      Reflects downloads up to 28 Nov 2024

      Cited By

      • (2025) Time-series classification in smart manufacturing systems: An experimental evaluation of state-of-the-art machine learning algorithms. Robotics and Computer-Integrated Manufacturing 91 (102839). DOI: 10.1016/j.rcim.2024.102839. Online publication date: Feb-2025
      • (2024) Time Series Classification: A Review of Algorithms and Implementations. Time Series Analysis - Recent Advances, New Perspectives and Applications. DOI: 10.5772/intechopen.1004810. Online publication date: 25-Mar-2024
      • (2024) Time-Series Mining Approaches for Malaria Vector Prediction On Mid-Infrared Spectroscopy Data. Data Science Journal 23. DOI: 10.5334/dsj-2024-025. Online publication date: 1-May-2024
      • (2024) Research on Fault Detection by Flow Sequence for Industrial Internet of Things in Sewage Treatment Plant Case. Sensors 24:7 (2210). DOI: 10.3390/s24072210. Online publication date: 29-Mar-2024
      • (2024) Monitoring Flow-Forming Processes Using Design of Experiments and a Machine Learning Approach Based on Randomized-Supervised Time Series Forest and Recursive Feature Elimination. Sensors 24:5 (1527). DOI: 10.3390/s24051527. Online publication date: 27-Feb-2024
      • (2024) Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey. ACM Computing Surveys 56:9 (1-45). DOI: 10.1145/3649448. Online publication date: 25-Apr-2024
      • (2024) RITA: Group Attention is All You Need for Timeseries Analytics. Proceedings of the ACM on Management of Data 2:1 (1-28). DOI: 10.1145/3639317. Online publication date: 26-Mar-2024
      • (2024) A Molecular-Based Authentication and Authorization for Internet of Things Systems. 2024 IEEE Wireless Communications and Networking Conference (WCNC) (1-6). DOI: 10.1109/WCNC57260.2024.10571162. Online publication date: 21-Apr-2024
      • (2024) Densely Knowledge-Aware Network for Multivariate Time Series Classification. IEEE Transactions on Systems, Man, and Cybernetics: Systems 54:4 (2192-2204). DOI: 10.1109/TSMC.2023.3342640. Online publication date: Apr-2024
      • (2024) A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:12 (10466-10485). DOI: 10.1109/TPAMI.2024.3443141. Online publication date: Dec-2024
