Learning States and Rules for Detecting Anomalies in Time Series

Published: 01 December 2005

Abstract

The normal operation of a device can be characterized by different temporal states. To identify these states, we introduce a segmentation algorithm called Gecko that can determine a reasonable number of segments using our proposed L method. We then use the RIPPER classification algorithm to describe these states as logical rules. Finally, transitional logic between the states is added to create a finite state automaton. Our empirical results, on data obtained from the NASA shuttle program, indicate that the Gecko segmentation algorithm is comparable to a human expert in identifying states, and that our L method performs better than the existing permutation-test method at determining the number of segments a segmentation algorithm should return. Empirical results also show that our overall system can track normal behavior and detect anomalies.
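
As a concrete illustration of the L method mentioned above: it selects the number of segments by locating the knee of an evaluation graph (for example, segmentation error plotted against the number of segments), fitting one straight line to each side of a candidate split and keeping the split with the smallest size-weighted fit error. The sketch below is a minimal Python version written for this page, not code from the paper; the function names, the NumPy dependency, the RMSE fit criterion, and the example data are illustrative assumptions.

    import numpy as np

    def line_rmse(x, y):
        # RMSE of the best least-squares line through the points (x, y).
        slope, intercept = np.polyfit(x, y, 1)
        residuals = y - (slope * x + intercept)
        return float(np.sqrt(np.mean(residuals ** 2)))

    def l_method_knee(num_segments, eval_metric):
        # Try every split point, fit one line to the left portion of the
        # curve and one to the right, and keep the split whose size-weighted
        # RMSE is smallest; the x value at that split is the suggested
        # number of segments.
        x = np.asarray(num_segments, dtype=float)
        y = np.asarray(eval_metric, dtype=float)
        n = len(x)
        if n < 4:
            raise ValueError("need at least 4 points to fit two lines")
        best_c, best_err = None, float("inf")
        for c in range(2, n - 1):  # each side keeps at least 2 points
            err = ((c / n) * line_rmse(x[:c], y[:c])
                   + ((n - c) / n) * line_rmse(x[c:], y[c:]))
            if err < best_err:
                best_c, best_err = c, err
        return int(x[best_c])

    # Toy evaluation curve that drops steeply and then flattens near x = 4.
    costs = [50, 30, 18, 6, 5.5, 5.2, 5.0, 4.9, 4.8, 4.7]
    print(l_method_knee(range(1, 11), costs))  # prints 4

In the full system described in the abstract, a curve of this kind would come from the Gecko segmentation step, and the chosen segment count would then feed the RIPPER rule-learning and state-automaton stages.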

Published In

Applied Intelligence, Volume 23, Issue 3
December 2005
165 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. anomaly detection
  2. cluster validation
  3. clustering
  4. segmentation
  5. time series

Qualifiers

  • Article
