Discovering multivariate motifs using subsequence density estimation and greedy mixture learning

Published: 22 July 2007

Abstract

The problem of locating motifs in real-valued, multivariate time series data involves the discovery of sets of recurring patterns embedded in the time series. Each set is composed of several non-overlapping subsequences and constitutes a motif because all of the included subsequences are similar. The ability to automatically discover such motifs allows intelligent systems to form endogenously meaningful representations of their environment through unsupervised sensor analysis. In this paper, we formulate a unifying view of motif discovery as a problem of locating regions of high density in the space of all time series subsequences. Our approach is efficient (sub-quadratic in the length of the data), requires fewer user-specified parameters than previous methods, and naturally allows variable length motif occurrences and non-linear temporal warping. We evaluate the performance of our approach using four data sets from different domains including on-body inertial sensors and speech.
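The idea of treating motif discovery as a search for dense regions in subsequence space can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes fixed-length windows, plain Euclidean distance, and a brute-force k-nearest-neighbor density score that is quadratic in the length of the data, whereas the abstract describes a sub-quadratic method that also supports variable lengths and temporal warping. The function names `sliding_subsequences` and `knn_density_scores` and the parameters `window` and `k` are hypothetical and chosen only for illustration.

```python
import math


def sliding_subsequences(series, window):
    """Flatten every length-`window` slice of a multivariate series into one vector.

    `series` is a list of time steps, each a list of D sensor readings.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        [value for step in series[start:start + window] for value in step]
        for start in range(len(series) - window + 1)
    ]


def knn_density_scores(subsequences, window, k=1):
    """Score each subsequence by 1 / (distance to its k-th nearest neighbor).

    Neighbors that overlap in time are skipped so that a window cannot
    match a shifted copy of itself. Brute force: O(n^2) distance computations
    (an illustrative simplification, not the paper's sub-quadratic approach).
    """
    scores = []
    for i, a in enumerate(subsequences):
        dists = sorted(
            math.dist(a, b)
            for j, b in enumerate(subsequences)
            if abs(i - j) >= window
        )
        if len(dists) < k:
            scores.append(0.0)  # too few non-overlapping neighbors to estimate
        else:
            scores.append(1.0 / (dists[k - 1] + 1e-12))
    return scores
```

Windows with the highest scores lie in the densest regions of subsequence space and are therefore candidate motif occurrences. A fuller treatment would then group nearby high-density windows into motifs, a step this sketch leaves out.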

Published In

AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 1
July 2007
942 pages
ISBN: 9781577353232

Sponsors

Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press
