DOI: 10.1145/1247480.1247511

Effective variation management for pseudo periodical streams

Published: 11 June 2007

Abstract

Many database applications require the analysis and processing of data streams. In such systems, huge amounts of data arrive rapidly and their values change over time. Variations in a stream typically reflect fundamental changes in the underlying objects and carry significant domain meaning. In some data streams, successive events appear to recur at a regular time interval, but the data in fact evolves with small differences as time elapses. This property is called pseudo periodicity, and it poses a non-trivial challenge to variation management in data streams. This paper presents our research on online variation management over such streams; the idea can be applied to medical applications such as patient vital sign monitoring. We propose a new method named Pattern Growth Graph (PGG) to detect and manage variations over pseudo periodical streams. PGG adopts the wave-pattern to capture the major information of data evolution and represent it compactly. With the help of a wave-pattern matching algorithm, PGG detects stream variations in a single pass over the stream data. PGG stores only the segments of the pattern that differ in the incoming stream, and hence it can substantially compress the data without losing important information. The statistical information of PGG helps to distinguish meaningful data changes from noise and to reconstruct the stream with acceptable accuracy. Extensive experiments on real datasets containing millions of data items demonstrate the feasibility and effectiveness of the proposed scheme.
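
The abstract does not specify PGG's data structures, so the following is only a minimal illustrative sketch of the general idea of storing a period once and recording repeats by reference. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the fixed period length, the mean-absolute-difference matching rule, the `tolerance` parameter, and the `PatternStore` class name.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length periods."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


@dataclass
class PatternStore:
    """Hypothetical store: keeps one copy of each distinct period.

    Assumption: periods have already been cut out of the stream and
    have a fixed length. Real pseudo periodical streams vary in length.
    """
    tolerance: float
    patterns: List[List[float]] = field(default_factory=list)
    counts: List[int] = field(default_factory=list)    # times each pattern was seen
    sequence: List[int] = field(default_factory=list)  # pattern index per period

    def add_period(self, period: Sequence[float]) -> bool:
        """Record one period in a single pass; return True if it is a new variation."""
        for i, known in enumerate(self.patterns):
            if len(known) == len(period) and mean_abs_diff(known, period) <= self.tolerance:
                self.counts[i] += 1
                self.sequence.append(i)
                return False
        self.patterns.append(list(period))
        self.counts.append(1)
        self.sequence.append(len(self.patterns) - 1)
        return True

    def reconstruct(self) -> List[float]:
        """Approximate the original stream from the stored patterns."""
        out: List[float] = []
        for i in self.sequence:
            out.extend(self.patterns[i])
        return out


# Three periods of length 4: the second is a near-repeat, the third differs.
stream = [0, 1, 0, -1, 0, 1.05, 0, -1, 0, 2, 0, -1]
store = PatternStore(tolerance=0.1)
flags = [store.add_period(stream[i:i + 4]) for i in range(0, len(stream), 4)]
print(flags, len(store.patterns), store.counts)
```

In this sketch, only two of the three periods are stored in full, and the per-pattern counts are one simple example of statistics that could help separate recurring behavior from one-off noise. The reconstruction is lossy within the chosen tolerance.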




    Published In

    SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
    June 2007
    1210 pages
    ISBN:9781595936868
    DOI:10.1145/1247480
    General Chairs: Lizhu Zhou, Tok Wang Ling
    Program Chair: Beng Chin Ooi

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. data stream
    2. pattern growth
    3. pseudo periodicity
    4. variation management

    Qualifiers

    • Article

    Conference

    SIGMOD/PODS07

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%


    Cited By

    • (2023) IoT Motion Tracking System for Workout Performance Evaluation: A Case Study on Dumbbell. IEEE Transactions on Consumer Electronics 69:4 (798-808). DOI: 10.1109/TCE.2023.3320183. Online publication date: Nov-2023
    • (2023) TADS: Temporal Autoencoder Dynamic Series Framework for Unsupervised Anomaly Detection. 2023 International Joint Conference on Neural Networks (IJCNN) (01-08). DOI: 10.1109/IJCNN54540.2023.10191671. Online publication date: 18-Jun-2023
    • (2023) Explainable Anomaly Detection System for Categorical Sensor Data in Internet of Things. Machine Learning and Knowledge Discovery in Databases (594-598). DOI: 10.1007/978-3-031-26422-1_37. Online publication date: 18-Mar-2023
    • (2022) Analysis of ID Sequences Similarity Using DTW in Intrusion Detection for CAN Bus. IEEE Transactions on Vehicular Technology 71:10 (10426-10441). DOI: 10.1109/TVT.2022.3185111. Online publication date: Oct-2022
    • (2022) Anomaly Detection in Quasi-Periodic Time Series Based on Automatic Data Segmentation and Attentional LSTM-CNN. IEEE Transactions on Knowledge and Data Engineering 34:6 (2626-2640). DOI: 10.1109/TKDE.2020.3014806. Online publication date: 1-Jun-2022
    • (2022) Data Mining for Cyber-Physical Systems. Data Mining and Machine Learning Applications (235-280). DOI: 10.1002/9781119792529.ch10. Online publication date: 28-Jan-2022
    • (2019) Anomaly Detection and Identification in Satellite Telemetry Data Based on Pseudo-Period. Applied Sciences 10:1 (103). DOI: 10.3390/app10010103. Online publication date: 20-Dec-2019
    • (2017) Assessing Death Risk of Patients with Cardiovascular Disease from Long-Term Electrocardiogram Streams Summarization. Advances in Knowledge Discovery and Data Mining (671-682). DOI: 10.1007/978-3-319-57454-7_52. Online publication date: 23-Apr-2017
    • (2016) Supervised Anomaly Detection in Uncertain Pseudoperiodic Data Streams. ACM Transactions on Internet Technology 16:1 (1-20). DOI: 10.1145/2806890. Online publication date: 22-Jan-2016
    • (2014) A Segment-Wise Method for Pseudo Periodic Time Series Prediction. Advanced Data Mining and Applications (461-474). DOI: 10.1007/978-3-319-14717-8_36. Online publication date: 2014
