DOI: 10.1145/2063576.2063957
poster

Mining frequent patterns across multiple data streams

Published: 24 October 2011

Abstract

Mining frequent patterns from data streams has drawn increasing attention in recent years. However, previous mining algorithms all focus on a single data stream. In many emerging applications, it is critically important to combine multiple data streams for analysis. For example, real-time news topic analysis must combine news report streams from different media sources to discover collaborative frequent patterns, which are reported frequently in all media, and comparative frequent patterns, which are reported more frequently in one medium than in the others. To address this problem, we propose a novel frequent pattern mining algorithm, Hybrid-Streaming (H-Stream for short). H-Stream builds a new Hybrid-Frequent tree to maintain historical frequent and potentially frequent itemsets from all data streams, and incrementally updates these itemsets for efficient collaborative and comparative pattern mining. Theoretical and empirical studies demonstrate the utility of the proposed method.
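The two pattern types named in the abstract can be illustrated with a small batch computation. The sketch below is not H-Stream and does not build a Hybrid-Frequent tree; it simply counts itemset supports per stream over a fixed window and filters them. The function names, the `min_support` and `min_gap` thresholds, the `max_size` limit, and the sample data are all assumptions made for illustration, and the abstract does not state how "more frequently" is quantified.

```python
from collections import Counter
from itertools import combinations


def itemset_supports(transactions, max_size=2):
    """Relative support of every itemset up to max_size in one stream window."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    total = len(transactions)
    if total == 0:
        return {}
    return {itemset: n / total for itemset, n in counts.items()}


def collaborative_patterns(supports_by_stream, min_support):
    """Itemsets whose support reaches min_support in every stream (assumed definition)."""
    streams = list(supports_by_stream.values())
    if not streams:
        return set()
    return {
        p for p in streams[0]
        if all(s.get(p, 0.0) >= min_support for s in streams)
    }


def comparative_patterns(supports_by_stream, target, min_support, min_gap):
    """Itemsets frequent in `target` whose support beats every other stream by min_gap.

    min_gap is a hypothetical way to quantify "more frequently".
    """
    target_supports = supports_by_stream[target]
    others = [s for name, s in supports_by_stream.items() if name != target]
    return {
        p for p, sup in target_supports.items()
        if sup >= min_support
        and all(sup - o.get(p, 0.0) >= min_gap for o in others)
    }


# Hypothetical windows from two news streams; each transaction is one report's topics.
streams = {
    "media_a": [["election", "economy"], ["election", "sports"],
                ["election", "economy"], ["weather"]],
    "media_b": [["election", "economy"], ["sports"],
                ["election"], ["sports", "weather"]],
}
supports = {name: itemset_supports(txns) for name, txns in streams.items()}
collab = collaborative_patterns(supports, min_support=0.5)
compar = comparative_patterns(supports, "media_a", min_support=0.5, min_gap=0.25)
print(sorted(collab))
print(sorted(compar))
```

Recomputing every support from scratch, as done here, is exactly the cost an incremental structure is meant to avoid; the sketch only pins down what the two output sets contain.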

    Published In

    CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
    October 2011
    2712 pages
    ISBN:9781450307178
    DOI:10.1145/2063576
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data stream mining
    2. frequent pattern mining
    3. multiple data streams

    Qualifiers

    • Poster

    Conference

    CIKM '11

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2024) Graph Stream Compression Scheme Based on Pattern Dictionary Using Provenance. Applied Sciences, 14:11 (4553). DOI: 10.3390/app14114553. Online publication date: 25-May-2024
    • (2022) Mining Frequent Patterns from Temporal Dataset Using Backtracking Search Tree of GenMax Algorithm. Human-Centric Smart Computing (105-117). DOI: 10.1007/978-981-19-5403-0_9. Online publication date: 29-Nov-2022
    • (2020) Top-k closed co-occurrence patterns mining with differential privacy over multiple streams. Future Generation Computer Systems, 111 (339-351). DOI: 10.1016/j.future.2020.04.049. Online publication date: Oct-2020
    • (2020) Anytime Frequent Itemset Mining of Transactional Data Streams. Big Data Research (100146). DOI: 10.1016/j.bdr.2020.100146. Online publication date: Jul-2020
    • (2018) Mining Top-k Co-Occurrence Patterns across Multiple Streams (Extended Abstract). 2018 IEEE 34th International Conference on Data Engineering (ICDE) (1747-1748). DOI: 10.1109/ICDE.2018.00231. Online publication date: Apr-2018
    • (2018) Adaptive stratified reservoir sampling over heterogeneous data streams. Information Systems, 39 (199-216). DOI: 10.1016/j.is.2012.03.005. Online publication date: 30-Dec-2018
    • (2018) Improved algorithm for parallel mining collaborative frequent itemsets in multiple data streams. Cluster Computing. DOI: 10.1007/s10586-018-1859-y. Online publication date: 30-Jan-2018
    • (2017) Mining Top-k Co-Occurrence Patterns across Multiple Streams. IEEE Transactions on Knowledge and Data Engineering, 29:10 (2249-2262). DOI: 10.1109/TKDE.2017.2728537. Online publication date: 1-Oct-2017
    • (2017) A frequent itemset reduction algorithm for global pattern mining on distributed data streams. 2017 Tenth International Conference on Contemporary Computing (IC3) (1-6). DOI: 10.1109/IC3.2017.8284320. Online publication date: Aug-2017
    • (2017) A comparative analysis of frequent pattern mining algorithms used for streaming data. 2017 International Conference on Computing, Communication and Automation (ICCCA) (250-255). DOI: 10.1109/CCAA.2017.8229809. Online publication date: May-2017
