Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/502585.502630acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
Article

Sliding-window filtering: an efficient algorithm for incremental mining

Published: 05 October 2001 Publication History

Abstract

We explore in this paper an effective sliding-window filtering (abbreviatedly as SWF) algorithm for incremental mining of association rules. In essence, by partitioning a transaction database into several partitions, algorithm SWF employs a filtering threshold in each partition to deal with the candidate itemset generation. Under SWF, the cumulative information of mining previous partitions is selectively carried over toward the generation of candidate itemsets for the subsequent partitions. Algorithm SWF not only significantly reduces I/O and CPU cost by the concepts of cumulative filtering and scan reduction techniques but also effectively controls memory utilization by the technique of sliding-window partition. Algorithm SWF is particularly powerful for efficient incremental mining for an ongoing time-variant transaction database. By utilizing proper scan reduction techniques, only one scan of the incremented dataset is needed by algorithm SWF. The I/O cost of SWF is, in orders of magnitude, smaller than those required by prior methods, thus resolving the performance bottleneck. Experimental studies are performed to evaluate performance of algorithm SWF. It is noted that the improvement achieved by algorithm SWF is even more prominent as the incremented portion of the dataset increases and also as the size of the database increases.

References

[1]
R. Agarwal, C. Aggarwal, and V.V.V. Prasad. A Tree Projection Algorithm for Generation of Frequent Itemsets. Jornal of Parallel and Distributed Computing (Special Issue on High Performance Data Mining), 2000.]]
[2]
R. Agrawal, T. Imielinski, and A. Swami. Mining Association Rules between Sets of Items in Large Databases. Proc. of ACM SIGMOD, pages 207-216, May 1993.]]
[3]
R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. Proc. oj the 20th International Conference on Very Large Data Bases, pages 478-499, September 1994.]]
[4]
N.F. Ayan, A.U. Tansel, and E. Arkun. An Efficient Algorithm to Update Large Itemsets with Early Pruning. Proc. of 1999 Int. Conf. on Knowledge Discovery and Data Mining, 1999.]]
[5]
S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic Itemset Counting and Implication Rules for Market Basket Data. ACM SIGMOD Record, 26(2):255-264, May 1997.]]
[6]
M.-S. Chen, J. Han, and P. S. Yu. Data Mining: An Overview from Database Perspective. IEEE Transactions on Knowledge and Data Engineering, 8(6):866-883, December 1996.]]
[7]
M.-S. Chen, J.-S. Park, and P. S. Yu. Efficient Data Mining for Path Traversal Patterns. IEEE Transactions on Knowledge and Data Engineering, 10(2):209-221, April 1998.]]
[8]
D. Cheung, J. Han, V. Ng, and C.Y. Wong. Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique. Proc. of 1996 Int'l Conf. on Data Engineering, pages 106-114, February 1996.]]
[9]
D. Cheung, SD. Lee, and B. Kao. A General Incremental Technique for Updating Discovered Association Rules. Proc. International Conference On Database Systems For Advanced Applications, April 1997.]]
[10]
J. Han, L. V. S. Lakshmanan, and R. T. Ng. Constraint-Based, Multidimensional Data Mining. COMPUTER (special issues on Data Mining), pages 46-50, 1999.]]
[11]
J. Han and J. Pei. Mining Frequent Patterns by Pattern-Growth: Methodology and Implications. ACM SIGKDD Explorations (Special Issue on Scaleble Data Mining Algorithms), December 2000.]]
[12]
J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M.-C. Hsu. FreeSpan: Frequent pattern-projected sequential pattern mining. Proc. of 2000 Int. Conf. on Knowledge Discovery and Data Mining, pages 355-359, August 2000.]]
[13]
J. Hipp, U. Gtintzer, and G. Nakhaeizadeh. Algorithms for association rule mining - a general survey and comparison. SIGKDD Explorations, 2(1):58-64, July 2000.]]
[14]
L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang. Optimization of Constrained Frequent Set Queries with 2-Variable Constraints. Proc. of 1999 ACM-SIGMOD Conf. on Management of Data, pages 157-168, June 1999.]]
[15]
J.-L. Lin and M.H. Dunham. Mining Association Rules: Anti-Skew Algorithms. Proc. of 1998 Int'l Conf. on Data Engineering, pages 486-493, 1998.]]
[16]
J.-S. Park, M.-S. Chen, and P. S. Yu. Using a Hash-Based Method with Transaction Trimming for Mining Association Rules. IEEE l?ansactions on Knowledge and Data Engineering, 9(5):813-825, October 1997.]]
[17]
J. Pei and J. Han. Can We Push More Constraints into Frequent Pattern Mining? Proc. of 2000 Int. Conf. on Knowledge Discovery and Data Mining, August 2000.]]
[18]
J. Pei, J. Han, and L.V.S. Lakshmanan. Mining Frequent Itemsets with Convertible Constraints. Proc. of 2001 Innt. Conf. on Data Engineering, 2001.]]
[19]
A. Savasere, E. Omiecinski, and S. Navathe. An Efficient Algorithm for Mining Association Rules in Large Databases. Proc. of the 21th International Conference on Very Large Data Bases, pages 432-444, September 1995.]]
[20]
R. Srikant and R. Agrawal. Mining Generalized Association Rules. Proc. of the 21th International Conference on Very Large Data Bases, pages 407-419, September 1995.]]
[21]
S. Thomas, S. Bodagala, K. Alsabti, and S. Ranka. An Efficient Algorithm for the Incremental Updation of Association Rules in Large Databases. Proc. of 1997 Ink Conf. on Knowledge Discovery and Data Mining, 1997.]]
[22]
H. Toivonen. Sampling Large Databases for Association Rules. Proc. of the 22th VLDB Conference, pages 134-145, September 1996.]]
[23]
A. K. H. Tung, J. Han, L. V. S. Lakshmanan, and R. T. Ng. Constraint-Based Clustering in Large Databases. Proc. of 2001 Int. Conf. on Database Theory, January 2001.]]
[24]
K. Wang, Y. He, and J. Han. Mining Frequent Itemsets Using Support Constraints. Proc. of 2000 Int. Conf. on Very Large Data Bases, September 2000.]]

Cited By

View all
  • (2025)A Comprehensive Survey of Side-Channel Sound-Sensing MethodsIEEE Internet of Things Journal10.1109/JIOT.2024.350133412:2(1554-1578)Online publication date: 15-Jan-2025
  • (2024)Energy Forecasting: A Comprehensive Review of Techniques and TechnologiesEnergies10.3390/en1707166217:7(1662)Online publication date: 30-Mar-2024
  • (2024)BIZA: Design of Self-Governing Block-Interface ZNS AFA for Endurance and PerformanceProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695953(313-329)Online publication date: 4-Nov-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '01: Proceedings of the tenth international conference on Information and knowledge management
October 2001
616 pages
ISBN:1581134363
DOI:10.1145/502585
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 October 2001

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. association rules
  2. data mining
  3. incremental mining
  4. time-variant database

Qualifiers

  • Article

Conference

CIKM01
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)91
  • Downloads (Last 6 weeks)8
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2025)A Comprehensive Survey of Side-Channel Sound-Sensing MethodsIEEE Internet of Things Journal10.1109/JIOT.2024.350133412:2(1554-1578)Online publication date: 15-Jan-2025
  • (2024)Energy Forecasting: A Comprehensive Review of Techniques and TechnologiesEnergies10.3390/en1707166217:7(1662)Online publication date: 30-Mar-2024
  • (2024)BIZA: Design of Self-Governing Block-Interface ZNS AFA for Endurance and PerformanceProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695953(313-329)Online publication date: 4-Nov-2024
  • (2024)The Adaptation of Concept Drift: A Fit Prediction Algorithm Based on Local OptimumIEEE Transactions on Computational Social Systems10.1109/TCSS.2023.326459411:4(4944-4954)Online publication date: Aug-2024
  • (2024)Hybrid CNN-LSTM Forecasting Model for Electric Vehicle Charging Demand in Smart Buildings2024 6th Global Power, Energy and Communication Conference (GPECOM)10.1109/GPECOM61896.2024.10582629(590-595)Online publication date: 4-Jun-2024
  • (2024)Acoustic User Authentication with Smartphones2024 IEEE International Conference on Microwaves, Communications, Antennas, Biomedical Engineering and Electronic Systems (COMCAS)10.1109/COMCAS58210.2024.10666194(1-5)Online publication date: 9-Jul-2024
  • (2024)Development and challenges of object detection: A surveyNeurocomputing10.1016/j.neucom.2024.128102598(128102)Online publication date: Sep-2024
  • (2024)An Optimization Method Based on Drift Data and Time Series InformationAdvanced Intelligent Computing Technology and Applications10.1007/978-981-97-5581-3_11(130-141)Online publication date: 1-Aug-2024
  • (2023)Credit Card Fraud Detection using Machine Learning and Data Mining Techniques - a Literature SurveyInternational Journal of Applied Engineering and Management Letters10.47992/IJAEML.2581.7000.0186(16-35)Online publication date: 28-Jul-2023
  • (2023)CF-YOLOX: An Autonomous Driving Detection Model for Multi-Scale Object DetectionSensors10.3390/s2308379423:8(3794)Online publication date: 7-Apr-2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media