Adaptive Self-Sufficient Itemset Miner for Transactional Data Streams

Feiyang Tang, David Tse Jung Huang, Yun Sing Koh & Philippe Fournier-Viger

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

Most studies on pattern mining treat itemsets with a high frequency of occurrence, usually measured by their support, as useful. However, current research has shown that we need to move beyond a pure “support-confidence” framework for pattern mining. Recently, there has been interest in finding statistically significant patterns, and one of the most popular types of such patterns is the self-sufficient itemset. One limitation is that existing work on these patterns does not consider concept drift and cannot be applied to a data stream. Learning in an online environment requires efficient and effective mechanisms that address the online characteristics of non-static data and non-stationary data distributions. In our research we concentrate on detecting self-sufficient itemsets in data streams. These patterns have a frequency that is significantly different from the frequency of their subsets and supersets. We present a comprehensive framework for mining self-sufficient itemsets from data streams, together with a drift detector. The framework supports mining self-sufficient itemsets in an online environment and can adapt to changes in the stream. Our experimental evaluations show that our framework can mine self-sufficient itemsets faster in an online environment and with better precision and recall.
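To make the idea of a frequency that differs significantly from that of an itemset's subsets more concrete, the following is a minimal sketch of one ingredient often associated with self-sufficient itemsets: a productivity check that compares the support of an itemset with what independence between its parts would predict, using a one-sided Fisher exact test. It is an illustration under stated assumptions, not the framework described in the abstract: it works on a fixed batch of transactions rather than a stream, it has no drift detector, and the function names (`support_count`, `fisher_right_tail`, `is_productive`) and the `alpha` threshold are hypothetical.

```python
from itertools import combinations
from math import comb


def support_count(transactions, itemset):
    """Number of transactions (frozensets) that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def fisher_right_tail(both, left, right, n):
    """One-sided Fisher exact p-value: P(overlap >= both) under independence.

    `left` and `right` are the support counts of the two parts, `both` is the
    count of transactions containing both parts, and `n` is the total count.
    """
    upper = min(left, right)
    total = comb(n, right)
    tail = sum(comb(left, k) * comb(n - left, right - k)
               for k in range(both, upper + 1))
    return tail / total


def is_productive(transactions, itemset, alpha=0.05):
    """True if every two-way split of `itemset` co-occurs more often than chance.

    Assumption: `alpha` is an illustrative significance level with no
    correction for multiple testing.
    """
    items = sorted(itemset)
    n = len(transactions)
    both = support_count(transactions, items)
    # Enumerate one side of each split; the other side is the remainder.
    for size in range(1, len(items) // 2 + 1):
        for part in combinations(items, size):
            rest = [i for i in items if i not in part]
            left = support_count(transactions, part)
            right = support_count(transactions, rest)
            if fisher_right_tail(both, left, right, n) > alpha:
                return False
    return True
```

For example, if items `a` and `b` always appear together in half of the transactions, `is_productive` accepts `{a, b}`, whereas if they co-occur exactly as often as independence predicts, it rejects the itemset. A streaming variant would have to maintain these counts incrementally and re-evaluate them when a drift is signalled, which this sketch does not attempt.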

Author information

Authors and Affiliations

School of Computer Science, The University of Auckland, Auckland, New Zealand
Feiyang Tang, David Tse Jung Huang & Yun Sing Koh
School of Humanities and Social Sciences, Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger

Corresponding authors

Correspondence to Feiyang Tang or David Tse Jung Huang.

Editor information

Editors and Affiliations

Department of Computing, Macquarie University, Sydney, NSW, Australia
Abhaya C. Nayak
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Alok Sharma

About this paper

Cite this paper

Tang, F., Huang, D.T.J., Koh, Y.S., Fournier-Viger, P. (2019). Adaptive Self-Sufficient Itemset Miner for Transactional Data Streams. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_32

DOI: https://doi.org/10.1007/978-3-030-29911-8_32
Published: 23 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science, Computer Science (R0)
