Abstract
Most studies on pattern mining treat itemsets with a high frequency of occurrence as useful, with usefulness typically determined by the support of the itemset. However, recent research has shown the need to move beyond a pure “support-confidence” framework for pattern mining. There has recently been interest in finding statistically significant patterns, and one of the most popular types of such patterns is the self-sufficient itemset. These patterns have a frequency that is significantly different from the frequency of their subsets and supersets. One limitation of existing work is that it does not consider concept drift and cannot be applied to data streams. Learning in an online environment requires efficient and effective mechanisms that address the online characteristics of non-static data and non-stationary data distributions. In our research we concentrate on detecting self-sufficient itemsets from data streams. We present a comprehensive framework for mining self-sufficient itemsets from data streams along with a drift detector. This supports mining self-sufficient itemsets in an online environment and provides the ability to adapt to changes in the stream. Our experimental evaluations show that our framework can mine self-sufficient itemsets faster in an online environment and with better precision and recall.
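To make the notion of a frequency that is "significantly different" more concrete, the following is a minimal sketch, not the framework described in the abstract. It assumes the simplest case, a two-item itemset inside a single window of transactions, and uses a one-sided Fisher exact test as one common choice of significance test. The function names, the example window, and the threshold `alpha` are all hypothetical; the full self-sufficiency check (against larger subsets and against supersets) and the drift detector are not covered here.

```python
from math import comb

def support_count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def fisher_right_tail(n, count_a, count_b, count_ab):
    """One-sided Fisher exact p-value: probability of at least `count_ab`
    co-occurrences if items a and b were independent (hypergeometric tail)."""
    upper = min(count_a, count_b)
    tail = sum(
        comb(count_a, k) * comb(n - count_a, count_b - k)
        for k in range(count_ab, upper + 1)
    )
    return tail / comb(n, count_b)

def pair_is_significant(transactions, a, b, alpha=0.05):
    """Illustrative check only: is the pair {a, b} more frequent than its
    single-item subsets would suggest? `alpha` is an assumed threshold."""
    p = fisher_right_tail(
        len(transactions),
        support_count(transactions, [a]),
        support_count(transactions, [b]),
        support_count(transactions, [a, b]),
    )
    return p <= alpha, p

# Hypothetical window of ten transactions taken from a stream.
window = [
    {"bread", "butter", "milk"}, {"bread", "butter"},
    {"bread", "butter", "milk"}, {"bread", "butter"}, {"bread", "butter"},
    {"tea", "milk"}, {"tea"}, {"tea", "milk"}, {"tea"}, {"tea"},
]

print(pair_is_significant(window, "bread", "butter"))
print(pair_is_significant(window, "bread", "milk"))
```

In this toy window, bread and butter always occur together, so the pair passes the test, while bread and milk co-occur about as often as chance would predict and the pair is rejected. In a streaming setting, such a test would have to be re-evaluated as the window changes, which is where a drift detector becomes relevant.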
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tang, F., Huang, D.T.J., Koh, Y.S., Fournier-Viger, P. (2019). Adaptive Self-Sufficient Itemset Miner for Transactional Data Streams. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_32
DOI: https://doi.org/10.1007/978-3-030-29911-8_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science (R0)