Abstract
We study the problem of mining the informative (association) rule set for prediction over data streams. On dense datasets with a low minimum support threshold, generating the informative rule set does not use all of the mined frequent itemsets (FIs). Running an existing algorithm for finding FIs from data streams as the first stage of mining the informative rule set therefore wastes the effort spent on a portion of those FIs. We propose an algorithm that mines the informative rule set directly from data streams over a sliding window. Our experiments show that the algorithm not only attains highly accurate results but also outperforms the two-stage process of mining the informative rule set, which first finds FIs and then generates rules.
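For orientation, the two-stage baseline that the abstract compares against can be sketched as follows. This is a minimal illustration and not the algorithm proposed in the paper. The abstract does not define the informative rule set, so the sketch assumes the usual definition from Li, Shen and Topor: a rule is dropped when a more general rule with the same consequent has confidence at least as high. Rules with an empty antecedent are ignored for simplicity. All names (`SlidingWindowMiner`, `frequent_itemsets`, `informative_rules`, `max_size`) are hypothetical, and stage 1 is a brute-force count rather than a real stream-mining algorithm.

```python
from collections import deque
from itertools import combinations


def frequent_itemsets(window, min_support, max_size=3):
    """Stage 1 (brute force): count every itemset of up to max_size items."""
    counts = {}
    for transaction in window:
        items = sorted(transaction)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    threshold = min_support * len(window)
    return {s: c for s, c in counts.items() if c >= threshold}


def informative_rules(fis, min_conf):
    """Stage 2: build single-consequent rules, then drop every rule that has
    a more general rule (proper-subset antecedent, same consequent) whose
    confidence is at least as high. Assumed definition, see lead-in."""
    rules = {}
    for itemset, count in fis.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # The antecedent is a subset of a frequent itemset, so it is in fis.
            conf = count / fis[antecedent]
            if conf >= min_conf:
                rules[(antecedent, consequent)] = conf
    kept = {}
    for (ant, cons), conf in rules.items():
        dominated = any(
            c2 == cons and a2 < ant and conf2 >= conf
            for (a2, c2), conf2 in rules.items()
        )
        if not dominated:
            kept[(ant, cons)] = conf
    return kept


class SlidingWindowMiner:
    """Keeps the most recent `size` transactions and re-mines on demand."""

    def __init__(self, size, min_support, min_conf):
        self.window = deque(maxlen=size)  # oldest transaction falls out
        self.min_support = min_support
        self.min_conf = min_conf

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def rules(self):
        fis = frequent_itemsets(self.window, self.min_support)
        return informative_rules(fis, self.min_conf)


miner = SlidingWindowMiner(size=4, min_support=0.5, min_conf=0.6)
for t in [{"x"}, {"a", "b", "c"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]:
    miner.add(t)  # the fifth arrival pushes {"x"} out of the window
rules = miner.rules()
```

In this toy window every three-item rule is pruned because a one-item antecedent already predicts the same consequent with equal confidence, so the FIs that produced those rules were mined for nothing. That wasted work in stage 1 is what the abstract gives as the motivation for mining the rule set directly.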
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nhan, N.D., Hung, N.T., Bac, L.H. (2010). Mining Informative Rule Set for Prediction over a Sliding Window. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds) Intelligent Information and Database Systems. ACIIDS 2010. Lecture Notes in Computer Science, vol 5991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12101-2_44
DOI: https://doi.org/10.1007/978-3-642-12101-2_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12100-5
Online ISBN: 978-3-642-12101-2
eBook Packages: Computer Science, Computer Science (R0)