Mining Frequent Itemsets from Sparse Data Streams in Limited Memory Environments

Juan J. Cameron,
Alfredo Cuzzocrea,
Fan Jiang &
Carson K. Leung

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7923)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Floods of data can be produced in many applications, such as Web click streams or wireless sensor networks. Hence, algorithms for mining frequent itemsets from data streams are in demand. Many existing stream mining algorithms capture important streaming data and assume that the captured data can fit into main memory. However, problems arise when the available memory is so limited that this assumption does not hold. In this paper, we present a data structure called DSTable to capture important data from the streams onto the disk. The DSTable can be easily maintained and is applicable for mining frequent itemsets from streams (especially sparse data) in limited memory environments.

Author information

Authors and Affiliations

University of Manitoba, Canada
Juan J. Cameron, Fan Jiang & Carson K. Leung
ICAR-CNR and University of Calabria, Italy
Alfredo Cuzzocrea

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jianyong Wang
Management Science and Information Systems Department, Rutgers, the State University of New Jersey, 1, Washington Park, 07102, Newark, NJ, USA
Hui Xiong
Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Jianliang Xu
School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Junfeng Zhou

About this paper

Cite this paper

Cameron, J.J., Cuzzocrea, A., Jiang, F., Leung, C.K. (2013). Mining Frequent Itemsets from Sparse Data Streams in Limited Memory Environments. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_5

DOI: https://doi.org/10.1007/978-3-642-38562-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38561-2
Online ISBN: 978-3-642-38562-9
eBook Packages: Computer Science, Computer Science (R0)
