Abstract
Data streams are unbounded and infinite flows of data arriving at high rates which cannot be stored for offline processing. Because of this, classical approaches for Data Mining cannot be used straightforwardly in data stream scenario. This paper introduces a single-pass hardware-based algorithm for frequent itemsets mining on data streams that uses the top-k frequent 1-itemsets. Experimental results of the hardware implementation of the proposed algorithm are also presented and discussed.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Dataset is referred to databases, unstructured data file, relational databases or any other data source. In this paper, dataset is used to refer data streams.
References
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB 1994, pp. 487–499. Morgan Kaufmann Publishers Inc, San Francisco (1994)
Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Proceedings of the Twenty-first ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2002, pp. 1–16. ACM, New York (2002)
Baker, Z.K., Prasanna, V.K.: Efficient hardware data mining with the Apriori algorithm on FPGAs. In: Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2005, pp. 3–12. IEEE Computer Society, Washington (2005)
Baker, Z.K., Prasanna, V.K.: An architecture for efficient hardware data mining using reconfigurable computing systems. In: 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 2006, FCCM 2006, pp. 67–75, April 2006
Baralis, E., Cerquitelli, T., Chiusano, S., Grand, A., Grimaudo, L.: An efficient itemset mining approach for data streams. In: König, A., Dengel, A., Hinkelmann, K., Kise, K., Howlett, R.J., Jain, L.C. (eds.) KES 2011, Part II. LNCS, vol. 6882, pp. 515–523. Springer, Heidelberg (2011)
Borgelt, C.: Software for frequent pattern mining. http://www.borgelt.net/fpm.html. Accessed on 20 May 2015
Borgelt, C.: Efficient implementations of Apriori and Eclat. In: FIMI. CEUR Workshop Proc., vol. 90. CEUR-WS.org (2003)
Borgelt, C.: An implementation of the FP-growth algorithm. In: Proceedings of the 1st International Workshop on Open Source Data Mining: Frequent Pattern Mining Implementations, OSDM 2005, pp. 1–5. ACM (2005)
Cheng, J., Ke, Y., Ng, W.: A survey on algorithms for mining frequent itemsets over data streams. Knowl. Inf. Syst. 16(1), 1–27 (2008)
Compton, K., Hauck, S.: Reconfigurable computing: A survey of systems and software. ACM Comput. Surv. 34(2), 171–210 (2002)
Cormode, G., Korn, F., Muthukrishnan, S., Srivastava, D.: Finding hierarchical heavy hitters in data streams. In: Proceedings of the 29th International Conference on Very Large Data Bases, VLDB 2003, vol. 29, pp. 464–475. VLDB Endowment (2003)
Corporation, I.B.M.: IBM quest market-basket synthetic data generator. http://www.cs.loyola.edu/cgiannel/assoc_gen.html. Accessed on 20 April 2015
Golab, L., Ozsu, M.: Data stream management issues - a survey. SIGMOD Rec. 32(2), 5–14 (2003)
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, SIGMOD 2000, pp. 1–12. ACM, New York (2000)
Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2001, pp. 97–106. ACM, New York (2001)
Jiang, N., Gruenwald, L.: Research issues in data stream association rule mining. SIGMOD Rec. 35(1), 14–19 (2006)
Jin, R., Agrawal, G.: Frequent pattern mining in data streams. In: Aggarwal, C. (ed.) Data Streams. Advances in Database Systems, vol. 31, pp. 61–84. Springer, US (2007)
Lai, Y.K., Wang, N.C., Chou, T.Y., Lee, C.C., Wellem, T., Nugroho, H.T.: Implementing on-line sketch-based change detection on a NetFPGA platform. In: 1st Asia NetFPGA Developers Workshop (2010)
Lee, W., Stolfo, S.J., Mok, K.W.: Adaptive intrusion detection: A data mining approach. Artif. Intell. Rev. 14(6), 533–567 (2000)
Thanh, L.H., Calders, T.: Mining top-k frequent items in a data stream with flexible sliding windows. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2010, pp. 283–292. ACM, New York (2010)
Lichman, M.: UCI machine learning repository (2013). Accessed on 20 April 2015
Malarvizhi, S.P., Sathiyabhama, B.: Enhanced reconfigurable weighted association rule mining for frequent patterns of web logs. Int. J. Comput. 13(2), 97–105 (2014)
Manku, G.S., Motwani, R.: Approximate frequency counts over data streams. In: Proceedings of the 28th International Conference on Very Large Data Bases, VLDB 2002, pp. 346–357. VLDB Endowment (2002)
Mesa, A., Feregrino-Uribe, C., Cumplido, R., Hernández-Palancar, J.: A highly parallel algorithm for frequent itemset mining. In: Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Kittler, J. (eds.) MCPR 2010. LNCS, vol. 6256, pp. 291–300. Springer, Heidelberg (2010)
Metwally, A., Agrawal, D.P., El Abbadi, A.: Efficient computation of frequent and top-k elements in data streams. In: Eiter, T., Libkin, L. (eds.) ICDT 2005. LNCS, vol. 3363, pp. 398–412. Springer, Heidelberg (2005)
Metwally, A., Agrawal, D., Abbadi, A.E.: An integrated efficient solution for computing frequent and top-k elements in data streams. ACM Trans. Database Syst. 31(3), 1095–1133 (2006)
Pramod, S., Vyas, O.P.: Data stream mining: A review on windowing approach. Global J. Comput. Sci. Technol. 12(11-C) (2012)
Shi, S., Qi, Y., Wang, Q.: Accelerating intersection computation in frequent itemset mining with FPGA. In: 2013 IEEE 10th International Conference on High Performance Computing and Communications 2013, pp. 659–665, November 2013
Shie, B.E., Yu, P.S., Tseng, V.S.: Efficient algorithms for mining maximal high utility itemsets from data streams with different models. Expert Syst. Appl. 39(17), 12947–12960 (2012)
Sun, S., Steffen, M., Zambreno, J.: A reconfigurable platform for frequent pattern mining. In: International Conference on Reconfigurable Computing and FPGAs, 2008, ReConFig 2008, pp. 55–60, December 2008
Sun, S., Zambreno, J.: Mining association rules with systolic trees. In: International Conference on Field Programmable Logic and Applications, 2008, FPL 2008, pp. 143–148, September 2008
Sun, S., Zambreno, J.: Design and analysis of a reconfigurable platform for frequent pattern mining. IEEE Trans. Parallel Distrib. Syst. 22(9), 1497–1505 (2011)
Thoni, D.W., Strey, A.: Novel strategies for hardware acceleration of frequent itemset mining with the Apriori algorithm. In: International Conference on Field Programmable Logic and Applications, 2009, FPL 2009, pp. 489–492, August 2009
Tong, D., Prasanna, V.: Online heavy hitter detector on FPGA. In: 2013 International Conference on Reconfigurable Computing and FPGAs (ReConFig), pp. 1–6. IEEE (2013)
Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2003, pp. 226–235. ACM, New York (2003)
Wen, Y.H., Huang, J.W., Chen, M.S.: Hardware-enhanced association rule mining with hashing and pipelining. IEEE Trans. Knowl. Data Eng. 20(6), 784–795 (2008)
Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)
Zhang, Y., Singh, S., Sen, S., Duffield, N., Lund, C.: Online identification of hierarchical heavy hitters: Algorithms, evaluation, and applications. In: Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, IMC 2004, pp. 101–114. ACM, New York (2004)
Zhang, Y., Zhang, F., Jin, Z., Bakos, J.D.: An FPGA-based accelerator for frequent itemset mining. ACM Trans. Reconfigurable Technol. Syst. 6(1), 2:1–2:17 (2013)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Bustio, L., Cumplido, R., Hernández, R., Bande, J.M., Feregrino, C. (2016). Frequent Itemsets Mining in Data Streams Using Reconfigurable Hardware. In: Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2015. Lecture Notes in Computer Science(), vol 9607. Springer, Cham. https://doi.org/10.1007/978-3-319-39315-5_3
Download citation
DOI: https://doi.org/10.1007/978-3-319-39315-5_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39314-8
Online ISBN: 978-3-319-39315-5
eBook Packages: Computer ScienceComputer Science (R0)