Abstract
False-negative frequent items mining from a high-speed transactional data stream finds an approximate set of frequent items with respect to a minimum support threshold, s, and controls the possibility of missing frequent items using a reliability parameter, δ. Its importance is that it excludes false positives and can therefore significantly reduce the memory consumption of frequent itemsets mining. The key issue is how to minimize the possibility of missing frequent items. In this paper, we propose a new false-negative frequent items mining algorithm, called Loss-Negative, for handling bursting in data streams. The new algorithm consumes the least memory in comparison with other false-negative and false-positive frequent items algorithms. We present a theoretical bound for the new algorithm and analyze how it minimizes the possibility of missing frequent items in terms of two possibilities, namely in-possibility and out-possibility. The former concerns how a frequent item can pass the first pruning. The latter concerns how long a frequent item can stay in memory while no occurrences of the item arrive in the subsequent data stream for a certain period. The proposed algorithm is superior to the existing false-negative frequent items mining algorithms in terms of both possibilities. We demonstrate the effectiveness of the new algorithm in this paper.
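To make the terminology concrete, the following minimal sketch illustrates what "false positive" and "false negative" mean relative to a minimum support threshold s. It is not the Loss-Negative algorithm, whose details are not given in the abstract. The function names `exact_frequent_items` and `classify_errors`, the toy stream, and the reading of "frequent" as "count at least s times the stream length" are all assumptions made for illustration.

```python
from collections import Counter


def exact_frequent_items(stream, s):
    """Return the items whose count is at least s * len(stream).

    This exact computation stores every distinct item, which is what
    approximate stream algorithms are designed to avoid; it serves here
    only as ground truth for the illustration.
    """
    counts = Counter(stream)
    n = len(stream)
    return {item for item, count in counts.items() if count >= s * n}


def classify_errors(reported, stream, s):
    """Split the errors of a reported item set into two kinds.

    false positives: reported items that are not truly frequent
    false negatives: truly frequent items that were not reported
    """
    truth = exact_frequent_items(stream, s)
    return reported - truth, truth - reported


# Toy stream (assumed data): a occurs 4 times, b 3, c 2, d 1; n = 10.
stream = list("aaaabbbccd")
s = 0.25  # minimum support threshold, so the cutoff is 2.5 occurrences

# A hypothetical false-negative style result: it reports only truly
# frequent items but is allowed to miss some of them.
reported = {"a"}
false_positives, false_negatives = classify_errors(reported, stream, s)
print(false_positives, false_negatives)
```

In this toy case the reported set contains no false positives and misses the frequent item `b`, which is the kind of error a false-negative approach tolerates and that the reliability parameter δ is meant to keep unlikely.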
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chong, Z., Yu, J.X., Lu, H., Zhang, Z., Zhou, A. (2005). False-Negative Frequent Items Mining from Data Streams with Bursting. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_38
DOI: https://doi.org/10.1007/11408079_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science, Computer Science (R0)