Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Dynamic count filters

Published: 01 March 2006 Publication History

Abstract

Bloom filters are not able to handle deletes and inserts on multisets over time. This is important in many situations when streamed data evolve rapidly and change patterns frequently. Counting Bloom Filters (CBF) have been proposed to overcome this limitation and allow for the dynamic evolution of Bloom filters. The only dynamic approach to a compact and efficient representation of CBF are the Spectral Bloom Filters (SBF).In this paper we propose the Dynamic Count Filters (DCF) as a new dynamic and space-time efficient representation of CBF. Although DCF does not make a compact use of memory, it shows to be faster and more space efficient than any previous proposal. Results show that the proposed data structure is more efficient independently of the incoming data characteristics.

References

[1]
Burton H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM, 13(7):422--426, 1970.
[2]
L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and zipf-like distributions: Evidence and implications. In Proceedings of the IEEE Infocom Conference, 1999.
[3]
Andrei Broder and Michael Mitzenmacher. Network Applications of Bloom Filters: A survey. A survey. In Proc. of Allerton Conference, 2002.
[4]
Saar Cohen and Yossi Matias. Spectral bloom filters. SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 241--252, 2003.
[5]
Sarang Dharmapurikar, Praveen Krishnamurthy, and David E. Taylor. Longest prefix matching using bloom filters. SIGCOMM '03: Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, pages 201--212, 2003.
[6]
L. Fan, P. Cao, J. Almeida, and A. Z. Broder. Summary cache: a scalable wide-area web cache sharing protocol. IEEE Trans on Networking, 8(3):281--293, 2000.
[7]
Min Fang, Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani, and Jeffrey D. Ullman. Computing Iceberg Queries Efficiently. VLDB '98: Proceedings of the 24rd International Conference on Very Large Data Bases, pages 299--310, 1998.
[8]
Jonathan Ledlie, Laura Serban, and Dafina Toncheva. Scaling filename queries in a large-scale distributed file systems. Research Report TR-03-02, Harvard University, January 2002.
[9]
G. Manku and R. Motwani. Approximate frequency counts over data streams. In Proceedings of the 28th International Conference on Very Large Data Bases, 2002.
[10]
James K. Mullin. A second look at bloom filters. Commun. ACM, 26(8):570--571, 1983.
[11]
M. V. Ramakrishna. Practical performance of Bloom filters and parallel free-text searching. Commun. ACM, 32(10):1237--1239, 1989.
[12]
Wei Wang, Haifeng Jiang, Hongjun Lu, and Jeffrey Xu Yu. Bloom histogram: Path selectivity estimation for xml data with updates. VLDB'04: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pages 240--251, 2004.

Cited By

View all
  • (2024)Ring Sketch:A Generic, Low-Complexity, and Hardware-Friendly Traffic Measurement Framework over Sliding WindowsICC 2024 - IEEE International Conference on Communications10.1109/ICC51166.2024.10622576(1102-1108)Online publication date: 9-Jun-2024
  • (2023)A Sketch Framework for Approximate Data Stream Processing in Sliding WindowsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.315114035:5(4411-4424)Online publication date: 1-May-2023
  • (2022)Pyramid Family: Generic Frameworks for Accurate and Fast Flow Size MeasurementIEEE/ACM Transactions on Networking10.1109/TNET.2021.312008530:2(586-600)Online publication date: Apr-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGMOD Record
ACM SIGMOD Record  Volume 35, Issue 1
March 2006
71 pages
ISSN:0163-5808
DOI:10.1145/1121995
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 2006
Published in SIGMOD Volume 35, Issue 1

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)0
Reflects downloads up to 19 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Ring Sketch:A Generic, Low-Complexity, and Hardware-Friendly Traffic Measurement Framework over Sliding WindowsICC 2024 - IEEE International Conference on Communications10.1109/ICC51166.2024.10622576(1102-1108)Online publication date: 9-Jun-2024
  • (2023)A Sketch Framework for Approximate Data Stream Processing in Sliding WindowsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.315114035:5(4411-4424)Online publication date: 1-May-2023
  • (2022)Pyramid Family: Generic Frameworks for Accurate and Fast Flow Size MeasurementIEEE/ACM Transactions on Networking10.1109/TNET.2021.312008530:2(586-600)Online publication date: Apr-2022
  • (2021)HSS: Faster and More Accurate Sliding Sketch by Using Half Fields2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys)10.1109/HPCC-DSS-SmartCity-DependSys53884.2021.00082(439-447)Online publication date: Dec-2021
  • (2020)Sliding Sketches: A Framework using Time Zones for Data Stream Processing in Sliding WindowsProceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining10.1145/3394486.3403144(1015-1025)Online publication date: 23-Aug-2020
  • (2020)SF-Sketch: A Two-Stage Sketch for Data StreamsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2020.298760931:10(2263-2276)Online publication date: 1-Oct-2020
  • (2020)SA Sketch: A self‐adaption sketch framework for high‐speed networkConcurrency and Computation: Practice and Experience10.1002/cpe.589132:23Online publication date: 16-Jun-2020
  • (2019)Diamond Sketch: Accurate Per-flow Measurement for Big Streaming DataIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2019.292377230:12(2650-2662)Online publication date: 1-Dec-2019
  • (2019)Bloofi Representation for Item/User in Recommender Systems2019 5th International Conference on Web Research (ICWR)10.1109/ICWR.2019.8765261(67-73)Online publication date: Apr-2019
  • (2019)Threshold-Based Widespread Event Detection2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS.2019.00047(399-408)Online publication date: Jul-2019
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media