Nothing Special   »   [go: up one dir, main page]

Zhao et al., 2023 - Google Patents

Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items

Zhao et al., 2023

View PDF
Document ID
10260680745650505775
Author
Zhao Y
Han W
Zhong Z
Zhang Y
Yang T
Cui B
Publication year
Publication venue
Proceedings of the ACM on Management of Data

External Links

Snippet

Finding top-K frequent items has been a hot topic in data stream processing in recent years, which has a wide range of applications. However, most of existing sketch algorithms focuses on finding local top-K in a single data stream. In this paper, we work on finding global top-K …
Continue reading at yindazhang.github.io (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30533Other types of queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30477Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/12Replacement control
    • G06F12/121Replacement control using replacement algorithms
    • G06F12/122Replacement control using replacement algorithms of the least frequently used [LFU] type, e.g. with individual count value
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30312Storage and indexing structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705Clustering or classification
    • G06F17/3071Clustering or classification including class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • G06F17/30864Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled

Similar Documents

Publication Publication Date Title
Li et al. Wavingsketch: An unbiased and generic sketch for finding top-k items in data streams
Harmouch et al. Cardinality estimation: An experimental survey
Dimitropoulos et al. Probabilistic lossy counting: An efficient algorithm for finding heavy hitters
Zhong et al. Burstsketch: Finding bursts in data streams
Shin et al. D-cube: Dense-block detection in terabyte-scale tensors
Zhao et al. Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items
Zhang et al. On-off sketch: A fast and accurate sketch on persistence
Liu et al. Methods for mining frequent items in data streams: an overview
Chen et al. Mining frequent items in data stream using time fading model
Han et al. Applications of sketches in network traffic measurement: A survey
Chen et al. Out of many we are one: Measuring item batch with clock-sketch
Jia et al. Loglog filter: Filtering cold items within a large range over high speed data streams
Qi et al. Cuckoo counter: A novel framework for accurate per-flow frequency estimation in network measurement
Shi et al. Cuckoo counter: Adaptive structure of counters for accurate frequency and top-k estimation
Wang et al. Joinsketch: A sketch algorithm for accurate and unbiased inner-product estimation
Li et al. Ladderfilter: Filtering infrequent items with small memory and time overhead
Zhang et al. CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models
Basat et al. Efficient summing over sliding windows
Cheng et al. LTC: a fast algorithm to accurately find significant items in data streams
Wu et al. MicroscopeSketch: Accurate Sliding Estimation Using Adaptive Zooming
Liu et al. XY-sketch: On sketching data streams at web scale
Li et al. Tight-sketch: A high-performance sketch for heavy item-oriented data stream mining with limited memory size
Dutta et al. Towards" intelligent compression" in streams: a biased reservoir sampling based bloom filter approach
Wang et al. Real-time Spread Burst Detection in Data Streaming
Xiao et al. Accurate and o (1)-time query of per-flow cardinality in high-speed networks