research-article

Public Access

Sketching Linear Classifiers over Data Streams

Authors:

Gregory ValiantAuthors Info & Claims

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data

Pages 757 - 772

https://doi.org/10.1145/3183713.3196930

Published: 27 May 2018 Publication History

Abstract

We introduce a new sub-linear space sketch---the Weight-Median Sketch---for learning compressed linear classifiers over data streams while supporting the efficient recovery of large-magnitude weights in the model. This enables memory-limited execution of several statistical analyses over streams, including online feature selection, streaming data explanation, relative deltoid detection, and streaming estimation of pointwise mutual information. Unlike related sketches that capture the most frequently-occurring features (or items) in a data stream, the Weight-Median Sketch captures the features that are most discriminative of one stream (or class) compared to another. The Weight-Median Sketch adopts the core data structure used in the Count-Sketch, but, instead of sketching counts, it captures sketched gradient updates to the model parameters. We provide a theoretical analysis that establishes recovery guarantees for batch and online learning, and demonstrate empirical improvements in memory-accuracy trade-offs over alternative memory-budgeted methods, including count-based sketches and feature hashing.

References

[1]

Dimitris Achlioptas. 2003. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of computer and System Sciences Vol. 66, 4 (2003), 671--687.

Digital Library

[2]

Jimmy Ba and Rich Caruana. 2014. Do deep nets really need to be deep?. In Advances in neural information processing systems. 2654--2662.

Digital Library

[3]

Peter Bailis, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong, and Sahaana Suri. 2017. Macrobase: Prioritizing attention in fast data. In Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 541--556.

Digital Library

[4]

Nagender Bandi, Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. 2007. Fast data stream algorithms using associative memories Proceedings of the 2007 ACM SIGMOD international conference on Management of data. ACM, 247--256.

Digital Library

[5]

Avrim Blum, Adam Kalai, and John Langford. 1999. Beating the hold-out: Bounds for k-fold and progressive cross-validation Proceedings of the twelfth annual conference on Computational learning theory. ACM, 203--208.

Digital Library

[6]

Oscar Boykin, Sam Ritchie, Ian O'Connell, and Jimmy Lin. 2014. Summingbird: A framework for integrating batch and online mapreduce computations. Proceedings of the VLDB Endowment Vol. 7, 13 (2014), 1441--1451.

Digital Library

[7]

Daniela Brauckhoff, Xenofontas Dimitropoulos, Arno Wagner, and Kavè Salamatian. 2012. Anomaly extraction in backbone networks using association rules. IEEE/ACM Transactions on Networking (TON) Vol. 20, 6 (2012), 1788--1799.

Digital Library

[8]

Cristian Buciluv a, Rich Caruana, and Alexandru Niculescu-Mizil. 2006. Model compression Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 535--541.

Digital Library

[9]

Robert Calderbank, Sina Jafarpour, and Robert Schapire . {n. d.}. Compressed Learning: Universal Sparse Dimensionality Reduction and Learning in the Measurement Domain. (.{n. d.}).

[10]

Moses Charikar, Kevin Chen, and Martin Farach-Colton. 2002. Finding frequent items in data streams. Automata, languages and programming (2002), 784--784.

Digital Library

[11]

Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005 (2013).

[12]

Jeffrey Considine, Feifei Li, George Kollios, and John Byers. 2004. Approximate aggregation techniques for sensor databases Data Engineering, 2004. Proceedings. 20th International Conference on. IEEE, 449--460.

Digital Library

[13]

Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. 2017. Algorithmic decision making and the cost of fairness. arXiv preprint arXiv:1701.08230 (2017).

Digital Library

[14]

Graham Cormode and Marios Hadjieleftheriou. 2008. Finding frequent items in data streams. Proceedings of the VLDB Endowment Vol. 1, 2 (2008), 1530--1541.

Digital Library

[15]

Graham Cormode and Shan Muthukrishnan. 2005 a. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms Vol. 55, 1 (2005), 58--75.

Digital Library

[16]

Graham Cormode and S Muthukrishnan. 2005 b. What's new: Finding significant differences in network data streams. IEEE/ACM Transactions on Networking (TON) Vol. 13, 6 (2005), 1219--1232.

Digital Library

[17]

Koby Crammer, Jaz Kandola, and Yoram Singer. 2004. Online classification on a budget. In Advances in neural information processing systems. 225--232.

Digital Library

[18]

Alberto Dainotti, Antonio Pescape, and Kimberly C Claffy. 2012. Issues and future directions in traffic classification. IEEE network, Vol. 26, 1 (2012).

Digital Library

[19]

Ofer Dekel, Shai Shalev-Shwartz, and Yoram Singer. 2006. The Forgetron: A kernel-based perceptron on a fixed budget Advances in neural information processing systems. 259--266.

Digital Library

[20]

Erik D Demaine, Alejandro López-Ortiz, and J Ian Munro. 2002. Frequency estimation of internet packet streams with limited space European Symposium on Algorithms. Springer, 348--360.

Digital Library

[21]

John Duchi and Yoram Singer. 2009. Efficient online and batch learning using forward backward splitting. Journal of Machine Learning Research Vol. 10, Dec (2009), 2899--2934.

Digital Library

[22]

Benjamin V Durme and Ashwin Lall. 2009. Streaming pointwise mutual information. In Advances in Neural Information Processing Systems. 1892--1900.

Digital Library

[23]

Pavlos S Efraimidis and Paul G Spirakis. 2006. Weighted random sampling with a reservoir. Inform. Process. Lett. Vol. 97, 5 (2006), 181--185.

Digital Library

[24]

Philippe Flajolet. 1985. Approximate counting: a detailed analysis. BIT Numerical Mathematics Vol. 25, 1 (1985), 113--134.

Digital Library

[25]

Daniel Golovin, D Sculley, Brendan McMahan, and Michael Young. 2013. Large-scale learning with less ram via randomization Proceedings of the 30th International Conference on Machine Learning (ICML-13). 325--333.

Digital Library

[26]

Michael Greenwald and Sanjeev Khanna. 2001. Space-efficient online computation of quantile summaries ACM SIGMOD Record, Vol. Vol. 30. ACM, 58--66.

Digital Library

[27]

Chirag Gupta, Arun Sai Suggala, Ankit Goyal, Harsha Vardhan Simhadri, Bhargavi Paranjape, Ashish Kumar, Saurabh Goyal, Raghavendra Udupa, Manik Varma, and Prateek Jain. 2017. ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices International Conference on Machine Learning. 1331--1340.

[28]

Michael Gutmann and Aapo Hyvämodels Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. 297--304.

[29]

Elad Hazan et almbox. 2016. Introduction to online convex optimization. Foundations and Trends® in Optimization, Vol. 2, 3--4 (2016), 157--325.

Digital Library

[30]

Elad Hazan, Amit Agarwal, and Satyen Kale. 2007. Logarithmic regret algorithms for online convex optimization. Machine Learning, Vol. 69, 2 (2007), 169--192.

Digital Library

[31]

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).

[32]

Steven CH Hoi, Jialei Wang, Peilin Zhao, and Rong Jin. 2012. Online feature selection for mining big data. In Proceedings of the 1st international workshop on big data, streams and heterogeneous source mining: Algorithms, systems, programming models and applications. ACM, 93--100.

Digital Library

[33]

William B Johnson and Joram Lindenstrauss. 1984. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics Vol. 26, 189--206 (1984), 1.

[34]

Nobuhiro Kaji and Hayato Kobayashi. 2017. Incremental skip-gram model with negative sampling. arXiv preprint arXiv:1704.03956 (2017).

[35]

Daniel M Kane and Jelani Nelson. 2014. Sparser Johnson-Lindenstrauss transforms. Journal of the ACM (JACM) Vol. 61, 1 (2014), 4.

Digital Library

[36]

Ashish Kapoor, Simon Baker, Sumit Basu, and Eric Horvitz. 2012. Memory constrained face recognition. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2539--2546.

Digital Library

[37]

Richard M Karp, Scott Shenker, and Christos H Papadimitriou. 2003. A simple algorithm for finding frequent elements in streams and bags. ACM Transactions on Database Systems (TODS), Vol. 28, 1 (2003), 51--55.

Digital Library

[38]

Jakub Konevcnỳ, Brendan McMahan, and Daniel Ramage. 2015. Federated optimization: Distributed optimization beyond the datacenter. arXiv preprint arXiv:1511.03575 (2015).

[39]

Ashish Kumar, Saurabh Goyal, and Manik Varma. 2017. Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things International Conference on Machine Learning. 1935--1944.

[40]

John Langford, Lihong Li, and Tong Zhang. 2009. Sparse online learning via truncated gradient. Journal of Machine Learning Research Vol. 10, Mar (2009), 777--801.

Digital Library

[41]

Kasper Green Larsen, Jelani Nelson, Huy L Nguyên, and Mikkel Thorup. 2016. Heavy hitters via cluster-preserving clustering. Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on. IEEE, 61--70.

[42]

Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization Advances in neural information processing systems. 2177--2185.

Digital Library

[43]

David D Lewis, Yiming Yang, Tony G Rose, and Fan Li. 2004. RCV1: A new benchmark collection for text categorization research. Journal of machine learning research Vol. 5, Apr (2004), 361--397.

Digital Library

[44]

Brent Longstaff, Sasank Reddy, and Deborah Estrin. 2010. Improving activity classification for health applications on mobile devices using active and semi-supervised learning. In International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth). IEEE.

[45]

Ge Luo, Lu Wang, Ke Yi, and Graham Cormode. 2016. Quantiles over data streams: experimental comparisons, new analyses, and further improvements. The VLDB Journal, Vol. 25, 4 (2016), 449--472.

Digital Library

[46]

Justin Ma, Lawrence K Saul, Stefan Savage, and Geoffrey M Voelker. 2009. Identifying suspicious URLs: an application of large-scale online learning Proceedings of the 26th annual international conference on machine learning. ACM, 681--688.

Digital Library

[47]

Gurmeet Singh Manku and Rajeev Motwani. 2002. Approximate frequency counts over data streams. In Proceedings of the 28th international conference on Very Large Data Bases. VLDB Endowment, 346--357.

Digital Library

[48]

Chandler May, Kevin Duh, Benjamin Van Durme, and Ashwin Lall. 2017. Streaming Word Embeddings with the Space-Saving Algorithm. arXiv preprint arXiv:1704.07463 (2017).

[49]

Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, Montse Gonzalez Arenas, Kanishka Rao, David Rybach, Ouais Alsharif, Hacsim Sak, Alexander Gruenstein, Franccoise Beaufays, et almbox. 2016. Personalized speech recognition on mobile devices. Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 5955--5959.

Digital Library

[50]

H Brendan McMahan, Gary Holt, David Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, et almbox. 2013. Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1222--1230.

Digital Library

[51]

Alexandra Meliou, Sudeepa Roy, and Dan Suciu. 2014. Causality and explanations in databases. In VLDB.

Digital Library

[52]

Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. 2005. Efficient computation of frequent and top-k elements in data streams International Conference on Database Theory. Springer, 398--412.

Digital Library

[53]

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).

[54]

Katsiaryna Mirylenka, Graham Cormode, Themis Palpanas, and Divesh Srivastava . 2015. Conditional heavy hitters: detecting interesting correlations in data streams. The VLDB Journal, Vol. 24, 3 (2015), 395--414.

Digital Library

[55]

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should i trust you?: Explaining the predictions of any classifier Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135--1144.

Digital Library

[56]

Pratanu Roy, Arijit Khan, and Gustavo Alonso. 2016. Augmented sketch: Faster and more accurate stream processing Proceedings of the 2016 International Conference on Management of Data. ACM, 1449--1463.

Digital Library

[57]

Robert Schweller, Ashish Gupta, Elliot Parsons, and Yan Chen. 2004. Reversible sketches for efficient and accurate change detection over network data streams Proceedings of the 4th ACM SIGCOMM conference on Internet measurement. ACM, 207--212.

Digital Library

[58]

Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter . 2011. Pegasos: Primal estimated sub-gradient solver for svm. Mathematical programming Vol. 127, 1 (2011), 3--30.

Digital Library

[59]

Ohad Shamir. 2016. Without-Replacement Sampling for Stochastic Gradient Methods. Advances in Neural Information Processing Systems 29, bibfieldeditorD. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 46--54. http://papers.nips.cc/paper/6245-without-replacement-sampling-for-stochastic-gradient-methods.pdf

Digital Library

[60]

Qinfeng Shi, James Petterson, Gideon Dror, John Langford, Alexander L Strehl, Alex J Smola, and SVN Vishwanathan. 2009. Hash kernels International Conference on Artificial Intelligence and Statistics. 496--503.

[61]

Nisheeth Shrivastava, Chiranjeeb Buragohain, Divyakant Agrawal, and Subhash Suri . 2004. Medians and beyond: new aggregation techniques for sensor networks Proceedings of the 2nd international conference on Embedded networked sensor systems. ACM, 239--249.

Digital Library

[62]

Virginia Smith, Chao-Kai Chiang, Maziar Sanjabi, and Ameet S Talwalkar. 2017. Federated Multi-Task Learning. In Advances in Neural Information Processing Systems. 4427--4437.

[63]

J. Stamper, A. Niculescu-Mizil, S. Ritter, G.J. Gordon, and K.R. Koedinger. 2010. Algebra I 2008--2009. Challenge data set from KDD Cup 2010 Educational Data Mining Challenge. (2010). Find it at http://pslcdatashop.web.cmu.edu/KDDCup/downloads.jsp.

[64]

Jacob Steinhardt and John Duchi. 2015. Minimax rates for memory-bounded sparse linear regression Conference on Learning Theory. 1564--1587.

[65]

Peter D Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of artificial intelligence research Vol. 37 (2010), 141--188.

[66]

CAIDA UCSD. 2008. The CAIDA UCSD Anonymized Passive OC48 Internet Traces Dataset. (2008). http://www.caida.org/data/passive/passive_oc48_dataset.xml

[67]

Balajee Vamanan, Gwendolyn Voskuilen, and TN Vijaykumar. 2010. EffiCuts: optimizing packet classification for memory and throughput ACM SIGCOMM Computer Communication Review, Vol. Vol. 40. ACM, 207--218.

Digital Library

[68]

Shoba Venkataraman, Dawn Song, Phillip B Gibbons, and Avrim Blum. 2005. New streaming algorithms for fast detection of superspreaders. Department of Electrical and Computing Engineering (2005), 6.

[69]

Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 1113--1120.

Digital Library

[70]

Eugene Wu and Samuel Madden. 2013. Scorpion: Explaining away outliers in aggregate queries. Proceedings of the VLDB Endowment Vol. 6, 8 (2013), 553--564.

Digital Library

[71]

Lin Xiao. 2010. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research Vol. 11, Oct (2010), 2543--2596.

Digital Library

[72]

Tianbao Yang, Lijun Zhang, Rong Jin, and Shenghuo Zhu. 2015. Theory of dual-sparse regularized randomized reduction International Conference on Machine Learning. 305--314.

Digital Library

[73]

Hsiang-Fu Yu, Hung-Yi Lo, Hsun-Ping Hsieh, Jing-Kai Lou, Todd G McKenzie, Jung-Wei Chou, Po-Han Chung, Chia-Hua Ho, Chun-Fu Chang, Yin-Hsuan Wei, et almbox. 2010. Feature engineering and classifier ensemble for KDD cup 2010 KDD Cup.

[74]

Minlan Yu, Lavanya Jose, and Rui Miao. 2013. Software Defined Traffic Measurement with OpenSketch. NSDI, Vol. Vol. 13. 29--42.

Digital Library

[75]

Tong Yu, Yong Zhuang, Ole J Mengshoel, and Osman Yagan. 2016. Hybridizing personal and impersonal machine learning models for activity recognition on mobile devices.

[76]

Ce Zhang, Arun Kumar, and Christopher Ré. 2016. Materialization optimizations for feature selection workloads. ACM Transactions on Database Systems (TODS), Vol. 41, 1 (2016), 2.

Digital Library

[77]

Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, and Shenghuo Zhu. 2014. Random projections for classification: A recovery approach. IEEE Transactions on Information Theory Vol. 60, 11 (2014), 7300--7316.

[78]

Martin Zinkevich. 2003. Online convex programming and generalized infinitesimal gradient ascent Proceedings of the 20th International Conference on Machine Learning (ICML-03). 928--936.

Digital Library

[79]

Hui Zou and Trevor Hastie. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 67, 2 (2005), 301--320.

Cited By

An SCao Y(2024)Relative Keys: Putting Feature Explanation into ContextProceedings of the ACM on Management of Data10.1145/36392632:1(1-28)Online publication date: 26-Mar-2024
https://doi.org/10.1145/3639263
Cao YFeng YWang HXie XZhou S(2024)Learning to Sketch: A Neural Approach to Item Frequency Estimation in Streaming DataIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.338858946:11(7136-7153)Online publication date: Nov-2024
https://doi.org/10.1109/TPAMI.2024.3388589
Wu YJiang SXu YDong SYang KChen PYang T(2024)Unbiased Real-Time Traffic SketchingIEEE Transactions on Network Science and Engineering10.1109/TNSE.2023.328400411:3(2371-2383)Online publication date: May-2024
https://doi.org/10.1109/TNSE.2023.3284004
Show More Cited By

Index Terms

Sketching Linear Classifiers over Data Streams
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning approaches
      1. Learning linear models
2. Theory of computation
  1. Design and analysis of algorithms
    1. Online algorithms
      1. Online learning algorithms
    2. Streaming, sublinear and near linear time algorithms
      1. Sketching and sampling

Recommendations

Sketching user experiences tutorial: stories, strategies, surfaces
ITS '13: Proceedings of the 2013 ACM international conference on Interactive tabletops and surfaces

Paper-pencil sketches are a valuable tool during different stages of experience design in human-computer interaction. This hands-on tutorial will demonstrate how to integrate sketching into researchers' and interaction designers' everyday practice -- ...
Hands-on sketching course
CHI EA '14: CHI '14 Extended Abstracts on Human Factors in Computing Systems

Sketching as a technique to quickly draw something on a piece of paper is used to explore and communicate ideas. This course introduces basic sketching techniques and a visual language which participants can apply in research and practice. It is a hands-...
Data Streams with Bounded Deletions
PODS '18: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems

Two prevalent models in the data stream literature are the insertion-only and turnstile models. Unfortunately, many important streaming problems require a Θ(log(n)) multiplicative factor more space for turnstile streams than for insertion-only streams. ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data

May 2018

1874 pages

ISBN:9781450347037

DOI:10.1145/3183713

General Chairs:
Gautam Das
University of Texas at Arlington, USA
,
Christopher Jermaine
Rice University, USA
,
Philip Bernstein
Microsoft Research, USA

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 May 2018

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Conference

SIGMOD/PODS '18

Sponsor:

SIGMOD

SIGMOD/PODS '18: International Conference on Management of Data

June 10 - 15, 2018

TX, Houston, USA

Acceptance Rates

SIGMOD '18 Paper Acceptance Rate 90 of 461 submissions, 20%;

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

28
Total Citations
View Citations
1,305
Total Downloads

Downloads (Last 12 months)155
Downloads (Last 6 weeks)38

Reflects downloads up to 12 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

An SCao Y(2024)Relative Keys: Putting Feature Explanation into ContextProceedings of the ACM on Management of Data10.1145/36392632:1(1-28)Online publication date: 26-Mar-2024
https://doi.org/10.1145/3639263
Cao YFeng YWang HXie XZhou S(2024)Learning to Sketch: A Neural Approach to Item Frequency Estimation in Streaming DataIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.338858946:11(7136-7153)Online publication date: Nov-2024
https://doi.org/10.1109/TPAMI.2024.3388589
Wu YJiang SXu YDong SYang KChen PYang T(2024)Unbiased Real-Time Traffic SketchingIEEE Transactions on Network Science and Engineering10.1109/TNSE.2023.328400411:3(2371-2383)Online publication date: May-2024
https://doi.org/10.1109/TNSE.2023.3284004
Chen Z(2024)Robust Sparse Online Learning through Adversarial Sparsity Constraints2024 9th IEEE International Conference on Smart Cloud (SmartCloud)10.1109/SmartCloud62736.2024.00015(42-47)Online publication date: 10-May-2024
https://doi.org/10.1109/SmartCloud62736.2024.00015
Monath NZaheer MMcCallum ASingh ASun YAkoglu LGunopulos DYan XKumar ROzcan FYe J(2023)Online Level-wise Hierarchical ClusteringProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599455(1733-1745)Online publication date: 6-Aug-2023
https://dl.acm.org/doi/10.1145/3580305.3599455
Wu YJiang SDong SZhong ZChen JHu YYang TUhlig SCui BSingh ASun YAkoglu LGunopulos DYan XKumar ROzcan FYe J(2023)MicroscopeSketch: Accurate Sliding Estimation Using Adaptive ZoomingProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599432(2660-2671)Online publication date: 6-Aug-2023
https://dl.acm.org/doi/10.1145/3580305.3599432
Miao RZhong ZGuo JLi ZYang TCui B(2023)BurstSketch: Finding Bursts in Data StreamsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.322368635:11(11126-11140)Online publication date: 1-Nov-2023
https://doi.org/10.1109/TKDE.2022.3223686
Bagher KLai S(2023)SGX-Stream: A Secure Stream Analytics Framework In SGX-enabled Edge CloudJournal of Information Security and Applications10.1016/j.jisa.2022.10340372(103403)Online publication date: Feb-2023
https://doi.org/10.1016/j.jisa.2022.103403
Zhang MLin SYin L(2023)Local differentially private frequency estimation based on learned sketchesInformation Sciences10.1016/j.ins.2023.119667649(119667)Online publication date: Nov-2023
https://doi.org/10.1016/j.ins.2023.119667
Srinivas VWoodruff DXu ZZhou SLeonardi SGupta A(2022)Memory bounds for the experts problemProceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing10.1145/3519935.3520069(1158-1171)Online publication date: 9-Jun-2022
https://dl.acm.org/doi/10.1145/3519935.3520069
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents