Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3373718.3394747acmconferencesArticle/Chapter ViewAbstractPublication PageslicsConference Proceedingsconference-collections
research-article

Space-efficient Query Evaluation over Probabilistic Event Streams

Published: 08 July 2020 Publication History

Abstract

Real-time decision making in IoT applications relies upon space-efficient evaluation of queries over streaming data. To model the uncertainty in the classification of data being processed, we consider the model of probabilistic strings --- sequences of discrete probability distributions over a finite set of events, and initiate the study of space complexity of streaming computation for different classes of queries over such probabilistic strings.
We first consider the problem of computing the probability that a word, sampled from the distribution defined by the probabilistic string read so far, is accepted by a given deterministic finite automaton. We show that this regular pattern matching problem can be solved using space that is only poly-logarithmic in the string length (and polynomial in the size of the DFA) if we are allowed a multiplicative approximation error. Then we show how to generalize this result to quantitative queries specified by additive cost register automata --- these are automata that map strings to numerical values using finite control and registers that get updated using linear transformations. Finally, we consider the case when updates in such an automaton involve tests, and in particular, when there is a counter variable that can be either incremented or decremented but decrements only apply when the counter value is non-zero. In this case, the desired answer depends on the probability distribution over the set of possible counter values that can range from 0 to n for a string of length n. Under a mild assumption, namely probabilities of the individual events are bounded away from 0 and 1, we show that there is an algorithm that can compute all n entries of this probability distribution vector to within additive 1/poly(n) error using space that is only Õ(n). In establishing these results, we introduce several new technical ideas that may prove useful for designing space-efficient algorithms for other query models over probabilistic strings.

References

[1]
Noga Alon, Yossi Matias, and Mario Szegedy. 1999. The Space Complexity of Approximating the Frequency Moments. J. Comput. Syst. Sci. 58, 1 (1999), 137--147.
[2]
Rajeev Alur, Loris D'Antoni, Jyotirmoy Deshmukh, Mukund Raghothaman, and Yifei Yuan. 2013. Regular Functions and Cost Register Automata. In Proceedings of the 28th Annual ACM/IEEE Symposium on Logic in Computer Science. 13--22.
[3]
Kevin Borders, Jonathan Springer, and Matthew Burnside. 2012. Chimera: A Declarative Language for Streaming Network Traffic Analysis. In Proceedings of the 21st USENIX Conference on Security Symposium. 19--19.
[4]
Graham Cormode and Minos Garofalakis. 2007. Sketching Probabilistic Data Streams. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. 281--292.
[5]
T. S. Jayram, Satyen Kale, and Erik Vee. 2007. Efficient Aggregation Algorithms for Probabilistic Data. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms. 346--355.
[6]
T. S. Jayram, Andrew McGregor, S. Muthukrishnan, and Erik Vee. 2007. Estimating Statistical Aggregates on Probabilistic Data Streams. In Proceedings of the Twenty-sixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. 243--252.
[7]
Cheqing Jin, Ke Yi, Lei Chen, Jeffrey Xu Yu, and Xuemin Lin. 2010. Sliding-window Top-k Queries on Uncertain Streams. The VLDB Journal 19, 3 (2010), 411--435.
[8]
Eyal Kushilevitz and Noam Nisan. 2006. Communication Complexity. Cambridge University Press.
[9]
Konstantinos Mamouras, Mukund Raghothaman, Rajeev Alur, Zachary G. Ives, and Sanjeev Khanna. 2017. StreamQRE: Modular Specification and Efficient Evaluation of Quantitative Queries over Streaming Data. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. 693--708.
[10]
Michael Mitzenmacher and Eli Upfal. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press.
[11]
S. Muthukrishnan. 2005. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Computer Science 1, 2 (2005).
[12]
Vern Paxson. 1999. Bro: A System for Detecting Network Intruders in Real-time. Computer Networks 31, 23-24 (1999), 2435--2463.
[13]
Christopher Ré, Julie Letchner, Magdalena Balazinksa, and Dan Suciu. 2008. Event Queries on Correlated Probabilistic Streams. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. 715--728.
[14]
Yifei Yuan, Dong Lin, Ankit Mishra, Sajal Marwaha, Rajeev Alur, and Boon Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM. 99--112.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
LICS '20: Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science
July 2020
986 pages
ISBN:9781450371049
DOI:10.1145/3373718
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 July 2020

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Probabilistic streams
  2. Query processing over streams
  3. Streaming algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

Conference

LICS '20
Sponsor:

Acceptance Rates

LICS '20 Paper Acceptance Rate 69 of 174 submissions, 40%;
Overall Acceptance Rate 215 of 622 submissions, 35%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 82
    Total Downloads
  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media