Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

The Value of Multiple Read/Write Streams for Approximating Frequency Moments

Published: 01 January 2012 Publication History

Abstract

We consider the read/write streams model, an extension of the standard data stream model in which an algorithm can create and manipulate multiple read/write streams in addition to its input data stream. Like the data stream model, the most important parameter for this model is the amount of internal memory used by such an algorithm. The other key parameters are the number of streams the algorithm uses and the number of passes it makes on these streams. We consider how the addition of multiple streams impacts the ability of algorithms to approximate the frequency moments of the input stream.
We show that any randomized read/write stream algorithm with a fixed number of streams and a sublogarithmic number of passes that produces a constant factor approximation of the k-th frequency moment Fk of an input sequence of length of at most N from {1,..., N} requires space Ω(N 1−4/k−δ) for any δ > 0. For comparison, it is known that with a single read-only one-pass data stream there is a randomized constant-factor approximation for Fk using Õ(N1−2/k) space, and that by sorting, which can be done deterministically in O(log N) passes using 3 read/write streams, Fk can be computed exactly. Therefore, although the ability to manipulate multiple read/write streams can add substantial power to the data stream model, with a sublogarithmic number of passes this does not significantly improve the ability to approximate higher frequency moments efficiently. We obtain our results by showing a new connection between the read/write streams model and the multiparty number-in-hand communication model.

References

[1]
Alon, N., Matias, Y., and Szegedy, M. 1999. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58, 1, 137--147.
[2]
Babcock, B., Babu, S., Datar, M., Motwani, R., and Widom, J. 2002. Models and issues in data stream systems. In Proceedings of the 21st Annual ACM Symposium on Principles of Database Systems. 1--16.
[3]
Bar-Yossef, Z., Jayram, T., Kumar, R., and Sivakumar, D. 2004. An information statistics approach to data stream and communication complexity. J. Comput. Syst. Sci. 68, 4, 702--732.
[4]
Beame, P., Blais, E., and Huynh-Ngoc, D.-T. 2009. Longest common subsequences in sets of permutations. Tech. rep. arXiv:R0904.1615v1 {math.CO}.
[5]
Beame, P. and Huynh-Ngoc, D.-T. 2008. On the value of multiple read/write streams for approximating frequency moments. In Proceedings of the 49th Annual Symposium on Foundations of Computer Science. IEEE, 499--508.
[6]
Beame, P., Jayram, T. S., and Rudra, A. 2007. Lower bounds for randomized read/write stream algorithms. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing. 689--698.
[7]
Bhuvanagiri, L., Ganguly, S., Kesh, D., and Saha, C. 2006. Simpler algorithms for estimating frequency moments of data streams. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms. 708--713.
[8]
Chakrabarti, A., Khot, S., and Sun, X. 2003. Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In Proceedings of the 18th Annual IEEE Conference on Computational Complexity. 107--117.
[9]
Chakrabarty, A. and Regev, O. 2011. An optimal lower bound on the communication complexity of gap-hamming-distance. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing. 51--60.
[10]
Grohe, M. and Schweikardt, N. 2005. Lower bounds for sorting with few random accesses to external memory. In Proceedings of the 24th Annual ACM Symposium on Principles of Database Systems. 238--249.
[11]
Grohe, M., Hernich, A., and Schweikardt, N. 2006. Randomized computations on large data sets: Tight lower bounds. In Proceedings of the 25th Annual ACM Symposium on Principles of Database Systems. 243--252.
[12]
Grohe, M., Hernich, A., and Schweikardt, N. 2009. Lower bounds for processing data with few random accesses to external memory. J. ACM 56, 3, 1--58.
[13]
Gronemeier, A. 2009. Asymptotically optimal lower bounds on the NIH-multi-party information complexity of the AND-function and disjointness. In Proceedings of the 26th Annual Symposium on Theoretical Aspects of Computer Science. 505--516.
[14]
Indyk, P. and Woodruff, D. 2003. Tight lower bounds for the distinct elements problem. In Proceedings of the 44th Annual Symposium on Foundations of Computer Science. IEEE, 283--292.
[15]
Indyk, P. and Woodruff, D. P. 2005. Optimal approximations of frequency moments of data streams. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing. 202--208.
[16]
Jayram, T. S. 2009. Hellinger strikes back: A note on the multi-party information complexity of and. In Proceedings of APPROX-RANDOM. Lecture Notes in Computer Science Series, vol. 5687, Springer, 562--573.
[17]
Muthukrishnan, S. 2006. Data streams: Algorithms and applications. Found. Trends Theor. Comput. Sci. 1, 2.
[18]
Razborov, A. A. 1992. On the distributional complexity of disjointness. Theor. Comput. Sci. 106, 2, 385--390.
[19]
Saks, M. E. and Sun, X. 2002. Space lower bounds for distance approximation in the data stream model. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing. 360--369.
[20]
Steele, J. M. 1995. Variations on the monotone subsequence problem of Erdös and Szekeres. In Discrete Probability and Algorithms, Aldous, Diaconis, and Steele Eds., Springer, 111--132.
[21]
Steele, J. M. 1997. Probability Theory and Combinatorial Optimization. SIAM, Philadelphia, PA.
[22]
Woodruff, D. 2004. Optimal space lower bounds for all frequency moments. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms. 167--175.

Cited By

View all
  • (2017)Adaptive matrix vector productProceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3039686.3039819(2044-2060)Online publication date: 16-Jan-2017
  • (2016)Twins in words and long common subsequences in permutationsIsrael Journal of Mathematics10.1007/s11856-016-1323-8213:1(183-209)Online publication date: 15-Apr-2016
  • (2014)Cryptography with Streaming AlgorithmsAdvances in Cryptology – CRYPTO 201410.1007/978-3-662-44381-1_4(55-70)Online publication date: 2014

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Computation Theory
ACM Transactions on Computation Theory  Volume 3, Issue 2
January 2012
99 pages
ISSN:1942-3454
EISSN:1942-3462
DOI:10.1145/2077336
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 January 2012
Accepted: 01 November 2011
Revised: 01 September 2011
Received: 01 October 2010
Published in TOCT Volume 3, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Data streams
  2. communication complexity
  3. external-memory algorithms
  4. frequency moments
  5. read/write streams

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)1
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2017)Adaptive matrix vector productProceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3039686.3039819(2044-2060)Online publication date: 16-Jan-2017
  • (2016)Twins in words and long common subsequences in permutationsIsrael Journal of Mathematics10.1007/s11856-016-1323-8213:1(183-209)Online publication date: 15-Apr-2016
  • (2014)Cryptography with Streaming AlgorithmsAdvances in Cryptology – CRYPTO 201410.1007/978-3-662-44381-1_4(55-70)Online publication date: 2014

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media