Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1007/11523468_87guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Tight lower bounds for query processing on streaming and external memory data

Published: 11 July 2005 Publication History

Abstract

We study a clean machine model for external memory and stream processing. We show that the number of scans of the external data induces a strict hierarchy (as long as work space is sufficiently small, e.g., polylogarithmic in the size of the input). We also show that neither joins nor sorting are feasible if the product of the number r(n) of scans of the external memory and the size s(n) of the internal memory buffers is sufficiently small, e.g., of size $o(\sqrt[n]{5})$. We also establish tight bounds for the complexity of XPath evaluation and filtering.

References

[1]
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.
[2]
G. Aggarwal, M. Datar, S. Rajagopalan, and M. Ruhl. On the streaming model augmented with a sorting primitive. In Proc. FOCS'04, pages 540-549.
[3]
N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58:137-147, 1999.
[4]
A. Arasu, B. Babcock, T. Green, A. Gupta, and J. Widom. "Characterizing Memory Requirements for Queries over Continuous Data Streams". In Proc. PODS'02, pages 221-232, 2002.
[5]
B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In Proc. PODS'02, pages 1-16.
[6]
Z. Bar-Yossef, M. Fontoura, and V. Josifovski. "On the Memory Requirements of XPath Evaluation over XML Streams". In Proc. PODS'04, pages 177-188, 2004.
[7]
A. Brüggemann-Klein, M. Murata, and D. Wood. "Regular Tree and Regular Hedge Languages over Non-ranked Alphabets: Version 1, April 3, 2001". Technical Report HKUST-TCSC-2001-05, Hong Kong Univ. of Science and Technology, 2001.
[8]
J.-E. Chen and C.-K. Yap. "Reversal Complexity". SIAM J. Comput., 20(4):622- 638, Aug. 1991.
[9]
J. Doner. "Tree Acceptors and some of their Applications". Journal of Computer and System Sciences, 4:406-451, 1970.
[10]
P. Duris, Z. Galil, and G. Schnitger. Lower bounds on communication complexity. Information and Computation, 73:1-22, 1987. Journal version of STOC'84 paper.
[11]
G. Gottlob and C. Koch. "Monadic Datalog and the Expressive Power of Web Information Extraction Languages". Journal of the ACM, 51(1):74-113, 2004.
[12]
G. Gottlob, C. Koch, and R. Pichler. "Efficient Algorithms for Processing XPath Queries". In Proc. VLDB 2002, pages 95-106, Hong Kong, China, 2002.
[13]
G. Gottlob, C. Koch, and R. Pichler. "The Complexity of XPath Query Evaluation". In Proc. PODS'03, pages 179-190, San Diego, California, 2003.
[14]
G. Graefe. "Query Evaluation Techniques for Large Databases". ACM Computing Surveys, 25(2):73-170, June 1993.
[15]
T. J. Green, G. Miklau, M. Onizuka, and D. Suciu. "Processing XML Streams with Deterministic Automata". In Proc. ICDT'03, 2003.
[16]
M. Grohe, C. Koch, and N. Schweikardt. Tight lower bounds for query processing on streaming and external memory data. Technical report CoRR cs.DB/0505002, 2005. Full version of ICALP'05 paper.
[17]
M. Grohe and N. Schweikardt. Lower bounds for sorting with few random accesses to external memory. In Proc. PODS, 2005. To appear.
[18]
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In External memory algorithms, volume 50, pages 107-118. DIMACS Series In Discrete Mathematics And Theoretical Computer Science, 1999.
[19]
J. E. Hopcroft and J. D. Ullman. Some results on tape-bounded Turing machines. Journal of the ACM, 16(1):168-177, 1969.
[20]
C. Koch. "Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach". In Proc. VLDB 2003, pages 249-260, 2003.
[21]
E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge Univ. Press, 1997.
[22]
U. Meyer, P. Sanders, and J. Sibeyn, editors. Algorithms for Memory Hierarchies, volume 2832 of Lecture Notes in Computer Science. Springer-Verlag, 2003.
[23]
J. Munro and M. Paterson. Selection and sorting with limited storage. Theoretical Computer Science, 12:315-323, 1980.
[24]
S. Muthukrishnan. Data streams: algorithms and applications. In Proc. 14th SODA, pages 413-413, 2003.
[25]
A. Neumann and H. Seidl. "Locating Matches of Tree Patterns in Forests". In Proc. 18th FSTTCS, LNCS 1530, pages 134-145, 1998.
[26]
F. Neven and J. van den Bussche. "Expressiveness of Structured Document Query Languages Based on Attribute Grammars". J. ACM, 49(1):56-100, Jan. 2002.
[27]
R. Ramakrishnan and J. Gehrke. Database Management Systems. McGraw-Hill, 2002.
[28]
L. Segoufin. "Typing and Querying XML Documents: Some Complexity Bounds". In Proc. PODS'03, pages 167-178, 2003.
[29]
L. Segoufin and V. Vianu. "Validating Streaming XML Documents". In Proc. PODS'02, 2002.
[30]
J. Thatcher and J. Wright. "Generalized Finite Automata Theory with an Application to a Decision Problem of Second-order Logic". Math. Syst. Theory, 2(1):57-81, 1968.
[31]
P. van Emde Boas. "Machine Models and Simulations". In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, volume 1, chapter 1, pages 1-66. Elsevier Science Publishers B.V., 1990.
[32]
J. Vitter. External memory algorithms and data structures: Dealing with massive data. ACM Computing Surveys, 33(2):209-271, June 2001.
[33]
World Wide Web Consortium. "XQuery 1.0 and XPath 2.0 Formal Semantics. W3C Working Draft (Aug. 16th 2002), 2002. http://www.w3.org/XML/Query.
[34]
A. Yao. Some complexity questions related to distributive computing. In Proc. 11th STOC, pages 209-213, 1979.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ICALP'05: Proceedings of the 32nd international conference on Automata, Languages and Programming
July 2005
1476 pages
ISBN:3540275800
  • Editors:
  • Luís Caires,
  • Giuseppe F. Italiano,
  • Luís Monteiro,
  • Catuscia Palamidessi,
  • Moti Yung

Sponsors

  • Fundacao para a Ciencia e Tecnologia
  • FCT: Foundation for Science and Technology
  • Centro de Lógica e Computação/IST/UTL: Centro de Lógica e Computação/IST/UTL

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 11 July 2005

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 27 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2011)Streamable fragments of forward XPathProceedings of the 16th international conference on Implementation and application of automata10.5555/2032366.2032369(3-15)Online publication date: 13-Jul-2011
  • (2009)Trading off space for passes in graph streaming problemsACM Transactions on Algorithms10.1145/1644015.16440216:1(1-17)Online publication date: 28-Dec-2009
  • (2009)Efficient algorithms for descendant-only tree pattern queriesInformation Systems10.1016/j.is.2009.03.01034:7(602-623)Online publication date: 1-Nov-2009
  • (2007)Efficient algorithms for the tree homeomorphism problemProceedings of the 11th international conference on Database programming languages10.5555/1783534.1783538(17-31)Online publication date: 23-Sep-2007
  • (2007)Efficient and expressive tree filtersProceedings of the 27th international conference on Foundations of software technology and theoretical computer science10.5555/1781794.1781834(461-472)Online publication date: 12-Dec-2007
  • (2007)Lower bounds for randomized read/write stream algorithmsProceedings of the thirty-ninth annual ACM symposium on Theory of computing10.1145/1250790.1250891(689-698)Online publication date: 11-Jun-2007
  • (2007)Tight lower bounds for query processing on streaming and external memory dataTheoretical Computer Science10.1016/j.tcs.2007.02.062380:1-2(199-217)Online publication date: 20-Jul-2007
  • (2007)Database query processing using finite cursor machinesProceedings of the 11th international conference on Database Theory10.1007/11965893_20(284-298)Online publication date: 10-Jan-2007
  • (2006)Randomized computations on large data setsProceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems10.1145/1142351.1142387(243-252)Online publication date: 26-Jun-2006
  • (2006)Processing queries on tree-structured data efficientlyProceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems10.1145/1142351.1142382(213-224)Online publication date: 26-Jun-2006
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media