Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Expressive Languages for Path Queries over Graph-Structured Data

Published: 01 December 2012 Publication History

Abstract

For many problems arising in the setting of graph querying (such as finding semantic associations in RDF graphs, exact and approximate pattern matching, sequence alignment, etc.), the power of standard languages such as the widely studied conjunctive regular path queries (CRPQs) is insufficient in at least two ways. First, they cannot output paths and second, more crucially, they cannot express relationships among paths.
We thus propose a class of extended CRPQs, called ECRPQs, which add regular relations on tuples of paths, and allow path variables in the heads of queries. We provide several examples of their usefulness in querying graph structured data, and study their properties. We analyze query evaluation and representation of tuples of paths in the output by means of automata. We present a detailed analysis of data and combined complexity of queries, and consider restrictions that lower the complexity of ECRPQs to that of relational conjunctive queries. We study the containment problem, and look at further extensions with first-order features, and with nonregular relations that add arithmetic constraints on the lengths of paths and numbers of occurrences of labels.

Supplementary Material

PDF File (a31-barcelo_appendix.pdf)
The proof is given in an electronic appendix, available online in the ACM Digital Library.

References

[1]
Abiteboul, S., Quass, D., McHugh, J., Widom, J., and Wiener, J. 1997. The LOREL query language for semistructured data. Int. J. Digital Libraries 1, 1, 68--88.
[2]
Abiteboul, S., Buneman, P., and Suciu, D. 1999. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kauffman.
[3]
Abulla, P., Jonnson, B., Nilsson, M., and Saksena, M. 2003. A survey of regular model checking. In Proceedings of the 5th International Conference on Concurrence Theory. 35--48.
[4]
Alkhateeb, F., Baget, J.-F., and Euzenat, J. 2008. Constrained regular expressions in SPARQL. In Proceedings of the International Conference on Semantic Web & Web Services. 91--99.
[5]
Alkhateeb, F., Baget, J.-F., and Euzenat, J. 2009. Extending SPARQL with regular expression patterns (for querying RDF). J. Web Semantics 7, 2, 57--73.
[6]
Anyanwu, K. and Sheth, A. 2003. ρ-queries: Enabling querying for semantic associations on the semantic web. In Proceedings of the 12th International World Wide Web Conference. 690--699.
[7]
Anyanwu, K., Maduko, A., and Sheth, A. P. 2007. SPARQ2L: Towards support for subgraph extraction queries in RDF databases. In Proceedings of the 16th International World Wide Web Conference. 797--806.
[8]
Barceló, P., Hurtado, C. A., Libkin, L., and Wood, P. T. 2010. Expressive languages for path queries over graph-structured data. In Proceedings of the 29th ACM Symposium on Principles of Database Systems. 3--14.
[9]
Barceló, P., Pérez, J., and Reutter, J. 2012. Relative expressiveness of nested regular expressions. In Proceedings of the 6th Alberto Mendelzon Workshop on the Foundations of Data Management and the Web. 180--195.
[10]
Barrett, C., Jacob, R., and Marathe, M. 2000. Formal-language-constrained path problems. SIAM J. Comput. 30, 3, 809--837.
[11]
Benedikt, M., Libkin, L., Schwentick, T., and Segoufin, L. 2003. Definable relations and first-order query languages over strings. J. ACM 50, 5, 694--751.
[12]
Berstel, J. 1979. Transductions and Context-Free Languages. B. G. Teubner.
[13]
Blumensath, A. and Grädel, E. 2000. Automatic structures. In Proceedings of the 15th Annual IEEE Symposium on Logic in Computer Science. 51--62.
[14]
Bruyère, V., Hansel, G., Michaux, C., and Villemaire, R. 1994. Logic and p-recognizable sets of integers. Bull. Belgian Math. Soc. 1, 191--238.
[15]
Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. 2000. Containment of conjunctive regular path queries with inverse. In Proceedings of the 7th International Conference on Principles of Knowledge Representation and Reasoning. 176--185.
[16]
Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. 2002. Rewriting of regular expressions and regular path queries. J. Comput. Syst. Sci. 64, 3, 443--465.
[17]
Chrobak, M. 1986. Finite automata and unary languages. Theor. Comput. Sci. 47, 2, 149--158.
[18]
Consens, M. and Mendelzon, A. 1990. GraphLog: A visual formalism for real life recursion. In Proceedings of the 9th ACM Symposium on Principles of Database Systems. 404--416.
[19]
Deutsch, A. and Tannen, V. 2001. Optimization properties for classes of conjunctive regular path queries. In Proceedings of the 8th International Workshop on Database Programming Languages. 21--39.
[20]
Elgot, C. and Mezei, J. 1965. On relations defined by generalized finite automata. IBM J. Resear. Dev. 9, 1, 47--68.
[21]
Florescu, D., Levy, A., and Suciu, D. 1998. Query containment for conjunctive queries with regular expressions. In Proceedings of the 17th ACM Symposium on Principles of Database Systems. 139--148.
[22]
Freydenberger, D. and Reidenbach, D. 2010. Bad news on decision problems for patterns. Inf. Comput. 208, 1, 83--96.
[23]
Freydenberger, D. and Schweikardt, N. 2011. Expressiveness and static analysis of extended conjunctive regular path queries. In Proceedings of the 5th Alberto Mendelzon International Workshop on Foundations of Data Management.
[24]
Frougny, C. and Sakarovitch, J. 1991. Rational relations with bounded delay. In Proceedings of the 8th Annual Symposium on Theoretical Aspects of Computer Science. 50--63.
[25]
Grahne, G. and Thomo, A. 2004. Query answering and containment for regular path queries under distortions. In Proceedings of the 3rd International Symposium on the Foundations of Information and Knowledge Systems. 98--115.
[26]
Gusfield, D. 1997. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press.
[27]
Holland, D., Braun, U., Maclean, D., Muniswamy-Reddy, K., and Seltzer, M. 2008. Choosing a data model and query language for provenance. In Proceedings of the 2nd International Provenance and Annotation Workshop.
[28]
Ibarra, H., Su, J., Dang, Z., Bultan, T., and Kemmerer, R. 2002. Counter machines and verification problems. Theor. Comput. Sci. 289, 1, 165--189.
[29]
Kanza, Y. and Sagiv, Y. 2001. Flexible queries over semistructured data. In Proceedings of the 20th ACM Symposium on Principles of Database Systems. 40--51.
[30]
Kochut, K. and Janik, M. 2007. SPARQLeR: Extended SPARQL for semantic association discovery. In Proceedings of the 4th European Semantic Web Conference. 145--159.
[31]
Kozen, D. 1977. Lower bounds for natural proof systems. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science. 254--266.
[32]
Lee, W., Raschid, L., Srinivasan, P., Shah, N., Rubin, D., and Noy, N. 2007. Using annotations from controlled vocabularies to find meaningful associations. In Proceedings of the 4th International Workshop on Data Integration in the Life Sciences. 247--263.
[33]
Lehmann, J., Schüpell, J., and Auer, S. 2007. Discovering unknown connections---the DBpedia relationship finder. In Proceedings of the 1st SABRE Conference on Social Semantic Web. 99--110.
[34]
Lenstra, H. 1983. Integer programming in a fixed number of variables. Math. Oper. Resear. 8, 4, 538--548.
[35]
Mendelzon, A. and Wood, P. 1995. Finding regular simple paths in graph databases. SIAM J. Comput. 24, 6, 1235--1258.
[36]
Milo, T. and Suciu, D. 1999. Index structures for path expressions. In Proceedings of the 7th International Conference on Database Theory. 277--295.
[37]
Papadimitriou, C. 1981. On the complexity of integer programming. J. ACM 28, 4, 765--768.
[38]
Pérez, J., Arenas, M., and Gutierrez, C. 2010. nSPARQL: A navigational language for RDF. J. Web Semantics 8, 4, 255--270.
[39]
Robertson, E. L. 1974. Structure of complexity in the weak monadic second-order theories of the natural numbers. In Conference Record of 6th Annual ACM Symposium on Theory of Computing. 161--171.
[40]
Scarpellini, B. 1984. Complexity of subcases of Presburger arithmetic. Trans. AMS 284, 203--218.
[41]
Sheth, A., Aleman-Meza, A. B., et al. 2005. Semantic association identification and knowledge discovery for national security applications. J. Datab. Manage. 16, 1, 33--53.
[42]
Stockmeyer, L. J. 1974. The complexity of decision problems in automata theory and logic. Ph.D. thesis, Massachusetts Institute of Technology.
[43]
To, A. 2009. Unary finite automata vs. arithmetic progressions. Inf. Process. Lett. 109, 17, 1010--1014.
[44]
To, A. 2010. Model checking infinite-state systems: Generic and specific approaches. Ph.D. thesis, LFCS, School of Informatics, University of Edinburgh.
[45]
Verma, K., Seidl, H., and Schwentick, T. 2005. On the complexity of equational Horn clauses. In Proceedings of the 20th International Conference on Automated Deduction (CADE). 337--352.
[46]
Weikum, G., Kasneci, G., Ramanath, M., and Suchanek, F. 2009. Database and information-retrieval methods for knowledge discovery. Commun. ACM 52, 4, 56--64.

Cited By

View all
  • (2025)Explaining results of path queries on graphs: Single-path results for context-free path queriesInformation Systems10.1016/j.is.2024.102475128(102475)Online publication date: Feb-2025
  • (2024)Efficient Betweenness Centrality Computation over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368200617:11(3360-3372)Online publication date: 30-Aug-2024
  • (2024)Parikh’s Theorem Made SymbolicProceedings of the ACM on Programming Languages10.1145/36329078:POPL(1945-1977)Online publication date: 5-Jan-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 37, Issue 4
December 2012
345 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/2389241
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 December 2012
Accepted: 01 August 2011
Revised: 01 June 2011
Received: 01 February 2011
Published in TODS Volume 37, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Graph databases
  2. conjunctive queries
  3. regular path queries
  4. regular relations

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)38
  • Downloads (Last 6 weeks)2
Reflects downloads up to 29 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2025)Explaining results of path queries on graphs: Single-path results for context-free path queriesInformation Systems10.1016/j.is.2024.102475128(102475)Online publication date: Feb-2025
  • (2024)Efficient Betweenness Centrality Computation over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368200617:11(3360-3372)Online publication date: 30-Aug-2024
  • (2024)Parikh’s Theorem Made SymbolicProceedings of the ACM on Programming Languages10.1145/36329078:POPL(1945-1977)Online publication date: 5-Jan-2024
  • (2024)Path Querying in Graph Databases: A Systematic Mapping StudyIEEE Access10.1109/ACCESS.2024.337197612(33154-33172)Online publication date: 2024
  • (2024)A logical approach to graph databasesJournal of Logical and Algebraic Methods in Programming10.1016/j.jlamp.2024.100997141(100997)Online publication date: Oct-2024
  • (2024)Multi-model query languages: taming the variety of big dataDistributed and Parallel Databases10.1007/s10619-023-07433-142:1(31-71)Online publication date: 1-Mar-2024
  • (2024)RPQBench: A Benchmark for Regular Path Queries on Graph DataWeb Information Systems Engineering – WISE 202410.1007/978-981-96-0567-5_25(351-367)Online publication date: 3-Dec-2024
  • (2023)Representing Paths in Graph Database Pattern MatchingProceedings of the VLDB Endowment10.14778/3587136.358715116:7(1790-1803)Online publication date: 8-May-2023
  • (2023)Conjunctive Regular Path Queries under Injective SemanticsProceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3584372.3588664(231-240)Online publication date: 18-Jun-2023
  • (2023)An epistemic approach to model uncertainty in data-graphsInternational Journal of Approximate Reasoning10.1016/j.ijar.2023.108948160:COnline publication date: 1-Sep-2023
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media