Abstract
We study the problem of evaluating xpath queries over xml data that is stored in an rdbms via schema-based shredding. The interaction between recursion (descendants-axis) in xpath queries and recursion in dtds makes it challenging to answer xpath queries using rdbms. We present a new approach to translating xpath queries into sql queries based on a notion of extended XP ath expressions and a simple least fixpoint (lfp) operator. Extended xpath expressions are a mild extension of xpath, and the lfp operator takes a single input relation and is already supported by most commercial rdbms. We show that extended xpath expressions are capable of capturing both dtd recursion and xpath queries in a uniform framework. Furthermore, they can be translated into an equivalent sequence of sql queries with the lfp operator. We present algorithms for rewriting xpath queries over a (possibly recursive) dtd into extended xpath expressions and for translating extended xpath expressions to sql queries, as well as optimization techniques. The novelty of our approach consists in its capability to answer a large class of xpath queries by means of only low-end rdbms features already available in most rdbms, as well as its flexibility to accommodate existing relational query optimization techniques. In addition, these translation algorithms provide a solution to query answering for certain (possibly recursive) xml views of xml data. Our experimental results verify the effectiveness of our techniques.
Similar content being viewed by others
References
Afanasiev, L., Grust, T., Marx, M., Rittinger, J., Teubner, J.: An inflationary fixed point in XQuery. In: Proc. of ICDE (2008)
Agrawal, R., Devanbu, P.: Moving selections into linear least fixpoint queries. In: Proc. of ICDE (1988)
Aho, A., Ullman, J.: Universality of data retrieval languages. In: Proc. of POPL (1979)
Amer-Yahia, S., Cho, S., Lakshmanan, L., Srivistava, D.: Minimization of tree pattern queries. In: Proc. of SIGMOD (2001)
Bancilhon, F., Maier, D., Sagiv, Y., Ullman, J.: Magic sets and other strange ways to implement logic programs. In: Proc. of PODS (1986)
Bancilhon, F., Ramakrishnan, R.: An amateur’s introduction to recursive query processing strategies. In: Proc. of SIGMOD (1986)
Barbosa, D., Freire, J., Mendelzon, A.: Designing information-preserving mapping schemes for XML. In: Prof. of VLDB (2005)
Beeri, C., Ramakrishnan, R.: On the power of magic. J. Log. Program 10 (1991)
Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of DTDs. J. ACM 55(2) (2008)
BIOML. BIOpolymer Markup Language. http://xml.coverpages.org/BIOML-XML-DTD.txt
Boncz, P.A., Grust, T., van Keulen, M., Manegold, S., Rittinger, J., Teubner, J.: MonetDB/XQuery: a fast XQuery processor powered by a relational engine. In: Proc. of SIGMOD (2004)
Choi, B.: What are real DTDs like. In: Proc. of WebDB (2002)
Choi, B., Cong, G., Fan, W., Viglas, S.: Updating recursive XML views of relations. In: Prof. of ICDE (2007)
Christophides, V., Cluet, S., Moerkotte, G.: Evaluating queries with generalized path expressions. In: Proc. of SIGMOD (1996)
Clark, J., DeRose, S.: XML path language (XPath). W3C Recommendation, Nov 1999
DeHaan, D., Toman, D., Consens, M., Ozsu, T.: Comprehensive XQuery to SQL translation using dynamic interval encoding. In: Proc. of SIGMOD (2003)
Deutsch, A., Tannen, V.: MARS: A system for publishing XML from mixed and redundant storage. In: Proc. of VLDB (2003)
Ehrenfeucht, A., Zeiger, P.: Complexity measures for regular expressions. In: Proc. of STC’74 (1974)
EXSLT.: http://www.exslt.org/dyn/functions/closure/index.html
Fan, W., Bohannon, P.: Information preserving XML schema embedding. TODS 33(1) (2008)
Fan, W., Chan, C.-Y., Garofalakis, M.: Secure XML querying with security views. In: Proc. of SIGMOD (2004)
Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Rewriting regular XPath queries on XML views. In: Proc. of ICDE (2007)
Fan, W., Geerts, F., Neven, F.: Expressiveness and complexity of XML publishing transducers. TODS (2008)
Fan, W., Yu, J.X., Lu, H., Lu, J., Rastogi, R.: Query translation from XPath to SQL in the presence of recursive DTDs. In: Proc. of VLDB (2005)
Fernandez, M., Suciu, D.: Optimizing regular path expression using graph schemas. In: Proc. of ICDE (1998)
Fernandez, M.F., Morishima, A., Suciu, D.: Efficient evaluation of XML middleware queries. In: Proc. of SIGMOD (2001)
GedML.: Genealogy Markup Language. http://xml.coverpages.org/gedml-dtd9808.txt
Georgiadis, H., Vassalos, V.: Improving the efficiency of XPath execution on relational systems. In: Proc. of EBDT (2006)
Georgiadis, H., Vassalos, V.: XPath on steroids: exploiting relational engines for XPath performance. In: Proc. of SIGMOD (2007)
Graefe, G.: Query evaluation techniques for large databases. ACM Comput. Surv. 25(2) (1993)
Grust T., Keulen M., van Teubner J.: Accelerating XPath evaluation in any RDBMS. TODS 29, 91–131 (2004)
Halevy, A.Y.: Theory of answering queries using views. SIGMOD Record 29(4) (2001)
IBM.: DB2 XML Extender. http://www-3.ibm.com/software/data/db2/extended/xmlext/index.html
Jain, S., Mahajan, R., Suciu, D.: Translating XSLT programs to efficient SQL querie. In: Proc. of WWW (2002)
Kambayashi, Y.: Processing cyclic queries. In: Query processing in database systems, pp. 63–78. Springer, Heidelberg (1985)
Kha, D.D., Yoshikawa, M., Uemura, S.: An XML indexing structure with relative region coordinate. In: Proc. of ICDE (2001)
Kim, Y.-C., Kim, W., Dale, A.: Cyclic query processing in object-oriented databases. In: Proc. of ICDE (1989)
Kolaitis, P.G.: Schema mappings, data exchange, and metadata management. In: Prof. of PODS (2006)
Krishnamurthy, R., Chakaravarthy, V.T., Kaushik, R., Naughton, J.: Recursive XML schemas, recursive XML queries, and relational storage: XML-to-SQL query translation. In: Proc. of ICDE (2004)
Krishnamurthy, R., Kaushik, R., Naughton, J.: XML-SQL query translation literature: The state of the art and open problems. In: Proc. of Xsym (2003)
Krishnamurthy, R., Kaushik, R., Naughton, J.: Efficient XML-to-SQL query translation: Where to add the intelligence. In: Proc. of VLDB (2004)
Krishnamurthy, R., Kaushik, R., Naughton, J.: XML views as integrity constraints and their use in query translation. In: Proc. of ICDE (2005)
Kunen, I.K., Suciu, D.: A scalable algorithm for query minimization. Technical Report, University of Washington (2004)
Lenzerini, M.: Data integration: A theoretical perspective. In: Proc. of PODS (2002)
Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proc. of VLDB (2001)
Likin, L.: Logics for unranked trees: An overview. Log. Meth. Comput. Sci. 2(3) (2006)
Manolescu, I., Florescu, D., Kossmann, D.: Answering XML queries on heterogeneous data sources. In: Proc. of VLDB (2001)
Marx, M.: XPath with conditional axis relations. In: Proc. of EDBT (2004)
Microsoft.: SQLXML and XML Mapping Technologies. http://msdn.microsoft.com/sqlxml/default.asp
Mishra, P., Eich, M.H.: Join processing in relational databases. ACM Comput. Surv. 24(1) (1992)
Nunn, M.: An Overview of SQL Server 2005 for the Database Developer, (2004). http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql_ovyukondev.asp
Oracle.: Oracle9i XML Database Developer’s Guide—Oracle XML DB Release 2. http://otn.oracle.com/tech/xmldb/content.html
Papakonstantinou, Y., Vianu, V.: Type inference for views of semistructured data. In: Proc. of PODS (2000)
Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient algorithms for multi query optimization. In: Proc. of SIGMOD (2000)
Saxon.: http://www.saxonica.com/
Shan, M.-C., Neimat, M.-A.: Optimization of relational algebra expressions containing recursion operators. In: Proc. of ACM Annual Computer Science Conference (1999)
Shanmugasundaram, J., Kiernan, J., Shekita, E.J., Fan, C., Funderburk, J.: Querying XML views of relational data. In: Proc. of VLDB (2001)
Shanmugasundaram, J., Shekita, E., Barr, R., Carey, M., Lindsay, B., Pirahesh, H., Reinwald, B.: A general techniques for querying XML documents using a relational database system. SIGMOD Record 30(3) (2001)
Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., Naughton, J.: Relational databases for querying XML documents: Limitations and opportunities. In: Proc. of VLDB (1999)
Silberstein, A., He, H., Yi, K., Yang, J.: BOXes: Efficient maintenance of order-based labeling for dynamic XML data. In: Proc. of ICDE (2005)
Tarjan R.E.: Fast algorithms for solving path problems. J. ACM 28(3), 594–614 (1981)
Tatarinov, I., Viglas, S., Beyer, K.S., Shanmugasundaram, J., Shekita, E.J., Zhang, C.: Storing and querying ordered XML using a relational database system. In: Proc. of SIGMOD (2002)
Thompson, H. et al.: XML Schema. W3C Working Draft, May 2001. http://www.w3.org/XML/Schema
Zhang, C., Naughton, J., DeWitt, D.J., Luo, Q., Lohman, G.M.: On supporting containment queries in relational database management systems. In: Proc. of SIGMOD’01 (2001)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Fan, W., Yu, J.X., Li, J. et al. Query translation from XPath to SQL in the presence of recursive DTDs. The VLDB Journal 18, 857–883 (2009). https://doi.org/10.1007/s00778-008-0131-0
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00778-008-0131-0