Ranked XML Processing

Amélie Marian,
Ralf Schenkel &
Martin Theobald

Synonyms

Aggregation and threshold algorithms for XML; Approximate XML querying; Top-k XML query processing

Definition

When querying collections of XML documents with heterogeneous or complex schemas, existing query languages such as XPath or XQuery, with their exact-match semantics, are often not the best choice. Such exact-match languages typically miss many relevant results that do not conform to the strict formulation of the query.
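The effect of exact-match semantics can be illustrated with a minimal sketch using Python's standard `xml.etree.ElementTree` module. The two documents, the element names, and the helper functions `exact_titles` and `relaxed_titles` are hypothetical and chosen only for illustration; replacing a child step by a descendant step is just one simple form of structural relaxation.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that describe the same kind of item
# but nest the title differently.
DOC_A = "<book><title>XML retrieval</title><author>Smith</author></book>"
DOC_B = "<book><info><title>XML retrieval</title></info><editor>Smith</editor></book>"

def exact_titles(xml_text):
    """Titles matched by the strict path book/title (child axis only)."""
    root = ET.fromstring(xml_text)      # root is the <book> element
    return [t.text for t in root.findall("./title")]

def relaxed_titles(xml_text):
    """Titles matched by the relaxed path book//title (descendant axis)."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(".//title")]

if __name__ == "__main__":
    for name, doc in (("A", DOC_A), ("B", DOC_B)):
        print(name, exact_titles(doc), relaxed_titles(doc))
```

Under the strict path, the second document contributes no result even though it contains a matching title one level deeper; the relaxed path finds it in both documents.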

Top-k query processing for XML data, which focuses on finding the k top-ranked XML elements for an XPath (or XQuery) query with full-text search predicates, is a particularly appropriate query model for querying semi-structured data when the actual content or structure of the underlying data is not fully known. Challenges in processing top-k queries over XML data include scoring individual answers based on how closely they match the query, supporting IR-style vague search over both content and structure, and ranking the k best answers in an...
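As a rough illustration of the query model only, the following sketch scores every element of a small document and returns the k best. It is a brute-force scan, not one of the top-k algorithms from the literature. The scoring function (keyword count in the element's own text, halved when the tag differs from the requested one), the sample data, and the names `score`, `top_k`, and `COLLECTION` are all assumptions made for this example.

```python
import heapq
import xml.etree.ElementTree as ET

# Hypothetical sample collection.
COLLECTION = """<lib>
<article><sec>top-k query processing for xml data</sec></article>
<article><par>xml xml xml ranking</par></article>
<note>xml</note>
</lib>"""

def score(elem, tag, terms):
    """Toy score: keyword hits in the element's own text (content part),
    halved if the element's tag is not the requested one (structure part)."""
    words = (elem.text or "").lower().split()
    content = sum(words.count(t.lower()) for t in terms)
    structure = 1.0 if elem.tag == tag else 0.5
    return content * structure

def top_k(xml_text, tag, terms, k):
    """Return the k highest-scoring elements as (tag, score) pairs."""
    root = ET.fromstring(xml_text)
    scored = [(score(e, tag, terms), i, e) for i, e in enumerate(root.iter())]
    candidates = (s for s in scored if s[0] > 0)
    # Ties are broken in favour of the element that comes first in document order.
    best = heapq.nlargest(k, candidates, key=lambda s: (s[0], -s[1]))
    return [(e.tag, s) for s, _, e in best]

if __name__ == "__main__":
    print(top_k(COLLECTION, "sec", ["xml"], 2))
```

In this toy setting, an element with the wrong tag but many keyword hits can outrank an element with the right tag and fewer hits, which is the kind of trade-off between content and structure that a real scoring model has to settle.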

Author information

Authors and Affiliations

Computer Science Department, Rutgers University, New Brunswick, NJ, USA
Amélie Marian
Campus II Department IV – Computer Science, Professorship for databases and information systems, University of Trier, Trier, Germany
Ralf Schenkel
Institute of Databases and Information Systems (DBIS), Ulm University, Ulm, Germany
Martin Theobald
Stanford University, Stanford, CA, USA
Martin Theobald

Corresponding author

Correspondence to Amélie Marian.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

Laboratoire d'Informatique de Grenoble, CNRS and LIG, Grenoble, France
Sihem Amer-Yahia

About this entry

Cite this entry

Marian, A., Schenkel, R., Theobald, M. (2018). Ranked XML Processing. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_778

DOI: https://doi.org/10.1007/978-1-4614-8265-9_778
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering