DOI: 10.5555/1182635.1164154

An algebraic query model for effective and efficient retrieval of XML fragments

Published: 01 September 2006

Abstract

Finding a suitable fragment of interest in a nonschematic XML document with a simple keyword search is a complex task. To deal with this problem, this paper proposes a theoretical framework with a focus on an algebraic query model having a novel query semantics. Based on this semantics, XML fragments that look meaningful to a keyword-based query are effectively retrieved by the operations defined in the model. In contrast to earlier work, our model supports filters for restricting the size of a query result, which otherwise may contain a large number of potentially irrelevant fragments. We introduce a class of filters having a special property that enables significant reduction in query processing cost. Many practically useful filters fall in this class and hence, the proposed model can be efficiently applied to real-world XML documents. Several other issues regarding algebraic manipulation of the operations defined in our query model are also formally discussed.
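
As a rough illustration of the idea in the abstract, the sketch below runs a two-keyword query over a tiny XML tree, joins each pair of matching nodes into the smallest connected fragment that contains both, and then applies a size filter. Everything in it is an assumption made for illustration: the sample document, the function names (`matches`, `join`, `keyword_query`), the choice of a node-count bound as the filter, and the join via the lowest common ancestor. The abstract does not give the paper's formal operations or name the filter property it relies on, so this should not be read as the paper's model.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical sample document, not taken from the paper.
DOC = (
    "<article><title>XML retrieval</title>"
    "<sec><p>keyword search</p><p>algebraic model</p></sec>"
    "<sec><p>XML model</p></sec></article>"
)


def build_parent_map(root):
    """Map every element to its parent (ElementTree stores no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def path_to_root(node, parent):
    """Return the list [node, parent(node), ..., root]."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def join(a, b, parent):
    """Smallest connected set of nodes containing a and b.

    Assumption: a fragment is the union of the two paths that lead from
    a and b up to their lowest common ancestor.
    """
    pa, pb = path_to_root(a, parent), path_to_root(b, parent)
    ancestors_of_b = set(pb)
    lca = next(n for n in pa if n in ancestors_of_b)
    return frozenset(pa[: pa.index(lca) + 1]) | frozenset(pb[: pb.index(lca) + 1])


def matches(root, keyword):
    """Elements whose own text contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [n for n in root.iter() if kw in (n.text or "").lower()]


def keyword_query(root, k1, k2, max_size):
    """Join every pair of matches and keep fragments with at most max_size nodes."""
    parent = build_parent_map(root)
    results = set()
    for a, b in product(matches(root, k1), matches(root, k2)):
        fragment = join(a, b, parent)
        if len(fragment) <= max_size:  # the size filter
            results.add(fragment)
    return results


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    for fragment in keyword_query(root, "XML", "model", max_size=3):
        print(sorted(node.tag for node in fragment))
```

A node-count bound is used here because a fragment that already exceeds the bound cannot become acceptable by growing, so an implementation could discard it before attempting further joins. Whether that corresponds to the class of filters the paper defines is not stated in the abstract.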

Published In

VLDB '06: Proceedings of the 32nd international conference on Very large data bases
September 2006
1269 pages

Sponsors

  • SIGMOD: ACM Special Interest Group on Management of Data
  • K.I.S.S. SIG on Databases
  • AJU Information Technology Co., Ltd
  • US Army ITC-PAC Asian Research Office
  • Google Inc.
  • The Database Society of Japan
  • Samsung SOS
  • Advanced Information Technology Research Center
  • Naver
  • Microsoft
  • Korea Information Science Society
  • SK telecom
  • Systems Applications Products
  • ORACLE
  • International Business Management
  • Air Force Office of Scientific Research/Asian Office of Aerospace R&D
  • Kosef
  • Kaist
  • LG Electronics
  • CCF-DBS

Publisher

VLDB Endowment

Cited By

  • (2009) Towards an integrated framework for querying collection of heterogeneous data. Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, pages 51-57. DOI: 10.1145/1516241.1516252. Online publication date: 15-Feb-2009
  • (2007) Towards a novel desktop search technique. Proceedings of the 18th international conference on Database and Expert Systems Applications, pages 192-201. DOI: 10.5555/2395856.2395883. Online publication date: 3-Sep-2007
  • (2007) OOXsearch. Proceedings of the 24th British national conference on Databases, pages 82-100. DOI: 10.5555/1770274.1770286. Online publication date: 3-Jul-2007
  • (2007) Efficient keyword search over data-centric XML documents. Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management, pages 491-502. DOI: 10.5555/1769708.1769772. Online publication date: 16-Jun-2007
  • (2007) Effective keyword search for valuable lcas over xml documents. Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, pages 31-40. DOI: 10.1145/1321440.1321447. Online publication date: 6-Nov-2007
