DOI: 10.1145/2463676.2463699
research-article

MESSIAH: missing element-conscious SLCA nodes search in XML data

Published: 22 June 2013

Abstract

Keyword search for smallest lowest common ancestors (SLCAs) in XML data has been widely accepted as a meaningful way to identify matching nodes whose subtrees contain an input set of keywords. Although SLCA and its variants (e.g., MLCA) perform admirably in identifying matching nodes, they perform surprisingly poorly for searches on irregular schemas that have missing elements, that is, (sub)elements that are optional or that appear in some instances of an element type but not all (e.g., a "population" subelement in a "city" element might be optional, appearing when the population is known and absent when it is unknown). In this paper, we generalize the SLCA search paradigm to support queries involving missing elements. Specifically, we propose a novel property called optionality resilience that specifies the desired behaviors of an XML keyword search (XKS) approach for queries involving missing elements. We present two variants of a novel algorithm called MESSIAH (Missing Element-conSciouS hIgh-quality SLCA searcH), both of which are optionality resilient on irregular documents. MESSIAH logically transforms an XML document into a minimal full document where all missing elements are represented as empty elements, i.e., the irregular schema is made "regular", and then employs efficient strategies to identify partial and complete full SLCA nodes (SLCA nodes in the full document) from it. As a result, it generates the same SLCA nodes as any state-of-the-art approach when the query does not involve missing elements, but avoids irrelevant results when missing elements are involved. Our experimental study demonstrates the ability of MESSIAH to produce superior quality search results.
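
To make the missing-element problem concrete, the following is a minimal Python sketch; it is not the MESSIAH algorithm itself. It assumes a toy document in which one "city" lacks its optional "population" subelement, a naive SLCA search (`slca`) that matches keywords against tag names and text, and a hypothetical helper `pad_missing` that physically adds empty elements, whereas the abstract describes a logical transformation. All names and data below are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Toy document (assumed for illustration): Paris has no <population> subelement.
DOC = """
<country>
  <name>France</name>
  <city><name>Paris</name></city>
  <city><name>Lyon</name><population>500000</population></city>
</country>
"""

def terms(node):
    """Keywords a node matches directly: its tag name plus the words of its own text."""
    words = {node.tag.lower()}
    words.update((node.text or "").lower().split())
    return words

def slca(root, keywords):
    """Naive SLCA: nodes whose subtree contains every keyword while no descendant's subtree does."""
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node):
        # Returns (keywords found in this subtree, whether some descendant already covers all).
        found = terms(node) & wanted
        child_covers = False
        for child in node:
            child_found, covers = visit(child)
            found |= child_found
            child_covers = child_covers or covers
        covers_all = found == wanted
        if covers_all and not child_covers:
            results.append(node)
        return found, covers_all

    visit(root)
    return results

def pad_missing(root):
    """Hypothetical one-level padding: give every element an empty child for each
    child tag that other elements with the same tag have. Added elements are not
    padded further, which is enough for this toy document."""
    nodes = list(root.iter())
    child_tags = {}
    for node in nodes:
        child_tags.setdefault(node.tag, set()).update(c.tag for c in node)
    for node in nodes:
        present = {c.tag for c in node}
        for tag in sorted(child_tags[node.tag] - present):
            ET.SubElement(node, tag)

root = ET.fromstring(DOC)
query = ["Paris", "population"]
print([n.tag for n in slca(root, query)])               # ['country']
pad_missing(root)
print([n.findtext("name") for n in slca(root, query)])  # ['Paris']
```

On the original document the query returns the "country" node, which pairs Paris with Lyon's population and is therefore irrelevant. After padding, the search returns the Paris "city" node, whose empty "population" element signals that the value is missing.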


Published In

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013
1322 pages
ISBN: 9781450320375
DOI: 10.1145/2463676
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. full slca
  2. missing elements
  3. optionality resilience
  4. xml keyword search

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'13

Acceptance Rates

SIGMOD '13 Paper Acceptance Rate 76 of 372 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 8
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 12 Feb 2025

Cited By

  • (2024) Temporal JSON Keyword Search. Proceedings of the ACM on Management of Data 2:3 (1-27). DOI: 10.1145/3654980. Online publication date: 30-May-2024
  • (2018) No-but-semantic-match. World Wide Web 21:5 (1223-1257). DOI: 10.1007/s11280-017-0503-8. Online publication date: 1-Sep-2018
  • (2017) ASTERIX. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (1317-1320). DOI: 10.1145/3077136.3084137. Online publication date: 7-Aug-2017
  • (2017) Towards heterogeneous keyword search. Proceedings of the ACM Turing 50th Celebration Conference - China (1-6). DOI: 10.1145/3063955.3064802. Online publication date: 12-May-2017
  • (2017) A review on XML keyword query processing. 2017 International Conference on Innovative Mechanisms for Industry Applications (ICIMIA) (238-241). DOI: 10.1109/ICIMIA.2017.7975610. Online publication date: Feb-2017
  • (2017) Effective XML keyword query processing. 2017 International conference of Electronics, Communication and Aerospace Technology (ICECA) (523-528). DOI: 10.1109/ICECA.2017.8203739. Online publication date: Apr-2017
  • (2016) Top-Down XML Keyword Query Processing. IEEE Transactions on Knowledge and Data Engineering 28:5 (1340-1353). DOI: 10.1109/TKDE.2016.2516536. Online publication date: 1-May-2016
  • (2015) Enabling generic keyword search over raw XML data. 2015 IEEE 31st International Conference on Data Engineering (1496-1499). DOI: 10.1109/ICDE.2015.7113410. Online publication date: Apr-2015
  • (2014) Querying virtual hierarchies using virtual prefix-based numbers. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (791-802). DOI: 10.1145/2588555.2610506. Online publication date: 18-Jun-2014
  • (2014) Group-by and Aggregate Functions in XML Keyword Search. Database and Expert Systems Applications (105-121). DOI: 10.1007/978-3-319-10073-9_10. Online publication date: 2014