DOI: 10.1145/1989323.1989383
research-article

Keyword search over relational databases: a metadata approach

Published: 12 June 2011

Abstract

Keyword queries offer a convenient alternative to traditional SQL for querying relational databases with large, often unknown, schemas and instances. The challenge in answering such queries is to discover their intended semantics, construct the SQL queries that describe them, and use them to retrieve the respective tuples. Existing approaches typically rely on indices built a priori on the database content. This seriously limits their applicability if a priori access to the database content is not possible. Examples include online databases accessed through a web interface, or the sources in information integration systems that operate behind wrappers with specific query capabilities. Furthermore, existing literature has not studied to its full extent the inter-dependencies across the ways the different keywords are mapped into the database values and schema elements. In this work, we describe a novel technique for translating keyword queries into SQL based on the Munkres (a.k.a. Hungarian) algorithm. Our approach not only tackles the above two limitations, but also offers significant improvements in the identification of the semantically meaningful SQL queries that describe the intended keyword query semantics. We provide details of the technique implementation and an extensive experimental evaluation.
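
To make the role of the Munkres (Hungarian) algorithm more concrete, the sketch below treats keyword mapping as an assignment problem: each keyword is assigned to a distinct database term so that the total score is maximal. It is an illustration only, not the authors' implementation. The keywords, the database terms, the integer scores, and the function name `solve_assignment` are all hypothetical, and the abstract does not say how the paper computes its weights.

```python
def solve_assignment(cost):
    """Minimum-cost assignment of rows to distinct columns (Hungarian method).

    cost is a rectangular matrix with len(cost) <= len(cost[0]).
    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (m + 1)      # column potentials
    p = [0] * (m + 1)      # p[j] = row matched to column j (0 = free)
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Hypothetical keyword query and database terms (not taken from the paper).
keywords = ["Washington", "city", "Lincoln"]
terms = ["Person.Name", "City", "City.Name", "Person.Phone"]

# Hypothetical scores (0-100): how plausibly each keyword maps to each term.
weights = [
    [90, 10, 80, 20],   # "Washington"
    [10, 95, 30, 0],    # "city"
    [85, 20, 30, 10],   # "Lincoln"
]

# The solver minimizes, so negate the scores to maximize the total.
cost = [[-w for w in row] for row in weights]
assignment = solve_assignment(cost)
total = sum(weights[i][j] for i, j in enumerate(assignment))

for i, j in enumerate(assignment):
    print(f"{keywords[i]!r} -> {terms[j]} (score {weights[i][j]})")
print("total score:", total)
```

With these made-up scores, picking the best term for "Washington" in isolation (Person.Name, 90) would force "Lincoln" onto a weak match, whereas the joint assignment maps "Washington" to City.Name and "Lincoln" to Person.Name for a higher total. This is one way to picture the inter-dependencies among keyword mappings that the abstract refers to. A library routine such as SciPy's `scipy.optimize.linear_sum_assignment` solves the same problem.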

      Published In

      SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
      June 2011
      1364 pages
      ISBN: 9781450306614
      DOI: 10.1145/1989323
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. intensional knowledge
      2. metadata
      3. relational databases
      4. semantic keyword search

      Qualifiers

      • Research-article

      Conference

      SIGMOD/PODS '11

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%

      Cited By

      • (2023) A Variational Neural Architecture for Skill-based Team Formation. ACM Transactions on Information Systems 42:1 (1-28). DOI: 10.1145/3589762. Online publication date: 18-Aug-2023
      • (2023) A Model of Contextual Factors Affecting Older Adults’ Information-Sharing Decisions in the U.S. ACM Transactions on Computer-Human Interaction 30:1 (1-48). DOI: 10.1145/3557888. Online publication date: 4-Apr-2023
      • (2023) Historically Informed HCI: Reflecting on Contemporary Technology through Anachronistic Fiction. ACM Transactions on Computer-Human Interaction 29:6 (1-39). DOI: 10.1145/3517144. Online publication date: 4-Apr-2023
      • (2023) PyLatheDB - A Library for Relational Keyword Search with Support to Schema References. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (3627-3630). DOI: 10.1109/ICDE55515.2023.00284. Online publication date: Apr-2023
      • (2023) Supporting Schema References in Keyword Queries Over Relational Databases. IEEE Access 11 (92365-92390). DOI: 10.1109/ACCESS.2023.3308908. Online publication date: 2023
      • (2022) Data-Driven Web APIs Recommendation for Building Web Applications. IEEE Transactions on Big Data 8:3 (685-698). DOI: 10.1109/TBDATA.2020.2975587. Online publication date: 1-Jun-2022
      • (2022) Neural Network Accelerated Tuple Search For Relational Data. 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI) (81-82). DOI: 10.1109/IRI54793.2022.00029. Online publication date: Aug-2022
      • (2021) Keyword search over schema-less RDF datasets by SPARQL query compilation. Information Systems 102:C. DOI: 10.1016/j.is.2021.101814. Online publication date: 1-Dec-2021
      • (2021) Quantum-Inspired Keyword Search on Multi-model Databases. Database Systems for Advanced Applications (585-602). DOI: 10.1007/978-3-030-73197-7_39. Online publication date: 6-Apr-2021
      • (2020) How the Quantum-inspired Framework Supports Keyword Searches on Multi-model Databases. Proceedings of the 29th ACM International Conference on Information & Knowledge Management (3257-3260). DOI: 10.1145/3340531.3418508. Online publication date: 19-Oct-2020
