Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1807085.1807121acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

Understanding queries in a search database system

Published: 06 June 2010 Publication History

Abstract

It is well known that a search engine can significantly benefit from an auxiliary database, which can suggest interpretations of the search query by means of the involved concepts and their interrelationship. The difficulty is to translate abstract notions like concept and interpretation into a concrete search algorithm that operates over the auxiliary database. To surpass existing heuristics, there is a need for a formal basis, which is realized in this paper through the framework of a search database system, where an interpretation is identified as a parse. It is shown that the parses of a query can be generated in polynomial time in the combined size of the input and the output, even if parses are restricted to those having a nonempty evaluation. Identifying that one parse is more specific than another is important for ranking answers, and this framework captures the precise semantics of being more specific; moreover, performing this comparison between parses is tractable. Lastly, the paper studies the problem of finding the most specific parses. Unfortunately, this problem turns out to be intractable in the general case. However, under reasonable assumptions, the parses can be enumerated in an order of decreasing specificity, with polynomial delay and polynomial space.

References

[1]
B. Aditya, G. Bhalotia, S. Chakrabarti, A. Hulgeri, C. Nakhe, Parag, and S. Sudarshan. BANKS: Browsing and keyword searching in relational databases. In VLDB, pages 1083--1086. Morgan Kaufmann, 2002.
[2]
S. Amer-Yahia, S. Cho, L. V. S. Lakshmanan, and D. Srivastava. Tree pattern query minimization. VLDB J., 11(4):315--331, 2002.
[3]
J. Bear, D. J. Israel, J. Petit, and D. L. Martin. Using information extraction to improve document retrieval. In TREC, pages 367--377, 1997.
[4]
A. Z. Broder. A taxonomy of Web search. SIGIR Forum, 36(2):3--10, 2002.
[5]
Y. Chen, W. Wang, Z. Liu, and X. Lin. Keyword search on structured and semi-structured data. In SIGMOD Conference, pages 1005--1010. ACM, 2009.
[6]
M. R. Garey and D. S. Johnson. Two-processor scheduling with start-times and deadlines. SIAM J. Comput., 6(3):416--426, 1977.
[7]
M. R. Garey and D. S. Johnson. "Strong" NP-completeness results: Motivation, examples, and implications. J. ACM, 25(3):499--508, 1978.
[8]
K. Golenberg, B. Kimelfeld, and Y. Sagiv. Keyword proximity search in complex data graphs. In SIGMOD Conference, pages 927--940. ACM, 2008.
[9]
M. A. Hearst. Direction-based text interpretation as an information access refinement. In P. S. Jacobs, editor, Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, pages 257--274. Erlbaum, Hillsdale, 1992.
[10]
V. Hristidis and Y. Papakonstantinou. DISCOVER: Keyword search in relational databases. In VLDB, pages 670--681. Morgan Kaufmann, 2002.
[11]
D. Johnson, M. Yannakakis, and C. Papadimitriou. On generating all maximal independent sets. Information Processing Letters, 27:119--123, 1988.
[12]
V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB, pages 505--516. ACM, 2005.
[13]
E. Kandogan, R. Krishnamurthy, S. Raghavan, S. Vaithyanathan, and H. Zhu. Avatar semantic search: a database approach to information retrieval. In SIGMOD Conference, pages 790--792. ACM, 2006.
[14]
B. Kimelfeld and Y. Sagiv. Finding and approximating top-k answers in keyword proximity search. In PODS, pages 173--182. ACM, 2006.
[15]
B. Kimelfeld and Y. Sagiv. Efficiently enumerating results of keyword search over data graphs. Inf. Syst., 33(4-5):335--359, 2008.
[16]
B. Klimt and Y. Yang. Introducing the Enron corpus. In CEAS, 2004.
[17]
R. Krishnamurthy, Y. Li, S. Raghavan, F. Reiss, S. Vaithyanathan, and H. Zhu. SystemT: a system for declarative information extraction. SIGMOD Record, 37(4):7--13, 2008.
[18]
E. L. Lawler. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18:401--405, 1972.
[19]
D. D. Lewis. Text representation for intelligent text retrieval: A classification--oriented view. In P. S. Jacobs, editor, Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, pages 179--197. Erlbaum, Hillsdale, 1992.
[20]
Y. Li, R. Krishnamurthy, S.Vaithyanathan, and H. V. Jagadish. Getting work done on the Web: supporting transactional queries. In SIGIR, pages 557--564. ACM, 2006.
[21]
Y. Luo, W. Wang, and X. Lin. SPARK: A keyword search engine on relational databases. In ICDE, pages 1552--1555. IEEE, 2008.
[22]
K. G. Murty. An algorithm for ranking all the assignments in order of increasing costs. Operations Research, 16:682--687, 1968.
[23]
B. Pang and L. Lee. Using very simple statistics for review search: An exploration. In Proceedings of COLING: Companion volume: Posters, pages 73--76, 2008.
[24]
G. Petasis, V. Karkaletsis, G. Paliouras, and C. D. Spyropoulos. Learning context-free grammars to extract relations from text. In ECAI, volume 178 of Frontiers in Artificial Intelligence and Applications, pages 303--307. IOS Press, 2008.
[25]
L. Qin, J. X. Yu, and L. Chang. Keyword search in databases: the power of RDBMS. In SIGMOD Conference, pages 681--694. ACM, 2009.
[26]
Y. Qiu and H.-P. Frei. Concept based query expansion. In SIGIR, pages 160--169. ACM, 1993.
[27]
F. Reiss, S. Raghavan, R. Krishnamurthy, H. Zhu, and S. Vaithyanathan. An algebraic approach to rule-based information extraction. In ICDE, pages 933--942. IEEE, 2008.
[28]
K. Sparck Jones. Assumptions and issues in text-based retrieval. In P. S. Jacobs, editor, Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, pages 157--177. Erlbaum, Hillsdale, 1992.
[29]
J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, 17:712--716, 1971.
[30]
H. Zhu, S. Raghavan, S. Vaithyanathan, and A. Löser. Navigating the intranet with high precision. In WWW, pages 491--500. ACM, 2007.

Cited By

View all
  • (2019)Automated Resume Evaluation System using NLP2019 International Conference on Advances in Computing, Communication and Control (ICAC3)10.1109/ICAC347590.2019.9036842(1-4)Online publication date: Dec-2019
  • (2016)Explicit Query Interpretation and Diversification for Context-Driven Concept Search Across OntologiesThe Semantic Web – ISWC 201610.1007/978-3-319-46523-4_17(271-288)Online publication date: 17-Oct-2016
  • (2015)Cost-Effective Conceptual Design for Information ExtractionACM Transactions on Database Systems10.1145/271632140:2(1-39)Online publication date: 30-Jun-2015
  • Show More Cited By

Index Terms

  1. Understanding queries in a search database system

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    PODS '10: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
    June 2010
    350 pages
    ISBN:9781450300339
    DOI:10.1145/1807085
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 June 2010

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. auxiliary database
    2. search database system

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '10
    Sponsor:
    SIGMOD/PODS '10: International Conference on Management of Data
    June 6 - 11, 2010
    Indiana, Indianapolis, USA

    Acceptance Rates

    PODS '10 Paper Acceptance Rate 27 of 113 submissions, 24%;
    Overall Acceptance Rate 642 of 2,707 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)15
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 22 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2019)Automated Resume Evaluation System using NLP2019 International Conference on Advances in Computing, Communication and Control (ICAC3)10.1109/ICAC347590.2019.9036842(1-4)Online publication date: Dec-2019
    • (2016)Explicit Query Interpretation and Diversification for Context-Driven Concept Search Across OntologiesThe Semantic Web – ISWC 201610.1007/978-3-319-46523-4_17(271-288)Online publication date: 17-Oct-2016
    • (2015)Cost-Effective Conceptual Design for Information ExtractionACM Transactions on Database Systems10.1145/271632140:2(1-39)Online publication date: 30-Jun-2015
    • (2013)Efficient parsing-based search over structured dataProceedings of the 22nd ACM international conference on Information & Knowledge Management10.1145/2505515.2505764(49-58)Online publication date: 27-Oct-2013
    • (2013)Learning joint query interpretation and response rankingProceedings of the 22nd international conference on World Wide Web10.1145/2488388.2488484(1099-1110)Online publication date: 13-May-2013
    • (2012)Predicting the effectiveness of keyword queries on databasesProceedings of the 21st ACM international conference on Information and knowledge management10.1145/2396761.2398422(1213-1222)Online publication date: 29-Oct-2012
    • (2012)Interpreting keyword queries over web knowledge basesProceedings of the 21st ACM international conference on Information and knowledge management10.1145/2396761.2396803(305-314)Online publication date: 29-Oct-2012
    • (2012)Automatic suggestion of query-rewrite rules for enterprise searchProceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval10.1145/2348283.2348363(591-600)Online publication date: 12-Aug-2012
    • (2012)Entity Synonyms for Structured Web SearchIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2011.16824:10(1862-1875)Online publication date: 1-Oct-2012
    • (2011)Serving information needs in business process consultingProceedings of the 9th international conference on Business process management10.5555/2040283.2040305(231-247)Online publication date: 30-Aug-2011
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media