DOI: 10.1145/1645953.1646078
research-article

Language-model-based ranking for queries on RDF-graphs

Published: 02 November 2009

Abstract

The success of knowledge-sharing communities like Wikipedia and the advances in automatic information extraction from textual and Web sources have made it possible to build large "knowledge repositories" such as DBpedia, Freebase, and YAGO. These collections can be viewed as graphs of entities and relationships (ER graphs) and can be represented as a set of subject-property-object (SPO) triples in the Semantic-Web data model RDF. Queries can be expressed in the W3C-endorsed SPARQL language or by similarly designed graph-pattern search. However, exact-match query semantics often fall short of satisfying the users' needs by returning too many or too few results. Therefore, IR-style ranking models are crucially needed.
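To illustrate the SPO-triple view and exact-match graph-pattern semantics described above, here is a minimal sketch in Python. It is not the paper's system or a SPARQL engine: the property names (`bornIn`, `hasWonPrize`, `locatedIn`), the tiny triple set, and the helper functions `match_pattern` and `evaluate` are all hypothetical and chosen only for illustration. Terms starting with `?` are treated as variables.

```python
# Hypothetical toy knowledge base: each fact is a (subject, property, object) triple.
triples = [
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Albert_Einstein", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Max_Planck", "bornIn", "Kiel"),
    ("Max_Planck", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Ulm", "locatedIn", "Germany"),
]


def match_pattern(pattern, triple, bindings):
    """Extend `bindings` so that `pattern` matches `triple`; return None on a mismatch."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier, if any.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constants must match exactly (exact-match semantics).
            return None
    return extended


def evaluate(patterns, triples):
    """Evaluate a conjunction of triple patterns; return all consistent variable bindings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return solutions


# Physics Nobel laureates born in Ulm.
query = [
    ("?p", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("?p", "bornIn", "Ulm"),
]
answers = evaluate(query, triples)

# An overly specific pattern returns nothing under exact matching.
no_answers = evaluate([("?p", "bornIn", "Berlin")], triples)
```

The second query shows the "too few results" case: exact matching offers no partial credit, which is the kind of situation where relaxation and ranking become useful.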
In this paper, we propose a language-model-based approach to ranking the results of exact, relaxed and keyword-augmented graph pattern queries over RDF graphs such as ER graphs. Our method estimates a query model and a set of result-graph models and ranks results based on their Kullback-Leibler divergence with respect to the query model. We demonstrate the effectiveness of our ranking model by a comprehensive user study.
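The ranking idea in the paragraph above, scoring each result by the Kullback-Leibler divergence between the query model and that result's model, can be sketched as follows. This is only an illustration under stated assumptions, not the estimation procedure from the paper: the models are plain dictionaries over hypothetical events `t1`, `t2`, `t3`, the probabilities are invented, and the linear interpolation with a background distribution (in the style of Jelinek-Mercer smoothing) is an assumption made here so that the divergence stays finite.

```python
import math


def kl_divergence(query_model, result_model):
    """KL(Q || R), summed over events that have non-zero probability in the query model."""
    return sum(
        q * math.log(q / result_model[event])
        for event, q in query_model.items()
        if q > 0.0
    )


def smooth(model, background, lam=0.5):
    """Interpolate a model with a background distribution (an assumption of this sketch)."""
    return {
        event: lam * model.get(event, 0.0) + (1.0 - lam) * p_bg
        for event, p_bg in background.items()
    }


def rank_results(query_model, result_models, background, lam=0.5):
    """Return result names ordered by increasing divergence from the query model."""
    scored = [
        (kl_divergence(query_model, smooth(model, background, lam)), name)
        for name, model in result_models.items()
    ]
    return [name for _, name in sorted(scored)]


# Invented toy distributions over three hypothetical events.
background = {"t1": 0.5, "t2": 0.3, "t3": 0.2}
query_model = {"t1": 0.7, "t2": 0.3}
result_models = {
    "r1": {"t1": 0.8, "t2": 0.2},  # close to the query model
    "r2": {"t3": 1.0},             # concentrated on an event the query ignores
}

ranking = rank_results(query_model, result_models, background)
```

A smaller divergence means the result model is closer to the query model, so results are listed in ascending order of divergence. The background distribution must assign non-zero probability to every event in the query model, otherwise the logarithm is undefined.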

Published In

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RDF
  2. entity
  3. language
  4. model
  5. ranking
  6. relationship
  7. search
  8. semantic

Qualifiers

  • Research-article

Conference

CIKM '09

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 13
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 25 Dec 2024

Cited By

  • (2023) Near-optimal Steiner tree computation powered by node embeddings. Knowledge and Information Systems 65:11 (4563-4583). DOI: 10.1007/s10115-023-01893-8. Online publication date: 16-May-2023
  • (2019) Answering why-not questions on SPARQL queries. Knowledge and Information Systems 58:1 (169-208). DOI: 10.1007/s10115-018-1155-4. Online publication date: 1-Jan-2019
  • (2018) Entity Retrieval in the Knowledge Graph with Hierarchical Entity Type and Content. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval (211-214). DOI: 10.1145/3234944.3234963. Online publication date: 10-Sep-2018
  • (2018) Towards Annotating Relational Data on the Web with Language Models. Proceedings of the 2018 World Wide Web Conference (1307-1316). DOI: 10.1145/3178876.3186029. Online publication date: 10-Apr-2018
  • (2018) Applications of Flexible Querying to Graph Data. Graph Data Management (97-142). DOI: 10.1007/978-3-319-96193-4_4. Online publication date: 1-Nov-2018
  • (2018) Template-Based SPARQL Query and Visualization on Knowledge Graphs. Database Systems for Advanced Applications (184-200). DOI: 10.1007/978-3-319-91455-8_17. Online publication date: 12-May-2018
  • (2018) Towards Empty Answers in SPARQL: Approximating Querying with RDF Embedding. The Semantic Web – ISWC 2018 (513-529). DOI: 10.1007/978-3-030-00671-6_30. Online publication date: 18-Sep-2018
  • (2017) Relaxation of keyword pattern graphs on RDF Data. Journal of Web Engineering 16:5-6 (363-398). DOI: 10.5555/3177589.3177591. Online publication date: 1-Sep-2017
  • (2017) Relaxing Graph Pattern Matching With Explanations. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (1677-1686). DOI: 10.1145/3132847.3132992. Online publication date: 6-Nov-2017
  • (2017) Intent-Aware Semantic Query Annotation. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (485-494). DOI: 10.1145/3077136.3080825. Online publication date: 7-Aug-2017
