DOI: 10.1145/2806416.2806456
research-article

EsdRank: Connecting Query and Documents through External Semi-Structured Data

Published: 17 October 2015

Abstract

This paper presents EsdRank, a new technique for improving ranking using external semi-structured data such as controlled vocabularies and knowledge bases. EsdRank treats vocabularies, terms, and entities from external data as objects connecting query and documents. Evidence used to link the query to objects and to rank documents is incorporated as features between query-object and object-document pairs, respectively. A latent listwise learning-to-rank algorithm, Latent-ListMLE, models the objects as a latent space between query and documents, and learns how to handle all evidence in a unified procedure from document relevance judgments. EsdRank is tested in two scenarios: using a knowledge base for web search, and using a controlled vocabulary for medical search. Experiments on TREC Web Track and OHSUMED data show significant improvements over state-of-the-art baselines.
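
The idea of objects acting as a latent layer between the query and the documents can be illustrated with a small listwise likelihood. The sketch below is one plausible reading, not the paper's exact formulation: the softmax parameterizations, the feature shapes, and the names `U`, `V`, `w`, `theta`, and `latent_listwise_nll` are all assumptions made for illustration. It scores each object against the query, scores each document against each object, and marginalizes over objects when computing the probability of picking the next document in the ranked list.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def latent_listwise_nll(U, V, w, theta):
    """Negative log-likelihood of one ranked list (hypothetical sketch).

    U:     (m, du) query-object features, one row per object.
    V:     (n, m, dv) object-document features; documents are assumed
           to be sorted from most to least relevant.
    w:     (du,) weights for the query-object features (assumed linear).
    theta: (dv,) weights for the object-document features (assumed linear).
    """
    p_obj = softmax(U @ w)   # (m,)  p(object | query)
    scores = V @ theta       # (n, m) document score under each object
    n = scores.shape[0]
    nll = 0.0
    for i in range(n):
        # Probability of choosing document i among the remaining
        # documents i..n-1, computed separately for each object.
        p_doc = softmax(scores[i:], axis=0)[0]   # (m,)
        # Marginalize over the latent objects.
        nll -= np.log(np.dot(p_obj, p_doc))
    return nll
```

Under these assumptions, minimizing this quantity over `w` and `theta` with any gradient-based optimizer would fit both sets of weights from document relevance judgments alone, without labels on the objects.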

    Published In

    CIKM '15: Proceedings of the 24th ACM International Conference on Information and Knowledge Management
    October 2015
    1998 pages
    ISBN:9781450337946
    DOI:10.1145/2806416

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. controlled vocabulary
    2. freebase
    3. knowledge base
    4. learning to rank
    5. mesh
    6. ranking
    7. semi-structured data

    Acceptance Rates

    CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;
    Overall Acceptance Rate 1,466 of 6,316 submissions, 23%

    Cited By

    • (2024) DREQ: Document Re-ranking Using Entity-Based Query Understanding. Advances in Information Retrieval, pages 210-229. DOI: 10.1007/978-3-031-56027-9_13. Online publication date: 24-Mar-2024
    • (2023) Towards Sequential Counterfactual Learning to Rank. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 122-128. DOI: 10.1145/3624918.3625325. Online publication date: 26-Nov-2023
    • (2022) Predicting Guiding Entities for Entity Aspect Linking. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 3848-3852. DOI: 10.1145/3511808.3557671. Online publication date: 17-Oct-2022
    • (2022) Early Stage Sparse Retrieval with Entity Linking. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 4464-4469. DOI: 10.1145/3511808.3557588. Online publication date: 17-Oct-2022
    • (2022) AI-enabled Project Initiation: An approach based on RFP Response Document. Proceedings of the 15th Innovations in Software Engineering Conference, pages 1-5. DOI: 10.1145/3511430.3511450. Online publication date: 24-Feb-2022
    • (2022) Query Interpretations from Entity-Linked Segmentations. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, pages 449-457. DOI: 10.1145/3488560.3498532. Online publication date: 11-Feb-2022
    • (2022) Global Graph Attention Embedding Network for Relation Prediction in Knowledge Graphs. IEEE Transactions on Neural Networks and Learning Systems, 33:11, pages 6712-6725. DOI: 10.1109/TNNLS.2021.3083259. Online publication date: Nov-2022
    • (2022) Knowledge Graph-Based Semantic Ranking for Efficient Semantic Query. 2022 IEEE 10th International Conference on Computer Science and Network Technology (ICCSNT), pages 75-79. DOI: 10.1109/ICCSNT56096.2022.9972953. Online publication date: 22-Oct-2022
    • (2022) ParaGraph: Mapping Wikidata Tail Entities to Wikipedia Paragraphs. 2022 IEEE International Conference on Big Data (Big Data), pages 6008-6017. DOI: 10.1109/BigData55660.2022.10020207. Online publication date: 17-Dec-2022
    • (2022) Augmenting Graph Inductive Learning Model with Topographical Features. Computational Science – ICCS 2022, pages 728-741. DOI: 10.1007/978-3-031-08757-8_60. Online publication date: 21-Jun-2022
