PSLOG: Pretraining with Search Logs for Document Ranking

Authors: Zhan Su, Zhicheng Dou, Yujia Zhou, Ziyuan Zhao, Ji-Rong Wen

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 2072 - 2082

https://doi.org/10.1145/3580305.3599477

Published: 04 August 2023

Abstract

Recently, pretrained models have achieved remarkable performance not only in natural language processing but also in information retrieval (IR). Previous studies show that IR-oriented pretraining tasks can outperform simply finetuning pretrained language models on IR datasets. Moreover, the massive search logs collected by mainstream search engines can be used in IR pretraining, because they contain users' implicit judgments of document relevance for a concrete query. However, existing methods mainly use direct query-document click signals to pretrain models, and the potential supervision signals in search logs remain far from well explored. In this paper, we propose to comprehensively leverage four query-document (q-d) relevance relations, including co-interaction and multi-hop relations, to pretrain ranking models in IR. Specifically, we focus on users' click behavior and construct an Interaction Graph that represents the global relevance relations between queries and documents across all search logs. With the graph, we can capture co-interaction and multi-hop q-d relationships through neighboring nodes. Based on the relations extracted from the Interaction Graph, we propose four strategies to generate contrastive positive and negative q-d pairs and use these pairs to pretrain ranking models. Experimental results on both industrial and academic datasets demonstrate the effectiveness of our method.
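
The abstract does not spell out the four pair-generation strategies, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes a search log reduced to (query, clicked document) records, builds a bipartite click graph, and derives co-interaction and multi-hop neighbors from it. All function names, the toy log, and the rule that unreachable documents serve as negatives are hypothetical.

```python
from collections import defaultdict

def build_interaction_graph(click_log):
    """Build a bipartite click graph from (query, clicked_doc) records."""
    q2d = defaultdict(set)  # query -> documents clicked for it
    d2q = defaultdict(set)  # document -> queries that led to a click on it
    for query, doc in click_log:
        q2d[query].add(doc)
        d2q[doc].add(query)
    return q2d, d2q

def co_clicked_queries(query, q2d, d2q):
    """Queries sharing at least one clicked document with `query` (co-interaction)."""
    related = set()
    for doc in q2d.get(query, set()):
        related |= d2q.get(doc, set())
    related.discard(query)
    return related

def multi_hop_documents(query, q2d, d2q):
    """Documents reached via query -> doc -> query -> doc, excluding direct clicks."""
    reachable = set()
    for other in co_clicked_queries(query, q2d, d2q):
        reachable |= q2d.get(other, set())
    return reachable - q2d.get(query, set())

def contrastive_pairs(query, q2d, d2q):
    """Assumed rule: direct and multi-hop documents are positives, the rest negatives."""
    positives = q2d.get(query, set()) | multi_hop_documents(query, q2d, d2q)
    negatives = set(d2q) - positives
    return sorted(positives), sorted(negatives)

# Toy click log (hypothetical data for illustration only).
click_log = [
    ("cheap flights", "d1"),
    ("cheap flights", "d2"),
    ("low cost airfare", "d2"),
    ("low cost airfare", "d3"),
    ("python tutorial", "d4"),
]
q2d, d2q = build_interaction_graph(click_log)
positives, negatives = contrastive_pairs("cheap flights", q2d, d2q)
```

In this toy log, "cheap flights" and "low cost airfare" share the clicked document d2, so d3 becomes a multi-hop positive for "cheap flights" even though it was never clicked for that query, while d4 has no connecting path and is treated as a negative.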

Supplementary Material

MP4 File (1296-2min-promo.mp4)

Presentation video

Download
2.32 MB
