short-paper

MarlRank: Multi-agent Reinforced Learning to Rank

Authors:

Mohammad Akbari,

Peng Zhang

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 2073 - 2076

https://doi.org/10.1145/3357384.3358075

Published: 03 November 2019

Abstract

When estimating the relevance of a document to a query, ranking models largely neglect the mutual information among documents. It is commonly assumed that if two documents are similar with respect to the same query, they are more likely to have similar relevance scores. To address this limitation, we propose a multi-agent reinforced ranking model, named MarlRank. In particular, by treating each document as an agent, we formulate the ranking process as a multi-agent Markov Decision Process (MDP) in which the mutual interactions among documents are incorporated into the ranking process. To compute the ranking list, each document predicts its relevance to a query by considering not only its own query-document features but also the features and actions of its similar documents. By defining the reward as a function of NDCG, we can optimize our model directly on the ranking performance measure. Our experimental results on two LETOR benchmark datasets show that our model achieves significant performance gains over state-of-the-art baselines. We also find that NDCG shows an overall increasing trend as the number of interaction steps grows, which demonstrates that the mutual information among documents helps improve ranking performance.
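The abstract defines the reward as a function of NDCG but does not spell out the formula. As a minimal sketch, the snippet below computes NDCG@k using the widely used exponential-gain, log2-discount formulation; this is an assumption for illustration, not necessarily the exact variant used in the paper. The function names (`dcg_at_k`, `ndcg_at_k`, `rank_by_scores`) are hypothetical helpers and do not come from the paper.

```python
import math
from typing import List, Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k labels, given in ranked order.

    Assumes the common (2^rel - 1) gain and log2(rank + 1) discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)  # position is 0-based
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents: define NDCG as 0
    return dcg_at_k(relevances, k) / ideal


def rank_by_scores(scores: Sequence[float], labels: Sequence[float]) -> List[float]:
    """Reorder ground-truth labels by descending predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


if __name__ == "__main__":
    labels = [2, 0, 1]          # hypothetical graded relevance labels
    scores = [0.9, 0.5, 0.1]    # hypothetical predicted relevance scores
    ranked = rank_by_scores(scores, labels)
    print(ndcg_at_k(ranked, k=3))
```

A value computed this way could serve as a scalar reward for a ranking produced from the agents' predicted scores; how the paper maps NDCG to per-step or per-agent rewards is not stated in the abstract.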

Index Terms

MarlRank: Multi-agent Reinforced Learning to Rank
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Learning to rank
      2. Novelty in information retrieval

Information & Contributors

Information

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019

3373 pages

ISBN:9781450369763

DOI:10.1145/3357384

General Chairs:
Wenwu Zhu, Tsinghua University, China
Dacheng Tao, University of Sydney, Australia
Xueqi Cheng, Institute of Computing Technology, CAS, China

Program Chairs:
Peng Cui, Tsinghua University, China
Elke Rundensteiner, Worcester Polytechnic Institute, USA
David Carmel, Amazon Research, USA
Qi He, LinkedIn, USA
Jeffrey Xu Yu, Chinese University of Hong Kong, China

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

CIKM '19

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Article Metrics

Total Citations: 8
Total Downloads: 313
Downloads (Last 12 months): 37
Downloads (Last 6 weeks): 7

Reflects downloads up to 19 Nov 2024

Citations

Cited By

Padhye V, Lakshmanan K (2023). A deep actor critic reinforcement learning framework for learning to rank. Neurocomputing 547 (126314). Online publication date: Aug-2023.
https://doi.org/10.1016/j.neucom.2023.126314
Peng D, Chen Y (2023). AMRank. Expert Systems with Applications: An International Journal 211:C. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1016/j.eswa.2022.118512
Yu H, Piryani R, Jatowt A, Inagaki R, Joho H, Kim K (2023). An in-depth study on adversarial learning-to-rank. Information Retrieval Journal 26:1. Online publication date: 28-Feb-2023.
https://doi.org/10.1007/s10791-023-09419-0
Wang S, Gao C, Gao M, Yu J, Wang Z, Yin H (2022). Who are the Best Adopters? User Selection Model for Free Trial Item Promotion. IEEE Transactions on Big Data (1-12). Online publication date: 2022.
https://doi.org/10.1109/TBDATA.2022.3205334
Tong X, Wang P, Niu S (2022). Reinforcement learning-based denoising network for sequential recommendation. Applied Intelligence 53:2 (1324-1335). Online publication date: 28-Apr-2022.
https://doi.org/10.1007/s10489-022-03298-6
Yu H, Huang D, Ren F, Li L (2021). Diagnostic Evaluation of Policy-Gradient-Based Ranking. Electronics 11:1 (37). Online publication date: 23-Dec-2021.
https://doi.org/10.3390/electronics11010037
Li R, Urbano J, Hanjalic A, Diaz F, Shah C, Suel T, Castells P, Jones R, Sakai T (2021). New Insights into Metric Optimization for Ranking-based Recommendation. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (932-941). Online publication date: 11-Jul-2021.
https://dl.acm.org/doi/10.1145/3404835.3462973
Galanin D, Bukharaev N, Gusenkov A, Sittikova A (2020). Using Generative Adversarial Networks for Relevance Evaluation of Search Engine Results. 2020 IEEE East-West Design & Test Symposium (EWDTS) (1-7). Online publication date: Sep-2020.
https://doi.org/10.1109/EWDTS50664.2020.9224840
