AMRank: An adversarial Markov ranking model combining short- and long-term returns

Published: 01 January 2023

Abstract

Learning to rank (LTR) is a method of ranking search results using machine learning techniques. Reinforcement-learning-based ranking models have recently achieved some success on LTR tasks. However, these models suffer from drawbacks such as high-variance gradient estimates and training inefficiency, which pose great challenges to the convergence and accuracy of the ranking model. Combining short- and long-term returns, this paper proposes AMRank, an adversarial Markov ranking model that is based on reinforcement learning and formalizes the ranking task as a Markov decision process. To address the aforementioned weaknesses, AMRank introduces a sequence discriminator that outputs a long-term return with smaller variance and enables single-step updates, together with a document discriminator that yields a short-term return. The two discriminators are trained simultaneously before the decision is made. During training, the policy network serves as the generator, sampling candidate documents to produce negative samples. At each decision step, the discriminators output returns based on the environment state and the policy, and the parameters of the policy network are then updated using the policy gradient method. Experimental results on three LETOR benchmark datasets, OHSUMED, MQ2007 and MQ2008, demonstrate that the proposed AMRank outperforms the baseline models on the document ranking task.
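
To make the training procedure concrete, below is a minimal sketch, in PyTorch-style Python, of one MDP ranking episode with a policy network acting as the generator and two discriminators supplying short- and long-term returns. All class names, feature dimensions, the GRU-based sequence discriminator, and the weighted combination of the two returns are illustrative assumptions rather than the paper's exact architecture, and adversarial training of the discriminators themselves is omitted for brevity.

```python
# Minimal sketch of an adversarial MDP ranking episode in the spirit of AMRank.
# All module names, dimensions, and the way short- and long-term returns are
# combined are illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn

FEAT_DIM, HIDDEN = 46, 64  # LETOR-style feature size (assumed)


class PolicyNet(nn.Module):
    """Generator: scores the remaining candidate documents at each step."""

    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(FEAT_DIM, HIDDEN), nn.ReLU(),
                                 nn.Linear(HIDDEN, 1))

    def forward(self, docs):                 # docs: (n_candidates, FEAT_DIM)
        return self.mlp(docs).squeeze(-1)    # unnormalized scores, shape (n,)


class DocDiscriminator(nn.Module):
    """Yields a short-term return for the single document just selected."""

    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(FEAT_DIM, HIDDEN), nn.ReLU(),
                                 nn.Linear(HIDDEN, 1), nn.Sigmoid())

    def forward(self, doc):                  # doc: (FEAT_DIM,)
        return self.mlp(doc).squeeze(-1)


class SeqDiscriminator(nn.Module):
    """Yields a long-term return for the partial ranking built so far."""

    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(FEAT_DIM, HIDDEN, batch_first=True)
        self.head = nn.Sequential(nn.Linear(HIDDEN, 1), nn.Sigmoid())

    def forward(self, ranked):               # ranked: (1, t, FEAT_DIM)
        _, h = self.gru(ranked)
        return self.head(h[-1]).squeeze()


def rank_episode(policy, d_doc, d_seq, docs, alpha=0.5):
    """One MDP episode: sample a full ranking, build the policy-gradient loss."""
    remaining = list(range(docs.size(0)))
    ranked, log_probs, returns = [], [], []
    while remaining:
        probs = torch.softmax(policy(docs[remaining]), dim=0)
        idx = torch.multinomial(probs, 1).item()
        log_probs.append(torch.log(probs[idx]))
        ranked.append(remaining.pop(idx))
        # Combine short-term (document) and long-term (sequence) returns.
        r_short = d_doc(docs[ranked[-1]])
        r_long = d_seq(docs[ranked].unsqueeze(0))
        returns.append(alpha * r_short + (1.0 - alpha) * r_long)
    # REINFORCE-style single-step update: each action weighted by its own return.
    return -(torch.stack(log_probs) * torch.stack(returns).detach()).sum()


if __name__ == "__main__":
    policy, d_doc, d_seq = PolicyNet(), DocDiscriminator(), SeqDiscriminator()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    docs = torch.randn(10, FEAT_DIM)         # stand-in candidate documents
    loss = rank_episode(policy, d_doc, d_seq, docs)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"policy loss: {loss.item():.4f}")
```

In this sketch the combined returns are detached so that only the policy network receives gradients in this step; in the full adversarial setup described in the abstract, the two discriminators would be trained simultaneously, using documents sampled by this policy as negative examples.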

Highlights

AMRank, a novel document ranking model that combines MDP and GAN, is proposed.
AMRank employs long- and short-term returns to improve decision making.
A sequence discriminator is presented to generate long-term returns.
AMRank realizes single-step updates and outputs returns with lower variance (see the sketch after this list).
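
For readers who want the update rule spelled out, the following sketch shows one standard way a single-step policy-gradient update could combine the two discriminator outputs; the weighting coefficient and the notation D_doc and D_seq are assumptions, since the abstract does not give the paper's formulas.

```latex
% Illustrative only: \alpha, D_doc and D_seq are assumed notation, not the paper's.
R_t = \alpha \, D_{\mathrm{doc}}(s_t, a_t) + (1 - \alpha)\, D_{\mathrm{seq}}(s_t, a_t),
\qquad
\nabla_{\theta} J(\theta) \approx \sum_{t} R_t \, \nabla_{\theta} \log \pi_{\theta}(a_t \mid s_t).
```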

Published In

Expert Systems with Applications: An International Journal, Volume 211, Issue C, January 2023, 1635 pages

Publisher

Pergamon Press, Inc., United States


Author Tags

1. Document ranking
2. Learning to rank
3. Reinforcement learning
4. Discriminator
