MINDSim: User Simulator for News Recommenders

Authors:

Dongsheng Li

WWW '22: Proceedings of the ACM Web Conference 2022

Pages 2067 - 2077

https://doi.org/10.1145/3485447.3512080

Published: 25 April 2022

Abstract

Recommender systems play an increasingly important role in online news platforms. Recently, there has been growing demand for applying reinforcement learning (RL) algorithms to news recommendation, with the aim of maximizing long-term and/or non-differentiable objectives. However, without an interactive simulated environment, developing powerful RL agents for news recommendation is extremely costly. In this paper, we build a user simulator, MINDSim, for news recommendation. To support new user generation and the corresponding behavior simulation, we first construct a hidden space for users using a generative adversarial network, so that new users can be generated by sampling from this hidden space. To capture complex and fast user interest drift over time, we adopt an encoder-decoder architecture that takes the news clicked during the simulation as input and outputs the new user interests for the next period of time. Finally, we build the MINDSim simulator using the MIcrosoft News Dataset (MIND); extensive experimental results on this large-scale real-world dataset demonstrate that MINDSim can simulate the behaviors of real users with high quality.
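The simulation loop described in the abstract can be illustrated with a small toy sketch. It is not the MINDSim implementation: the abstract gives no architectural details, so the trained GAN generator is replaced here by a fixed nonlinearity over Gaussian noise, the encoder-decoder by a simple moving-average update, and the click model by a sigmoid over a dot product. All names (`sample_user`, `update_interest`, `click_probability`, `simulate`, `DIM`, `rate`) are hypothetical.

```python
import math
import random

DIM = 8  # hypothetical embedding size, not taken from the paper


def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def sample_user(rng, dim=DIM):
    """Stand-in for a trained GAN generator: latent noise -> interest vector."""
    latent = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return normalize([math.tanh(z) for z in latent])


def click_probability(user, news):
    """Assumed click model: sigmoid of the user-news dot product."""
    score = sum(u * n for u, n in zip(user, news))
    return 1.0 / (1.0 + math.exp(-score))


def update_interest(user, clicked, rate=0.3):
    """Stand-in for the encoder-decoder: blend clicked news into the interests."""
    if not clicked:
        return user
    mean = [sum(col) / len(clicked) for col in zip(*clicked)]
    return normalize([(1 - rate) * u + rate * m for u, m in zip(user, mean)])


def simulate(rng, slates):
    """Run one simulated user through a sequence of recommended slates."""
    user = sample_user(rng)
    clicks_per_step = []
    for slate in slates:
        clicked = [n for n in slate if rng.random() < click_probability(user, n)]
        clicks_per_step.append(len(clicked))
        user = update_interest(user, clicked)  # interests drift after each step
    return user, clicks_per_step


if __name__ == "__main__":
    rng = random.Random(0)
    # Three slates of five random unit-length news vectors each (toy data).
    slates = [[normalize([rng.gauss(0, 1) for _ in range(DIM)]) for _ in range(5)]
              for _ in range(3)]
    final_user, clicks = simulate(rng, slates)
    print(clicks)
```

In an RL setting, the recommender agent would choose each slate and use the simulated clicks as its reward signal. In this sketch the slates are fixed in advance to keep the example short.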

Cited By

Moriyoshi K, Shibata H, Takama Y (2024). Generation of Rating Matrix Based on Rational Behaviors of Users. Journal of Advanced Computational Intelligence and Intelligent Informatics 28:1 (129-140). Online publication date: 20-Jan-2024.
https://doi.org/10.20965/jaciii.2024.p0129
Zhang G, Li D, Gu H, Lu T, Shang L, Gu N (2024). Simulating News Recommendation Ecosystems for Insights and Implications. IEEE Transactions on Computational Social Systems 11:5 (5699-5713). Online publication date: Oct-2024.
https://doi.org/10.1109/TCSS.2024.3381329
Wu Y, Macdonald C, Ounis I (2023). Goal-Oriented Multi-Modal Interactive Recommendation with Verbal and Non-Verbal Relevance Feedback. Proceedings of the 17th ACM Conference on Recommender Systems (362-373). Online publication date: 14-Sep-2023.
https://dl.acm.org/doi/10.1145/3604915.3608775

Index Terms

MINDSim: User Simulator for News Recommenders
1. Theory of computation
  1. Theory and algorithms for application domains

Index terms have been assigned to the content through auto-classification.

Published In

WWW '22: Proceedings of the ACM Web Conference 2022

April 2022

3764 pages

ISBN:9781450390965

DOI:10.1145/3485447

Editors:
Frédérique Laforest (INSA Lyon, France)
Raphaël Troncy (EURECOM, France)
Elena Simperl (King’s College London, UK)
Deepak Agarwal (Pinterest, USA)
Aristides Gionis (KTH Royal Institute of Technology, Sweden)
Ivan Herman (W3C / retired)
Lionel Médini (Université Lyon 1, France)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '22

Sponsor:

SIGWEB

WWW '22: The ACM Web Conference 2022

April 25 - 29, 2022

Virtual Event, Lyon, France

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
