
Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning

Published: 19 July 2018

Abstract

Recommender systems play a crucial role in mitigating information overload by suggesting personalized items or services to users. The vast majority of traditional recommender systems treat recommendation as a static process and follow a fixed strategy. In this paper, we propose a novel recommender system that continuously improves its strategy during its interactions with users. We model the sequential interactions between users and the recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategy by recommending items in a trial-and-error manner and receiving reinforcement signals from users' feedback on these items. Users' feedback can be positive or negative, and both types have great potential to improve recommendations. However, negative feedback is far more abundant than positive feedback, so incorporating both simultaneously is challenging: the positive signal can easily be buried by the negative one. In this paper, we develop a novel approach to incorporate both types of feedback into the proposed deep recommender system (DEERS) framework. Experimental results on real-world e-commerce data demonstrate the effectiveness of the proposed framework, and further experiments examine the importance of both positive and negative feedback in recommendations.
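The abstract describes recommendation as an MDP solved with a deep Q-network whose state tracks both positively and negatively received items. Below is a minimal, hypothetical Python sketch of that idea; the names (PairwiseQNetwork, td_update), the separate GRU encoders for positive and negative histories, and all dimensions and hyperparameters are illustrative assumptions for exposition, not the authors' DEERS implementation.

# Hypothetical sketch of a deep Q-network over an MDP whose state keeps
# separate traces of items the user responded to positively and negatively.
# Shapes, names, and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn

EMB_DIM = 32    # dimensionality of item embeddings (assumed)
HIST_LEN = 10   # number of recent positive/negative items in the state (assumed)


class PairwiseQNetwork(nn.Module):
    """Q(s, a) where the state s = (positive-feedback items, negative-feedback items)."""

    def __init__(self, emb_dim=EMB_DIM, hidden=64):
        super().__init__()
        # Separate encoders so the scarcer positive signal is not buried
        # by the much more abundant negative one.
        self.pos_encoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.neg_encoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.q_head = nn.Sequential(
            nn.Linear(2 * hidden + emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pos_items, neg_items, candidate_item):
        # pos_items, neg_items: (batch, HIST_LEN, EMB_DIM); candidate_item: (batch, EMB_DIM)
        _, h_pos = self.pos_encoder(pos_items)
        _, h_neg = self.neg_encoder(neg_items)
        state = torch.cat([h_pos[-1], h_neg[-1], candidate_item], dim=-1)
        return self.q_head(state).squeeze(-1)  # (batch,) estimated Q-values


def td_update(q_net, target_net, optimizer, batch, gamma=0.9):
    """One temporal-difference step on a batch of (s, a, r, s', a') transitions."""
    pos, neg, action, reward, next_pos, next_neg, next_action = batch
    q_sa = q_net(pos, neg, action)
    with torch.no_grad():
        # Target: reward observed from user feedback plus the discounted
        # value of the follow-up recommendation.
        target = reward + gamma * target_net(next_pos, next_neg, next_action)
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In practice the next action in the target would be chosen greedily over a pool of candidate items rather than supplied in the batch; the sketch keeps it fixed for brevity.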




Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep reinforcement learning
  2. pairwise deep Q-network
  3. recommender system

Qualifiers

  • Research-article

Conference

KDD '18

Acceptance Rates

KDD '18 paper acceptance rate: 107 of 983 submissions (11%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)

Article Metrics

  • Downloads (last 12 months): 659
  • Downloads (last 6 weeks): 69
Reflects downloads up to 23 Sep 2024

Cited By

  • (2024) Cost-aware Offline Safe Meta Reinforcement Learning with Robust In-Distribution Online Task Adaptation. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 743-751. DOI: 10.5555/3635637.3662927. Online publication date: 6-May-2024.
  • (2024) Foresight Distribution Adjustment for Off-policy Reinforcement Learning. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 317-325. DOI: 10.5555/3635637.3662880. Online publication date: 6-May-2024.
  • (2024) Non-Stationary Transformer Architecture: A Versatile Framework for Recommendation Systems. Electronics 13(11), 2075. DOI: 10.3390/electronics13112075. Online publication date: 27-May-2024.
  • (2024) Reinforcement Learning-Based Dynamic Order Recommendation for On-Demand Food Delivery. Tsinghua Science and Technology 29(2), 356-367. DOI: 10.26599/TST.2023.9010041. Online publication date: Apr-2024.
  • (2024) A social image recommendation system based on deep reinforcement learning. PLOS ONE 19(4), e0300059. DOI: 10.1371/journal.pone.0300059. Online publication date: 4-Apr-2024.
  • (2024) M3Rec: A Context-Aware Offline Meta-Level Model-Based Reinforcement Learning Approach for Cold-Start Recommendation. ACM Transactions on Information Systems 42(6), 1-27. DOI: 10.1145/3659947. Online publication date: 19-Aug-2024.
  • (2024) Adapting Job Recommendations to User Preference Drift with Behavioral-Semantic Fusion Learning. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1004-1015. DOI: 10.1145/3637528.3671759. Online publication date: 25-Aug-2024.
  • (2024) Modeling User Retention through Generative Flow Networks. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5497-5508. DOI: 10.1145/3637528.3671531. Online publication date: 25-Aug-2024.
  • (2024) Future Impact Decomposition in Request-level Recommendations. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5905-5916. DOI: 10.1145/3637528.3671506. Online publication date: 25-Aug-2024.
  • (2024) EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 977-987. DOI: 10.1145/3626772.3657868. Online publication date: 10-Jul-2024.
