DOI: 10.1145/3447548.3467089

Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning

Published: 14 August 2021

Abstract

Modern online advertising systems inevitably rely on personalization methods such as click-through rate (CTR) prediction. Recent progress in CTR prediction leverages the rich representation capabilities of deep learning and has achieved great success in large-scale industrial applications. However, these methods can suffer from a lack of exploration. Another line of prior work addresses the exploration-exploitation trade-off with contextual bandit methods, which have recently received less attention in industry because they are difficult to extend with the flexibility of deep models. In this paper, we propose a novel Deep Uncertainty-Aware Learning (DUAL) method that learns CTR models based on Gaussian processes, providing predictive uncertainty estimates while maintaining the flexibility of deep neural networks. DUAL can be easily implemented on top of existing models and deployed in real-time systems with minimal extra computational overhead. By linking DUAL's predictive uncertainty estimates to well-known bandit algorithms, we further present DUAL-based ad-ranking strategies that boost long-term utilities such as social welfare in advertising systems. Experimental results on several public datasets demonstrate the effectiveness of our methods. Remarkably, an online A/B test deployed on the Alibaba display advertising platform shows an 8.2% social welfare improvement and an 8.0% revenue lift.
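
To make the link between predictive uncertainty and bandit-style ranking concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a CTR model already returns a predictive mean and standard deviation for each ad, and that ads are ranked by bid times CTR; both are assumptions made here for illustration. The function names, the `kappa` exploration weight, the Gaussian approximation of the posterior, and the toy numbers are all hypothetical.

```python
import random


def ucb_score(bid, ctr_mean, ctr_std, kappa=1.0):
    """Optimistic score: bid times an upper confidence bound on CTR (capped at 1)."""
    return bid * min(1.0, ctr_mean + kappa * ctr_std)


def thompson_score(bid, ctr_mean, ctr_std, rng):
    """Score from one CTR draw, using a Gaussian approximation clipped to [0, 1]."""
    sample = rng.gauss(ctr_mean, ctr_std)
    return bid * min(1.0, max(0.0, sample))


def rank_ads(ads, strategy="ucb", kappa=1.0, seed=None):
    """Return ads sorted by descending score under the chosen strategy."""
    rng = random.Random(seed)

    def score(ad):
        if strategy == "greedy":
            # Pure exploitation: ignore uncertainty entirely.
            return ad["bid"] * ad["ctr_mean"]
        if strategy == "ucb":
            return ucb_score(ad["bid"], ad["ctr_mean"], ad["ctr_std"], kappa)
        if strategy == "ts":
            return thompson_score(ad["bid"], ad["ctr_mean"], ad["ctr_std"], rng)
        raise ValueError(f"unknown strategy: {strategy}")

    return sorted(ads, key=score, reverse=True)


# Toy candidates (made-up numbers): a well-observed ad and a rarely shown one.
ads = [
    {"id": "established", "bid": 1.0, "ctr_mean": 0.050, "ctr_std": 0.002},
    {"id": "new", "bid": 1.0, "ctr_mean": 0.040, "ctr_std": 0.020},
]

for name in ("greedy", "ucb", "ts"):
    ranking = rank_ads(ads, strategy=name, kappa=1.0, seed=0)
    print(name, [ad["id"] for ad in ranking])
```

With these toy numbers, greedy ranking puts the established ad first (0.050 versus 0.040), while the upper-confidence-bound ranking with `kappa=1.0` puts the uncertain new ad first (0.060 versus 0.052). That is the kind of exploration a point-estimate CTR model cannot express. The Thompson-sampling order depends on the random draw.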

Supplementary Material

MP4 File (KDD21-ads1760.mp4)
Presentation video



Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Gaussian process
  2. advertising system
  3. click-through rate (CTR)
  4. exploration-exploitation trade-off

Qualifiers

  • Research-article

Conference

KDD '21
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (Last 12 months): 44
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 22 Sep 2024

Cited By

  • (2024) Integrating Visual Transformer and Graph Neural Network for Visual Analysis in Digital Marketing. Journal of Organizational and End User Computing 36:1 (1-28). DOI: 10.4018/JOEUC.342092. Online publication date: 9-Apr-2024
  • (2024) Deep Ensemble Shape Calibration: Multi-Field Post-hoc Calibration in Online Advertising. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (6117-6126). DOI: 10.1145/3637528.3671529. Online publication date: 25-Aug-2024
  • (2024) Uncertainty Estimation in Click-Through Rate Prediction with Deep Neural Gaussian Process. 2024 IEEE 6th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) (1758-1762). DOI: 10.1109/IMCEC59810.2024.10575061. Online publication date: 24-May-2024
  • (2024) User Traffic Time Series Uncertainty Estimation with Deep Laplace Ensemble. 2024 7th International Conference on Artificial Intelligence and Big Data (ICAIBD) (72-76). DOI: 10.1109/ICAIBD62003.2024.10604617. Online publication date: 24-May-2024
  • (2023) Adversarial Constrained Bidding via Minimax Regret Optimization with Causality-Aware Reinforcement Learning. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2314-2325). DOI: 10.1145/3580305.3599254. Online publication date: 6-Aug-2023
  • (2023) Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data. 2023 IEEE International Conference on Data Mining Workshops (ICDMW) (147-156). DOI: 10.1109/ICDMW60847.2023.00026. Online publication date: 4-Dec-2023
  • (2022) ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (4021-4031). DOI: 10.1145/3534678.3539211. Online publication date: 14-Aug-2022
  • (2022) Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (4560-4564). DOI: 10.1145/3511808.3557611. Online publication date: 17-Oct-2022
