research-article

Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning

Authors:

Kuang-Chih Lee

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

Pages 2792 - 2801

https://doi.org/10.1145/3447548.3467089

Published: 14 August 2021

Abstract

Modern online advertising systems inevitably rely on personalization methods such as click-through rate (CTR) prediction. Recent progress in CTR prediction exploits the rich representation capabilities of deep learning and has achieved great success in large-scale industrial applications. However, these methods can suffer from a lack of exploration. Another line of prior work addresses the exploration-exploitation trade-off with contextual bandit methods, which have recently received less attention in industry because they are difficult to extend with the flexibility of deep models. In this paper, we propose a novel Deep Uncertainty-Aware Learning (DUAL) method to learn CTR models based on Gaussian processes, which provides predictive uncertainty estimates while maintaining the flexibility of deep neural networks. DUAL can be easily implemented on existing models and deployed in real-time systems with minimal extra computational overhead. By linking the predictive uncertainty estimates of DUAL to well-known bandit algorithms, we further present DUAL-based ad-ranking strategies to boost long-term utilities such as social welfare in advertising systems. Experimental results on several public datasets demonstrate the effectiveness of our methods. Remarkably, an online A/B test deployed on the Alibaba display advertising platform shows an 8.2% social welfare improvement and an 8.0% revenue lift.
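To illustrate how a predictive uncertainty estimate can be linked to a bandit-style ranking rule, the following is a minimal sketch and not the paper's implementation. It assumes each candidate ad already comes with a predicted CTR mean and standard deviation on the probability scale, and that ads are ranked by bid times CTR. The function names (`ucb_scores`, `thompson_scores`, `rank`), the `kappa` parameter, and the Gaussian treatment of the CTR are all hypothetical choices made for this example.

```python
import random


def _clip01(x):
    """Clamp a value into [0, 1] so it remains a valid probability."""
    return min(1.0, max(0.0, x))


def ucb_scores(ctr_mean, ctr_std, bids, kappa=1.0):
    """Optimistic score per ad: bid * (mean + kappa * std).

    kappa is an assumed exploration weight; kappa=0 gives greedy ranking.
    """
    return [b * _clip01(m + kappa * s)
            for m, s, b in zip(ctr_mean, ctr_std, bids)]


def thompson_scores(ctr_mean, ctr_std, bids, rng=None):
    """Sampled score per ad: bid * one draw from N(mean, std^2).

    The Gaussian on the CTR scale is a simplifying assumption of this sketch.
    """
    if rng is None:
        rng = random.Random()
    return [b * _clip01(rng.gauss(m, s))
            for m, s, b in zip(ctr_mean, ctr_std, bids)]


def rank(scores):
    """Return ad indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


if __name__ == "__main__":
    # Ad 0 is well explored; ad 1 has a lower mean but higher uncertainty.
    means, stds, bids = [0.05, 0.04], [0.0, 0.03], [1.0, 1.0]
    print(rank(ucb_scores(means, stds, bids, kappa=0.0)))  # [0, 1]
    print(rank(ucb_scores(means, stds, bids, kappa=1.0)))  # [1, 0]
    print(rank(thompson_scores(means, stds, bids)))        # varies per draw
```

With `kappa=0.0` the ranking is purely exploitative. A positive `kappa`, or a sampled score, lets an uncertain ad occasionally outrank a well-explored one, which is the exploration behavior the abstract refers to.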

Supplementary Material

MP4 File (KDD21-ads1760.mp4)

Presentation video

Index Terms

Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Kernel methods
        Gaussian processes
2. Information systems
  1. Information retrieval
    1. Users and interactive retrieval
      1. Personalization
  2. World Wide Web
    1. Online advertising

Information & Contributors

Information

Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

August 2021

4259 pages

ISBN:9781450383325

DOI:10.1145/3447548

General Chairs: Feida Zhu (Singapore Management University), Beng Chin Ooi (National University of Singapore), Chunyan Miao (Nanyang Technological University)

Program Chairs: Haixun Wang, Iryna Skrypnyk, Wynne Hsu, Sanjay Chawla

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '21

KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2021

Virtual Event, Singapore

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
