Open access

Ad click prediction: a view from the trenches

Authors:

H. Brendan McMahan,

Eugene Davydov,

Daniel Golovin,

Sharat Chikkerur,

Martin Wattenberg,

Arnar Mar Hrafnkelsson,

Jeremy Kubica

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 1222 - 1230

https://doi.org/10.1145/2487575.2488200

Published: 11 August 2013

Abstract

Predicting ad click-through rates (CTR) is a massive-scale learning problem that is central to the multi-billion dollar online advertising industry. We present a selection of case studies and topics drawn from recent experiments in the setting of a deployed CTR prediction system. These include improvements in the context of traditional supervised learning based on an FTRL-Proximal online learning algorithm (which has excellent sparsity and convergence properties) and the use of per-coordinate learning rates.
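
To make the idea of FTRL-Proximal with per-coordinate learning rates more concrete, the following is a minimal sketch of one commonly described formulation for sparse logistic regression. The class name `FTRLProximal`, the hyperparameter names `alpha`, `beta`, `l1`, `l2`, and their default values are illustrative assumptions and are not taken from the abstract; a deployed system would differ in many respects.

```python
import math


class FTRLProximal:
    """Sketch of FTRL-Proximal for logistic regression on sparse features.

    All names and default values here are illustrative assumptions.
    Features are passed as a dict mapping feature id -> value.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-coordinate accumulated (adjusted) gradients
        self.n = {}  # per-coordinate sum of squared gradients

    def weight(self, i):
        # Weights are derived lazily from z and n; L1 yields exact zeros.
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        n = self.n.get(i, 0.0)
        # Per-coordinate step size shrinks as sqrt(n) grows.
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of log loss for coordinate i
            n = self.n.get(i, 0.0)
            w = self.weight(i)  # weight before this update
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * w
            self.n[i] = n + g * g
        return p
```

In this sketch a coordinate's weight stays exactly zero until its accumulated gradient exceeds `l1` in magnitude, which is where the sparsity comes from, and each coordinate's effective learning rate depends only on its own gradient history.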

We also explore some of the challenges that arise in a real-world system that may appear at first to be outside the domain of traditional machine learning research. These include useful tricks for memory savings, methods for assessing and visualizing performance, practical methods for providing confidence estimates for predicted probabilities, calibration methods, and methods for automated management of features. Finally, we also detail several directions that did not turn out to be beneficial for us, despite promising results elsewhere in the literature. The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system.

Index Terms

Ad click prediction: a view from the trenches
1. Computing methodologies
  1. Machine learning

Information & Contributors

Information

Published In

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2013

1534 pages

ISBN: 9781450321747

DOI: 10.1145/2487575

Editors:
Rayid Ghani, University of Chicago
Ted E. Senator, SAIC
Paul Bradley, MethodCare, Inc.
Rajesh Parekh, Groupon
Jingrui He, Stevens Institute of Technology

General Chairs:
Robert L. Grossman, University of Chicago and Open Data Group
Ramasamy Uthurusamy, General Motors Corporation (retired)

Program Chairs:
Inderjit S. Dhillon, University of Texas
Yehuda Koren, Google

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '13: The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 11 - 14, 2013

Chicago, Illinois, USA

Acceptance Rates

KDD '13 Paper Acceptance Rate 125 of 726 submissions, 17%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Total Citations: 586
Total Downloads: 14,472
Downloads (Last 12 months): 2,720
Downloads (Last 6 weeks): 199

Reflects downloads up to 12 Nov 2024
