Response Prediction for Low-Regret Agents

Saeed Alaei¹¹,
Ashwinkumar Badanidiyuru¹¹,
Mohammad Mahdian¹¹ &
…
Sadra Yazdanbod¹¹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11920))

Included in the following conference series:

International Conference on Web and Internet Economics

1230 Accesses
6 Altmetric

Abstract

Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most f(T) within the first T rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log (T))$, where k is the number of arms.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 59.99; Price excludes VAT (USA)

Softcover Book: USD 74.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Stochastic One-Sided Full-Information Bandit

Asymptotically optimal algorithms for budgeted multiple play bandits

Article 16 May 2019

Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits

Article 29 January 2024

Notes

1.
See [15] for an attempt to solve this problem by restricting the experiment to small micro-markets. Note that this has the obvious disadvantage of biasing the experiment toward a non-representative set of advertisers and auctions.
2.
See [9] for an interesting theoretical treatment of this setting. It turns out that assuming that the advertisers are fully rational and react even to a small change in the auction, even treating a small percentage of each advertiser’s auctions is enough to extrapolate their response to a full treatment. In practice, however, there is too much noise and fluctuation in the system for advertisers to be able to observe and respond to a change that, for example, increases their cost per click by 10% in 1% of their auctions.
3.
In our motivating application, the reward can be the profit the advertiser makes if the user clicks on their ad and makes a purchase, or zero otherwise. In this case, the assumption that the reward distribution is static means that the profit per conversion and the conversion probability are fixed over time. This is not entirely accurate, but is a reasonable approximation of the reality, since while these parameters change over time, they tend to change at a slow pace.

References

Athey, S., Nekipelov, D.: A structural model of sponsored search advertising auctions. In: Sixth Ad Auctions Workshop (2010)
Google Scholar
Athey, S., Haile, P.A.: Nonparametric approaches to auctions. In: Handbook of Econometrics. Elsevier (2007)
Google Scholar
Backstrom, L., Kleinberg, J.: Network bucket testing. In: Proceedings of the 20th International Conference on World Wide Web, pp. 615–624. ACM (2011)
Google Scholar
Braverman, M., Mao, J., Schneider, J., Weinberg, M.: Selling to a no-regret buyer. In: Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 523–538 (2018)
Google Scholar
Broder, A., Gabrilovich, E., Josifovski, V., Mavromatis, G., Smola, A.: Bid generation for advanced match in sponsored search. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, New York, NY, USA, pp. 515–524 (2011)
Google Scholar
Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. CoRR abs/1204.5721 (2012)
Article Google Scholar
Campo, S., Guerre, E., Perrigne, I., Vuong, Q.: Semiparametric Estimation of First-price Auctions with Risk Averse Bidders. Working papers, Centre de Recherche en Economie et Statistique (2003)
Google Scholar
Cary, M., et al.: Greedy bidding strategies for keyword auctions. In: Proceedings of the 8th ACM Conference on Electronic Commerce, EC 2007, pp. 262–271. ACM, New York (2007). https://doi.org/10.1145/1250910.1250949
Chawla, S., Hartline, J., Nekipelov, D.: Mechanism design for data science. In: Proceedings of the Fifteenth ACM Conference on Economics and Computation, pp. 711–712. ACM (2014)
Google Scholar
Feldman, M., Kalai, A., Tennenholtz, M.: Playing games without observing payoffs. In: ICS, pp. 106–110 (2010)
Google Scholar
Guerre, E., Perrigne, I., Vuong, Q.: Optimal nonparametric estimation of first-price auctions. Econometrica 68(3), 525–574 (2000)
Article MathSciNet Google Scholar
Guerre, E., Perrigne, I., Vuong, Q.: Nonparametric identification of risk aversion in first-price auctions under exclusion restrictions. Econometrica 77(4), 1193–1227 (2009)
Article MathSciNet Google Scholar
Jofre-Bonet, M., Pesendorfer, M.: Estimation of a dynamic auction game. Econometrica 71(5), 1443–1489 (2003)
Article MathSciNet Google Scholar
Kuleshov, V., Schrijvers, O.: Inverse game theory: learning utilities in succinct games. In: Markakis, E., Schäfer, G. (eds.) WINE 2015. LNCS, vol. 9470, pp. 413–427. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48995-6_30
Chapter Google Scholar
Lang, K.J., Andersen, R.: Finding dense and isolated submarkets in a sponsored search spending graph. In: Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, pp. 613–622. ACM (2007)
Google Scholar
Mizuta, H., Steiglitz, K.: Agent-based simulation of dynamic online auctions. In: 2000 Winter Simulation Conference Proceedings, vol. 2, pp. 1772–1777. IEEE (2000)
Google Scholar
Nekipelov, D., Syrgkanis, V., Tardos, E.: Econometrics for learning agents. In: Proceedings of the Sixteenth ACM Conference on Economics and Computation, pp. 1–18. ACM (2015)
Google Scholar
Pin, F., Key, P.: Stochastic variability in sponsored search auctions: observations and models. In: Proceedings of the 12th ACM Conference on Electronic Commerce, pp. 61–70 (2011)
Google Scholar
Tang, D., Agarwal, A., O’Brien, D., Meyer, M.: Overlapping experiment infrastructure: more, better, faster experimentation. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 17–26. ACM (2010)
Google Scholar
Varian, H.R.: Position auctions. Int. J. Ind. Organ. 25(6), 1163–1178 (2007)
Article Google Scholar
Xu, H., Gao, B., Yang, D., Liu, T.Y.: Predicting advertiser bidding behaviors in sponsored search by rationality modeling. In: Proceedings of the 22nd International Conference on World Wide Web, May 2013
Google Scholar

Download references

Author information

Authors and Affiliations

Google, Mountain View, USA
Saeed Alaei, Ashwinkumar Badanidiyuru, Mohammad Mahdian & Sadra Yazdanbod

Authors

Saeed Alaei
View author publications
You can also search for this author in PubMed Google Scholar
Ashwinkumar Badanidiyuru
View author publications
You can also search for this author in PubMed Google Scholar
Mohammad Mahdian
View author publications
You can also search for this author in PubMed Google Scholar
Sadra Yazdanbod
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ashwinkumar Badanidiyuru .

Editor information

Editors and Affiliations

University of Patras, Rio, Greece
Ioannis Caragiannis
Google Research New York, New York, NY, USA
Vahab Mirrokni
The University of Texas System, Austin, TX, USA
Evdokia Nikolova

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Alaei, S., Badanidiyuru, A., Mahdian, M., Yazdanbod, S. (2019). Response Prediction for Low-Regret Agents. In: Caragiannis, I., Mirrokni, V., Nikolova, E. (eds) Web and Internet Economics. WINE 2019. Lecture Notes in Computer Science(), vol 11920. Springer, Cham. https://doi.org/10.1007/978-3-030-35389-6_3

Download citation

DOI: https://doi.org/10.1007/978-3-030-35389-6_3
Published: 03 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35388-9
Online ISBN: 978-3-030-35389-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics