article

Smoothing Techniques for Computing Nash Equilibria of Sequential Games

Authors:

Tuomas SandholmAuthors Info & Claims

Mathematics of Operations Research, Volume 35, Issue 2

Pages 494 - 512

https://doi.org/10.1287/moor.1100.0452

Published: 01 May 2010 Publication History

Abstract

We develop first-order smoothing techniques for saddle-point problems that arise in finding a Nash equilibrium of sequential games. The crux of our work is a construction of suitable prox-functions for a certain class of polytopes that encode the sequential nature of the game. We also introduce heuristics that significantly speed up the algorithm, and decomposed game representations that reduce the memory requirements, enabling the application of the techniques to drastically larger games. An implementation based on our smoothing techniques computes approximate Nash equilibria for games that are more than four orders of magnitude larger than what prior approaches can handle. Finally, we show near-linear further speedups from parallelization.

References

[1]

Billings, D., Peòa, L., Schaeffer, J. and Szafron, D., "The challenge of poker," Artificial Intelligence, v134, pp. 201-240, 2002.

Digital Library

[2]

Billings, D., Burch, N., Davidson, A., Holte, R., Schaeffer, J., Schauenberg, T. and Szafron, D., "Approximating game-theoretic optimal strategies for full-scale poker," Proc. 18th Internat. Joint Conf. Artificial Intelligence (IJCAI), 2003.

Digital Library

[3]

Gilpin, A., "Algorithms for abstracting and solving imperfect information games," 2009.

[4]

Gilpin, A. and Sandholm, T., "A competitive Texas Hold'em poker player via automated abstraction and real-time equilibrium computation," Proc. National Conf. Artificial Intelligence (AAAI), 2006.

[5]

Gilpin, A. and Sandholm, T., "Lossless abstraction method for sequential games of imperfect information," J. ACM, v54, 2007.

[6]

Gilpin, A., Sandholm, T. and Sørensen, T. B., "Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold'em poker," Proc. National Conf. Artificial Intelligence (AAAI), 2007.

[7]

Gilpin, A., Sandholm, T. and Sørensen, T. B., "A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs," Internat. Conf. Autonomous Agents and Multi-Agent Systems (AAMAS), 2008.

[8]

Goffin, J.-L., "On the convergence rate of subgradient optimization methods," Math. Programming, v13, pp. 329-347, 1977.

Digital Library

[9]

Hirriart-Urruty, J. and Lemaréchal, C., Fundamentals of Convex Analysis, Springer-Verlag, Berlin, 2001.

[10]

Juditsky, A., Lan, G., Nemirovski, A. and Shapiro, A., "Stochastic approximation approach to stochastic programming," SIAM J. Optim., v19, pp. 1574-1609, 2009.

Digital Library

[11]

Koller, D., Megiddo, N. and von Stengel, B., "Efficient computation of equilibria for extensive two-person games," Games Econom. Behavior, v14, pp. 247-259, 1996.

[12]

Lan, G., Lu, Z. and Monteiro, R., "Primal-dual first-order methods with O(1/ε) iteration-complexity for cone programming," Math. Programming., 2010.

[13]

Nemirovski, A., "Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz-continuous monotone operators and smooth convex-concave saddle point problems," SIAM J. Optim., v15, pp. 229-251, 2004.

Digital Library

[14]

Nesterov, Y., "A method for unconstrained convex minimization problem with rate of convergence O(1/k2)," Doklady AN SSSR (In Russian), v269, pp. 543-547, 1983.

[15]

Nesterov, Y., "Introductory lectures on convex optimization: A basic course," Applied Optimization, Kluwer Academic Publishers, Boston, 2004.

[16]

Nesterov, Y., "Excessive gap technique in nonsmooth convex minimization," SIAM J. Optim., v16, pp. 235-249, 2005.

Digital Library

[17]

Nesterov, Y., "Smooth minimization of non-smooth functions," Math. Programming, v103, pp. 127-152, 2005.

Digital Library

[18]

Osborne, M. and Rubinstein, A., A Course in Game Theory, MIT Press, Cambridge, MA, 1994.

[19]

Romanovskii, I., "Reduction of a game with complete memory to a matrix game," Soviet Math., v3, pp. 678-681, 1962.

[20]

Shi, J. and Littman, M., "Abstraction methods for game-theoretic poker," Computers and Games, Springer-Verlag, Berlin, pp. 333-345, 2001.

[21]

von Stengel, B., "Efficient computation of behavior strategies," Games and Econom. Behavior, v14, pp. 220-246, 1996.

[22]

von Stengel, B., Nisan, N., Roughgarden, T., Tardos, E. and Vazirani, V. V., "Equilibrium computation for games in strategic and extensive form," Algorithmic Game Theory, Cambridge University Press, Cambridge, UK, pp. 53-78, 2007.

[23]

Zinkevich, M., Bowling, M., Johanson, M. and Piccione, C., "Regret minimization in games with incomplete information," Annual Conf. Neural Inform. Processing Systems (NIPS), 2007.

Cited By

Munos RValko MCalandriello DAzar MRowland MGuo DTang YGeist MMesnard TFiegel CMichi ASelvi MGirgin SMomchev NBachem OMankowitz DPrecup DPiot BSalakhutdinov RKolter ZHeller KWeller AOliver NScarlett JBerkenkamp F(2024)Nash learning from human feedbackProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693563(36743-36768)Online publication date: 21-Jul-2024
https://dl.acm.org/doi/10.5555/3692070.3693563
Yao JLiu WFu HYang YMcAleer SFu QYang WOh ANaumann TGloberson ASaenko KHardt MLevine S(2023)Policy space diversity for non-transitive gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669086(67771-67793)Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3669086
Farina GGrand-Ciément JKroer CLee CLuo HOh ANaumann TGloberson ASaenko KHardt MLevine S(2023)Regret matching+Proceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3668812(61546-61572)Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3668812
Show More Cited By

Recommendations

Sequential Stackelberg equilibria in two-person games

The concept of sequential Stackelberg equilibrium is introduced in the general framework of dynamic, two-person games defined in the Denardo contracting operator formalism. A relationship between this solution concept and the sequential Nash equilibrium ...
Computing approximate pure Nash equilibria in congestion games

Among other solution concepts, the notion of the pure Nash equilibrium plays a central role in Game Theory. Pure Nash equilibria in a game characterize situations with non-cooperative deterministic players in which no player has any incentive to ...
Computing Approximate Nash Equilibria in Polymatrix Games

In an $$\epsilon $$∈-Nash equilibrium, a player can gain at most $$\epsilon $$∈ by unilaterally changing his behavior. For two-player (bimatrix) games with payoffs in [0, 1], the best-known $$\epsilon $$∈ achievable in polynomial time is 0.3393 (...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Mathematics of Operations Research

Mathematics of Operations Research Volume 35, Issue 2

May 2010

241 pages

ISSN:0364-765X

Issue’s Table of Contents

Publisher

INFORMS

Linthicum, MD, United States

Publication History

Published: 01 May 2010

Received: 31 March 2008

Author Tags

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

54
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Munos RValko MCalandriello DAzar MRowland MGuo DTang YGeist MMesnard TFiegel CMichi ASelvi MGirgin SMomchev NBachem OMankowitz DPrecup DPiot BSalakhutdinov RKolter ZHeller KWeller AOliver NScarlett JBerkenkamp F(2024)Nash learning from human feedbackProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693563(36743-36768)Online publication date: 21-Jul-2024
https://dl.acm.org/doi/10.5555/3692070.3693563
Yao JLiu WFu HYang YMcAleer SFu QYang WOh ANaumann TGloberson ASaenko KHardt MLevine S(2023)Policy space diversity for non-transitive gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669086(67771-67793)Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3669086
Farina GGrand-Ciément JKroer CLee CLuo HOh ANaumann TGloberson ASaenko KHardt MLevine S(2023)Regret matching+Proceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3668812(61546-61572)Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3668812
Chakrabarti DDiakonikolas JKroer COh ANaumann TGloberson ASaenko KHardt MLevine S(2023)Block-coordinate methods and restarting for solving extensive-form gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3666592(10692-10720)Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3666592
Meng LGe ZTian PAn BGao YWilliams BChen YNeville J(2023)An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form gamesProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence10.1609/aaai.v37i5.25722(5823-5831)Online publication date: 7-Feb-2023
https://dl.acm.org/doi/10.1609/aaai.v37i5.25722
Xie KLiu WLi B(2023)Neural Future-Dependent Online Mirror DescentProceedings of the 2023 6th International Conference on Algorithms, Computing and Artificial Intelligence10.1145/3639631.3639632(1-6)Online publication date: 22-Dec-2023
https://dl.acm.org/doi/10.1145/3639631.3639632
Wu FBian W(2023)Smoothing Accelerated Proximal Gradient Method with Fast Convergence Rate for Nonsmooth Convex Optimization Beyond DifferentiabilityJournal of Optimization Theory and Applications10.1007/s10957-023-02176-6197:2(539-572)Online publication date: 4-Mar-2023
https://dl.acm.org/doi/10.1007/s10957-023-02176-6
Bian WWu F(2023)Accelerated Smoothing Hard Thresholding Algorithms for Regularized Nonsmooth Convex Regression ProblemJournal of Scientific Computing10.1007/s10915-023-02249-896:2Online publication date: 15-Jun-2023
https://dl.acm.org/doi/10.1007/s10915-023-02249-8
Ganzfried S(2023)Observable Perfect EquilibriumDecision and Game Theory for Security10.1007/978-3-031-50670-3_1(3-22)Online publication date: 18-Oct-2023
https://dl.acm.org/doi/10.1007/978-3-031-50670-3_1
Bai YJin CMei SSong ZYu TKoyejo SMohamed SAgarwal ABelgrave DCho KOh A(2022)Efficient Phi-regret minimization in extensive-form games via online mirror descentProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601891(22313-22325)Online publication date: 28-Nov-2022
https://dl.acm.org/doi/10.5555/3600270.3601891
Show More Cited By

View Options

View options

Figures

Tables

Media

View Issue’s Table of Contents