Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Smoothing Techniques for Computing Nash Equilibria of Sequential Games

Published: 01 May 2010 Publication History

Abstract

We develop first-order smoothing techniques for saddle-point problems that arise in finding a Nash equilibrium of sequential games. The crux of our work is a construction of suitable prox-functions for a certain class of polytopes that encode the sequential nature of the game. We also introduce heuristics that significantly speed up the algorithm, and decomposed game representations that reduce the memory requirements, enabling the application of the techniques to drastically larger games. An implementation based on our smoothing techniques computes approximate Nash equilibria for games that are more than four orders of magnitude larger than what prior approaches can handle. Finally, we show near-linear further speedups from parallelization.

References

[1]
Billings, D., Peòa, L., Schaeffer, J. and Szafron, D., "The challenge of poker," Artificial Intelligence, v134, pp. 201-240, 2002.
[2]
Billings, D., Burch, N., Davidson, A., Holte, R., Schaeffer, J., Schauenberg, T. and Szafron, D., "Approximating game-theoretic optimal strategies for full-scale poker," Proc. 18th Internat. Joint Conf. Artificial Intelligence (IJCAI), 2003.
[3]
Gilpin, A., "Algorithms for abstracting and solving imperfect information games," 2009.
[4]
Gilpin, A. and Sandholm, T., "A competitive Texas Hold'em poker player via automated abstraction and real-time equilibrium computation," Proc. National Conf. Artificial Intelligence (AAAI), 2006.
[5]
Gilpin, A. and Sandholm, T., "Lossless abstraction method for sequential games of imperfect information," J. ACM, v54, 2007.
[6]
Gilpin, A., Sandholm, T. and Sørensen, T. B., "Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold'em poker," Proc. National Conf. Artificial Intelligence (AAAI), 2007.
[7]
Gilpin, A., Sandholm, T. and Sørensen, T. B., "A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs," Internat. Conf. Autonomous Agents and Multi-Agent Systems (AAMAS), 2008.
[8]
Goffin, J.-L., "On the convergence rate of subgradient optimization methods," Math. Programming, v13, pp. 329-347, 1977.
[9]
Hirriart-Urruty, J. and Lemaréchal, C., Fundamentals of Convex Analysis, Springer-Verlag, Berlin, 2001.
[10]
Juditsky, A., Lan, G., Nemirovski, A. and Shapiro, A., "Stochastic approximation approach to stochastic programming," SIAM J. Optim., v19, pp. 1574-1609, 2009.
[11]
Koller, D., Megiddo, N. and von Stengel, B., "Efficient computation of equilibria for extensive two-person games," Games Econom. Behavior, v14, pp. 247-259, 1996.
[12]
Lan, G., Lu, Z. and Monteiro, R., "Primal-dual first-order methods with O(1/ε) iteration-complexity for cone programming," Math. Programming., 2010.
[13]
Nemirovski, A., "Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz-continuous monotone operators and smooth convex-concave saddle point problems," SIAM J. Optim., v15, pp. 229-251, 2004.
[14]
Nesterov, Y., "A method for unconstrained convex minimization problem with rate of convergence O(1/k2)," Doklady AN SSSR (In Russian), v269, pp. 543-547, 1983.
[15]
Nesterov, Y., "Introductory lectures on convex optimization: A basic course," Applied Optimization, Kluwer Academic Publishers, Boston, 2004.
[16]
Nesterov, Y., "Excessive gap technique in nonsmooth convex minimization," SIAM J. Optim., v16, pp. 235-249, 2005.
[17]
Nesterov, Y., "Smooth minimization of non-smooth functions," Math. Programming, v103, pp. 127-152, 2005.
[18]
Osborne, M. and Rubinstein, A., A Course in Game Theory, MIT Press, Cambridge, MA, 1994.
[19]
Romanovskii, I., "Reduction of a game with complete memory to a matrix game," Soviet Math., v3, pp. 678-681, 1962.
[20]
Shi, J. and Littman, M., "Abstraction methods for game-theoretic poker," Computers and Games, Springer-Verlag, Berlin, pp. 333-345, 2001.
[21]
von Stengel, B., "Efficient computation of behavior strategies," Games and Econom. Behavior, v14, pp. 220-246, 1996.
[22]
von Stengel, B., Nisan, N., Roughgarden, T., Tardos, E. and Vazirani, V. V., "Equilibrium computation for games in strategic and extensive form," Algorithmic Game Theory, Cambridge University Press, Cambridge, UK, pp. 53-78, 2007.
[23]
Zinkevich, M., Bowling, M., Johanson, M. and Piccione, C., "Regret minimization in games with incomplete information," Annual Conf. Neural Inform. Processing Systems (NIPS), 2007.

Cited By

View all
  • (2024)Nash learning from human feedbackProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693563(36743-36768)Online publication date: 21-Jul-2024
  • (2023)Policy space diversity for non-transitive gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669086(67771-67793)Online publication date: 10-Dec-2023
  • (2023)Regret matching+Proceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3668812(61546-61572)Online publication date: 10-Dec-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Mathematics of Operations Research
Mathematics of Operations Research  Volume 35, Issue 2
May 2010
241 pages

Publisher

INFORMS

Linthicum, MD, United States

Publication History

Published: 01 May 2010
Received: 31 March 2008

Author Tags

  1. Nash equilibrium
  2. sequential games
  3. smoothing techniques

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Nash learning from human feedbackProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693563(36743-36768)Online publication date: 21-Jul-2024
  • (2023)Policy space diversity for non-transitive gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669086(67771-67793)Online publication date: 10-Dec-2023
  • (2023)Regret matching+Proceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3668812(61546-61572)Online publication date: 10-Dec-2023
  • (2023)Block-coordinate methods and restarting for solving extensive-form gamesProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3666592(10692-10720)Online publication date: 10-Dec-2023
  • (2023)An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form gamesProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence10.1609/aaai.v37i5.25722(5823-5831)Online publication date: 7-Feb-2023
  • (2023)Neural Future-Dependent Online Mirror DescentProceedings of the 2023 6th International Conference on Algorithms, Computing and Artificial Intelligence10.1145/3639631.3639632(1-6)Online publication date: 22-Dec-2023
  • (2023)Smoothing Accelerated Proximal Gradient Method with Fast Convergence Rate for Nonsmooth Convex Optimization Beyond DifferentiabilityJournal of Optimization Theory and Applications10.1007/s10957-023-02176-6197:2(539-572)Online publication date: 4-Mar-2023
  • (2023)Accelerated Smoothing Hard Thresholding Algorithms for Regularized Nonsmooth Convex Regression ProblemJournal of Scientific Computing10.1007/s10915-023-02249-896:2Online publication date: 15-Jun-2023
  • (2023)Observable Perfect EquilibriumDecision and Game Theory for Security10.1007/978-3-031-50670-3_1(3-22)Online publication date: 18-Oct-2023
  • (2022)Efficient Phi-regret minimization in extensive-form games via online mirror descentProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601891(22313-22325)Online publication date: 28-Nov-2022
  • Show More Cited By

View Options

View options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media