DOI: 10.5555/1777826.1777837
Article

Feature construction for reinforcement learning in hearts

Published: 29 May 2006

Abstract

Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game-tree search and learning methods can achieve grandmaster-level play in backgammon. In this work, we develop a player for the game of hearts, a 4-player game, based on stochastic linear regression and TD learning. Using a small set of basic game features, we exhaustively combined features into a more expressive representation of the game state. We report initial results on learning with various combinations of features and training under self-play and against search-based players. Our simple learner was able to beat one of the best search-based hearts programs.
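
The abstract does not give the details of the feature combination or the update rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the basic features are binary, that "combining" means taking the logical AND of every pair of basic features, and that learning uses a plain TD(0) update on a linear value function. The names `expand_features`, `value`, and `td0_update`, and the values of `alpha` and `gamma`, are hypothetical.

```python
from itertools import combinations


def expand_features(basic):
    """Return the basic binary features followed by the AND of every pair.

    Assumption: pairwise conjunction stands in for the paper's
    exhaustive feature combination, which the abstract does not specify.
    """
    pairs = [a & b for a, b in combinations(basic, 2)]
    return list(basic) + pairs


def value(weights, features):
    """Linear evaluation function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(weights, feats, next_feats, reward,
               alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step for a linear value function; returns new weights."""
    if terminal:
        target = reward  # no bootstrapping from a terminal state
    else:
        target = reward + gamma * value(weights, next_feats)
    error = target - value(weights, feats)
    # Gradient of a linear function w.r.t. its weights is the feature vector.
    return [w + alpha * error * f for w, f in zip(weights, feats)]


if __name__ == "__main__":
    # Three hypothetical basic features describing a hearts state.
    feats = expand_features([1, 0, 1])
    weights = [0.0] * len(feats)
    # Hypothetical terminal outcome: the player took 13 points (a penalty).
    weights = td0_update(weights, feats, feats, reward=-13.0, terminal=True)
    print(feats)
    print(weights)
```

With n basic features this expansion yields n + n(n-1)/2 features, so the representation stays linear in the weights while capturing pairwise interactions that the basic features alone cannot express.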



Published In

CG'06: Proceedings of the 5th international conference on Computers and games
May 2006
283 pages
ISBN: 3540755373
Editors: H. Jaap van den Herik, Paolo Ciancarini, H. H. L. M. Donkers

Sponsors

  • Chessbase
  • Fiat Group
  • Provincia di Torino
  • Città di Torino
  • Regione Piemonte

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

  • (2009) Improving state evaluation, inference, and search in trick-based card games. Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 1407-1413. DOI: 10.5555/1661445.1661671. Online publication date: 11-Jul-2009
  • (2009) Fast gradient-descent methods for temporal-difference learning with linear function approximation. Proceedings of the 26th Annual International Conference on Machine Learning, pp. 993-1000. DOI: 10.1145/1553374.1553501. Online publication date: 14-Jun-2009
  • (2009) Automated discovery of search-extension features. Proceedings of the 12th international conference on Advances in Computer Games, pp. 182-194. DOI: 10.1007/978-3-642-12993-3_17. Online publication date: 11-May-2009
  • (2007) Reinforcement learning of local shape in the game of Go. Proceedings of the 20th international joint conference on Artificial intelligence, pp. 1053-1058. DOI: 10.5555/1625275.1625446. Online publication date: 6-Jan-2007
