Article

Feature construction for reinforcement learning in hearts

Authors:

Nathan R. Sturtevant,

Adam M. White

CG'06: Proceedings of the 5th international conference on Computers and games

Pages 122 - 134

Published: 29 May 2006

Abstract

Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game-tree search and learning methods can achieve grandmaster-level play in backgammon. In this work, we develop a player for the game of hearts, a 4-player game, based on stochastic linear regression and TD learning. Starting from a small set of basic game features, we exhaustively combined features into a more expressive representation of the game state. We report initial results on learning with various combinations of features and on training under self-play and against search-based players. Our simple learner was able to beat one of the best search-based hearts programs.
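
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: combining basic features into a richer representation, and updating a linear evaluation function with TD learning. Everything specific here is an assumption made for illustration: the features are binary, "combining" means taking the logical AND of every pair, the update is the simplest TD(0) rule, and the names `conjoin_features`, `value`, and `td0_update` are hypothetical. The paper's actual feature set, combination depth, and TD variant may differ.

```python
from itertools import combinations


def conjoin_features(atomic):
    """Return the atomic binary features followed by the AND of every pair.

    Assumption: features are 0/1 integers and only pairwise conjunctions
    are built; the paper may combine features differently.
    """
    pairs = [a & b for a, b in combinations(atomic, 2)]
    return list(atomic) + pairs


def value(weights, features):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(weights, features, reward, next_features, alpha=0.01, gamma=1.0):
    """Apply one TD(0) step and return the new weight list.

    Pass next_features=None for a terminal state, whose value is taken as 0.
    alpha (step size) and gamma (discount) are illustrative defaults.
    """
    next_value = 0.0 if next_features is None else value(weights, next_features)
    td_error = reward + gamma * next_value - value(weights, features)
    # Gradient of a linear value function w.r.t. the weights is the feature vector.
    return [w + alpha * td_error * f for w, f in zip(weights, features)]


# Hypothetical example: three atomic features expand to 3 + 3 = 6 features.
phi = conjoin_features([1, 0, 1])
weights = td0_update([0.0] * len(phi), phi, reward=-1.0, next_features=None, alpha=0.1)
```

With n atomic features, pairwise conjunction yields n + n(n-1)/2 features, which is why a small basic set can grow into a much more expressive linear representation while the learning rule itself stays unchanged.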

Published In

CG'06: Proceedings of the 5th international conference on Computers and games

May 2006

283 pages

ISBN: 3540755373

Editors:
H. Jaap Van Den Herik, Institute for Knowledge and Agent Technology, MICC, Universiteit Maastricht, Maastricht, The Netherlands
Paolo Ciancarini, Dipartimento di Scienze dell'Informazione, Università di Bologna, Bologna, Italy
H. H. L. M. Donkers, Institute for Knowledge and Agent Technology, MICC, Universiteit Maastricht, Maastricht, The Netherlands

Sponsors

Chessbase
Fiat Group
Provincia di Torino
Città di Torino
Regione Piemonte

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

Buro M, Long J, Furtak T, Sturtevant N (2009). Improving state evaluation, inference, and search in trick-based card games. Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 1407-1413. Online publication date: 11-Jul-2009.
https://dl.acm.org/doi/10.5555/1661445.1661671

Sutton R, Maei H, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E (2009). Fast gradient-descent methods for temporal-difference learning with linear function approximation. In: Danyluk A, Bottou L, Littman M (eds.) Proceedings of the 26th Annual International Conference on Machine Learning, pp. 993-1000. Online publication date: 14-Jun-2009.
https://dl.acm.org/doi/10.1145/1553374.1553501

Skowronski P, Björnsson Y, Winands M (2009). Automated discovery of search-extension features. Proceedings of the 12th International Conference on Advances in Computer Games, pp. 182-194. Online publication date: 11-May-2009.
https://dl.acm.org/doi/10.1007/978-3-642-12993-3_17

Silver D, Sutton R, Müller M (2007). Reinforcement learning of local shape in the game of Go. Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 1053-1058. Online publication date: 6-Jan-2007.
https://dl.acm.org/doi/10.5555/1625275.1625446