Abstract
Modeling and predicting player behavior is of the utmost importance in game development and matchmaking. A variety of methods have been proposed to build human-like artificial intelligence (AI) players, but such players have a limited ability to imitate the behavior of individual players. In this paper, we propose a player behavior imitation method that applies imitation learning within a meta-learning framework. A generic behavior model of game players is first learned from historical records using adversarial imitation learning, and the policy is then personalized by imitating the behavior of each individual player. Convolutional neural networks are used as the feature extractor of game board states. Experiments were conducted on the game of Reversi: 18,000 game records from different players were used to train the generic behavior model, and the behavior of each new player was learned from only hundreds of records. The results demonstrate that our method imitates individual behavior well in terms of action similarity.
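As a rough illustration of the pipeline the abstract describes, the sketch below (in PyTorch) builds a convolutional feature extractor over 8x8 Reversi board states with a move-probability head, pretrains nothing, and shows a behavioral-cloning fine-tuning step on one player's records. All layer widths, the input-plane layout, and every name here are our own illustrative assumptions, not the authors' released code, and the adversarial imitation objective is omitted for brevity.

import torch
import torch.nn as nn

class ReversiPolicy(nn.Module):
    def __init__(self, in_planes: int = 3, width: int = 64):
        super().__init__()
        # Input: 3 binary planes (own stones, opponent stones, legal moves).
        self.features = nn.Sequential(
            nn.Conv2d(in_planes, width, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Map the 8x8 feature maps to logits over the 64 board squares.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(width * 8 * 8, 64),
        )

    def forward(self, board: torch.Tensor) -> torch.Tensor:
        # board: (batch, 3, 8, 8); returns per-square move logits.
        return self.head(self.features(board))

policy = ReversiPolicy()
logits = policy(torch.zeros(1, 3, 8, 8))      # dummy empty board
move = torch.distributions.Categorical(logits=logits).sample()

# Personalization step: starting from pretrained generic weights, fine-tune
# the same network on an individual player's (state, action) records.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
states = torch.zeros(32, 3, 8, 8)             # a player's recorded boards (dummy)
actions = torch.randint(0, 64, (32,))         # that player's chosen squares (dummy)
optimizer.zero_grad()
loss = nn.functional.cross_entropy(policy(states), actions)
loss.backward()
optimizer.step()

In the paper's setting, the pretraining stage would use adversarial imitation learning over the pooled 18,000 records, and only the fine-tuning stage above would see the new player's few hundred records.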
Acknowledgements
This work is supported in part by the Central Government Funds of Guiding Local Scientific and Technological Development (No. 2021ZYD0003), the National Natural Science Foundation of China (No. 62006200), the Sichuan Province Youth Science and Technology Innovation Team (No. 2019JDTD0017), and the Nanchong Municipal Government-Universities Scientific Cooperation Project (No. SXHZ045).
Cite this article
Pan, CF., Min, XY., Zhang, HR. et al. Behavior imitation of individual board game players. Appl Intell 53, 11571–11585 (2023). https://doi.org/10.1007/s10489-022-04050-w