
Intuitionistic Fuzzy MADM in Wargame Leveraging With Deep Reinforcement Learning

Published: 30 July 2024

Abstract

Intelligent games have emerged as a substantial research area. Nonetheless, the slow convergence of intelligent wargame training and the low success rates of agents against specific rule-based opponents remain challenging. In this article, we propose a game confrontation algorithm that combines the multiple attribute decision making (MADM) approach from management science with reinforcement learning (RL). This integration draws on the strengths of both approaches and addresses the above issues effectively. Experiments with the combined MADM-RL algorithm are conducted on the winning-first wargame platform to gather confrontation data from the red and blue sides. The data are then analyzed with a weight calculation method for intuitionistic fuzzy numbers to determine the threat level of each opposing agent from the MADM perspective, and the resulting threat levels are used to construct the reward function for the red side. Simulation results demonstrate that the proposed MADM-RL algorithm outperforms classical RL algorithms in terms of intelligence. The approach effectively mitigates issues such as convergence difficulties caused by random initialization and the sparse rewards faced by agent neural networks in wargame environments with large maps. Combining the MADM method from management science with RL algorithms from control offers cross-disciplinary innovation and provides useful insights for intelligent wargame design and RL algorithm improvement.
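The abstract describes the pipeline only at a high level: intuitionistic fuzzy attribute data are weighted and aggregated into a per-agent threat level, which then shapes the red side's reward. The Python sketch below illustrates one plausible reading of that pipeline under stated assumptions. The attribute set, the entropy-style weighting formula, the mapping from the intuitionistic fuzzy score to a threat level, and the shaping coefficient `beta` are all illustrative choices and not the paper's actual design; only the IFWA aggregation operator and the score function (membership minus non-membership) follow standard intuitionistic fuzzy set definitions.

```python
import numpy as np

# Minimal sketch (not the authors' code). Each opposing agent is described by
# attributes (e.g., firepower, distance, remaining strength) expressed as
# intuitionistic fuzzy numbers (membership mu, non-membership nu, mu + nu <= 1).

def entropy_weights(mu, nu):
    """Hypothetical entropy-style attribute weights: attributes whose membership
    and non-membership are hard to tell apart are treated as less informative."""
    # mu, nu: (n_agents, n_attributes) arrays of IFN components.
    e = 1.0 - np.abs(mu - nu).mean(axis=0)      # one of several entropy definitions
    w = (1.0 - e) / np.sum(1.0 - e)             # lower entropy -> higher weight
    return w

def ifwa(mu_row, nu_row, w):
    """Standard IFWA operator: aggregate one agent's attribute IFNs."""
    agg_mu = 1.0 - np.prod((1.0 - mu_row) ** w)
    agg_nu = np.prod(nu_row ** w)
    return agg_mu, agg_nu

def threat_levels(mu, nu):
    """Threat level per opposing agent, rescaled from the IFN score mu - nu."""
    w = entropy_weights(mu, nu)
    agg = np.array([ifwa(m, n, w) for m, n in zip(mu, nu)])
    return (agg[:, 0] - agg[:, 1] + 1.0) / 2.0  # map score [-1, 1] -> [0, 1]

def shaped_reward(env_reward, damage_dealt, threats, beta=0.1):
    """Hypothetical reward shaping: bonus proportional to the threat of the
    opposing agents the red side engaged, added to the sparse game reward."""
    return env_reward + beta * float(np.dot(damage_dealt, threats))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.uniform(0.3, 0.7, size=(3, 4))           # 3 blue agents, 4 attributes
    nu = rng.uniform(0.0, 1.0, size=(3, 4)) * (1.0 - mu)
    t = threat_levels(mu, nu)
    print("threat levels:", t)
    print("shaped reward:", shaped_reward(0.0, np.array([1.0, 0.0, 0.5]), t))
```

In this reading, the MADM step supplies a dense, opponent-aware shaping term on top of the sparse win/loss signal, which is how the reported improvement in convergence on large maps would arise; the exact attributes and shaping form used in the paper are not given in the abstract.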




Published In

IEEE Transactions on Fuzzy Systems, Volume 32, Issue 9
Sept. 2024, 595 pages

Publisher

IEEE Press


Qualifiers

  • Research-article
