
Intuitionistic Fuzzy MADM in Wargame Leveraging With Deep Reinforcement Learning

Published: 30 July 2024

Abstract

Intelligent games have emerged as a substantial research area. Nonetheless, the slow convergence of intelligent wargame training and the low success rates of agents against specific rule-based opponents remain challenging. In this article, we propose a game confrontation algorithm that combines the multiple attribute decision making (MADM) approach from management science with reinforcement learning (RL). This integration draws on the strengths of both approaches and addresses the above issues effectively. Experiments with the combined MADM-RL algorithm are conducted on the winning-first wargame platform to gather confrontation data from the red and blue sides. The data are then analyzed with a weight calculation method for intuitionistic fuzzy numbers to determine the threat level of each opposing agent from the MADM perspective, and the resulting threat levels are used to construct the reward function for the red side. Simulation results demonstrate that the proposed MADM-RL algorithm outperforms classical RL algorithms in terms of intelligence. The approach effectively mitigates issues such as convergence difficulties caused by random initialization and the sparse rewards faced by agent neural networks in wargame environments with large maps. Combining the MADM method from management science with RL algorithms from control offers cross-disciplinary innovation and provides useful insights for intelligent wargame design and RL algorithm improvement.
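The abstract describes the pipeline only at a high level: intuitionistic fuzzy attribute data are weighted and aggregated into a per-agent threat level, which then shapes the red side's reward. The Python sketch below illustrates one plausible reading of that pipeline under stated assumptions. The attribute set, the entropy-style weighting formula, the mapping from the intuitionistic fuzzy score to a threat level, and the shaping coefficient `beta` are all illustrative choices and not the paper's actual design; only the IFWA aggregation operator and the score function (membership minus non-membership) follow standard intuitionistic fuzzy set definitions.

```python
import numpy as np

# Minimal sketch (not the authors' code). Each opposing agent is described by
# attributes (e.g., firepower, distance, remaining strength) expressed as
# intuitionistic fuzzy numbers (membership mu, non-membership nu, mu + nu <= 1).

def entropy_weights(mu, nu):
    """Hypothetical entropy-style attribute weights: attributes whose membership
    and non-membership are hard to tell apart are treated as less informative."""
    # mu, nu: (n_agents, n_attributes) arrays of IFN components.
    e = 1.0 - np.abs(mu - nu).mean(axis=0)      # one of several entropy definitions
    w = (1.0 - e) / np.sum(1.0 - e)             # lower entropy -> higher weight
    return w

def ifwa(mu_row, nu_row, w):
    """Standard IFWA operator: aggregate one agent's attribute IFNs."""
    agg_mu = 1.0 - np.prod((1.0 - mu_row) ** w)
    agg_nu = np.prod(nu_row ** w)
    return agg_mu, agg_nu

def threat_levels(mu, nu):
    """Threat level per opposing agent, rescaled from the IFN score mu - nu."""
    w = entropy_weights(mu, nu)
    agg = np.array([ifwa(m, n, w) for m, n in zip(mu, nu)])
    return (agg[:, 0] - agg[:, 1] + 1.0) / 2.0  # map score [-1, 1] -> [0, 1]

def shaped_reward(env_reward, damage_dealt, threats, beta=0.1):
    """Hypothetical reward shaping: bonus proportional to the threat of the
    opposing agents the red side engaged, added to the sparse game reward."""
    return env_reward + beta * float(np.dot(damage_dealt, threats))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.uniform(0.3, 0.7, size=(3, 4))           # 3 blue agents, 4 attributes
    nu = rng.uniform(0.0, 1.0, size=(3, 4)) * (1.0 - mu)
    t = threat_levels(mu, nu)
    print("threat levels:", t)
    print("shaped reward:", shaped_reward(0.0, np.array([1.0, 0.0, 0.5]), t))
```

In this reading, the MADM step supplies a dense, opponent-aware shaping term on top of the sparse win/loss signal, which is how the reported improvement in convergence on large maps would arise; the exact attributes and shaping form used in the paper are not given in the abstract.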




Published In

IEEE Transactions on Fuzzy Systems, Volume 32, Issue 9
Sept. 2024, 595 pages

Publisher

IEEE Press


Qualifiers

  • Research-article
