
CN111488992A - Simulator adversary reinforcing device based on artificial intelligence - Google Patents

Simulator adversary reinforcing device based on artificial intelligence

Info

Publication number
CN111488992A
CN111488992A (application CN202010140651.6A)
Authority
CN
China
Prior art keywords: module, simulator, information, workstation, reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010140651.6A
Other languages
Chinese (zh)
Inventor
夏少杰
刘长卫
瞿崇晓
张瑞峰
高翔
李永强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 52 Research Institute
Original Assignee
CETC 52 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 52 Research Institute
Priority to CN202010140651.6A
Publication of CN111488992A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/042 Backward inferencing
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/67 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B9/00 Simulators for teaching or training purposes
    • G09B9/003 Simulators for teaching or training purposes for military purposes and tactics
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B9/00 Simulators for teaching or training purposes
    • G09B9/02 Simulators for teaching or training purposes for teaching control of vehicles or other craft
    • G09B9/08 Simulators for teaching or training purposes for teaching control of vehicles or other craft for teaching control of aircraft, e.g. Link trainer
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/6027 Methods for processing data by generating or executing the game program using adaptive systems learning from user actions, e.g. for skill level adjustment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses an artificial-intelligence-based simulator opponent reinforcing device, which comprises a deep reinforcement learning module, a workstation module and an adaptation module. The workstation module is provided with a self-play adversarial training engine and sends the data obtained from self-play adversarial training to the deep reinforcement learning module; the deep reinforcement learning module uses the received data to train an agent; and the workstation module is connected to the simulator through the adaptation module, so that the simulator-operated object and the agent can engage in adversarial simulation. Using a unified technical architecture, the invention can realize networked human-machine combat on many types of simulators, can complete information acquisition automatically, and thereby drives the continuous upgrading of the agent.

Description

Simulator adversary reinforcing device based on artificial intelligence
Technical Field
The invention belongs to the technical field of artificial-intelligence confrontation and simulation-based deduction, and in particular relates to an artificial-intelligence-based simulator opponent reinforcing device.
Background
Various simulators are commonly used to model realistic adversarial environments, whether for games or for the training of specialized personnel. A flight simulator, for example, can approach reality to a great extent by simulating a real aircraft, including approximating its dynamics model, environment, cockpit and controls. At low cost, aircrew can train continuously on a flight simulator; such training is safe, reliable, economical and efficient, and has a positive effect on improving pilots' flying skills.
However, existing simulators are mainly upgraded one by one by their original manufacturers, and because flight simulators come from many different manufacturers, such upgrades are extremely difficult and expensive.
Existing simulators are generally aimed at flight simulation and can partially realize simple autonomous behaviors such as cruising and fixed-point flight. Some flight simulators allow a simulated opponent to be selected for air-combat training, but the opponent's algorithms mostly rely on simple expert knowledge, so the level of opposition is low and adaptability across different scenarios is poor.
Taking air combat as the training objective, the shortcomings of existing flight simulators are mainly the following: the intelligence of the virtual opponent is insufficient and the adversarial experience is weak, so confrontation against a strong enemy cannot be realized and the goal of intelligent training assistance cannot be achieved; the simulator system is closed, making external interaction difficult, extension inconvenient, and on-demand upgrades impossible; and different simulators differ greatly in hardware and software design, so maintenance is costly, solutions cannot be ported, and customized upgrades waste time and labor.
In view of these shortcomings, upgrading existing simulators urgently requires solving the following problems: (1) the upgrade scheme must be easy to replicate, widely applicable, and independent of the simulator type; (2) the game opponent must be highly intelligent and, with air combat as the guide, closer to actual combat; (3) maintenance must be cheap and porting convenient.
Disclosure of Invention
The aim of the invention is to provide an artificial-intelligence-based simulator opponent reinforcing device that improves the adversarial quality of games or training and provides operators with a highly intelligent virtual opponent.
In order to achieve the purpose, the technical scheme of the application is as follows:
an artificial intelligence-based simulator opponent augmentation device, the artificial intelligence-based simulator opponent augmentation device comprising: the simulator opponent reinforcing device based on artificial intelligence comprises a deep reinforcement learning module, a workstation module and an adaptation module, wherein the working modes of the simulator opponent reinforcing device based on artificial intelligence comprise a training mode and an inference mode, wherein:
when the intelligent game machine works in a training mode, the workstation module is provided with a self-game confrontation training engine and sends data obtained by self-game confrontation training to the deep reinforcement learning module;
the deep reinforcement learning module trains an agent by adopting the received data;
when the intelligent agent is in an inference mode, the workstation module is connected with the simulator through the adaptation module so that the simulator operation object and the intelligent agent can perform confrontation simulation, the adaptation module acquires screen display information from the simulator for recognition, and sends recognized attitude and posture information of the simulator operation object to the workstation module;
the workstation module sends the attitude information of the simulator operation object and the attitude information of the intelligent body to the deep reinforcement learning module, the deep reinforcement learning module generates decision information and sends the decision information to the workstation module, and the workstation module sends the attitude information of the intelligent body to the adaptation module after executing the decision;
and the adaptation module generates an image containing the attitude of the intelligent body according to the attitude information of the intelligent body and sends the image to the simulator for screen display.
Further, the device also comprises a data management module, which stores the sample data generated in the training mode and the inference mode and saves the trained agent models; the data management module is connected to the workstation module and to the deep reinforcement learning module respectively.
Further, the device also comprises a performance evaluation module, which is connected to the workstation module and used to evaluate training results.
Further, the deep reinforcement learning module trains the agent using the DDPG algorithm.
Further, the adaptation module captures on-screen display information from the simulator for recognition, using OCR to recognize images obtained from the on-screen display information and extract the situation information of the simulator-operated object.
The artificial-intelligence-based simulator opponent reinforcing device provided by the application constructs a simulated air-combat environment through the workstation module and interfaces with the flight simulator through the adaptation module, letting the flight simulator operate inside the constructed simulation environment. Through the deep reinforcement learning module, a strong opponent agent is generated by training, providing a powerful AI adversary for trainees. The advantages of this technical scheme are: (1) using a unified technical architecture, networked human-machine combat can be realized on many types of flight simulators; (2) with high-speed recognition of continuous images at its core, the aircraft state can be extracted simply from the video output of the simulator and fed into the intelligent air-combat engine, while the virtual enemy target in the engine is overlaid on the original simulator's display, upgrading a flight simulator into an intelligent air-combat simulator and raising the level of air-combat tactical training; (3) during human-machine combat, tactical information acquisition is completed automatically, further driving the continuous upgrading of the intelligent air-combat system.
Drawings
FIG. 1 is a schematic structural diagram of an artificial intelligence-based simulator opponent reinforcing device according to the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, an artificial-intelligence-based simulator opponent reinforcing device comprises a deep reinforcement learning module, a workstation module and an adaptation module, and has two working modes, a training mode and an inference mode, wherein:
in the training mode, the workstation module runs a self-play adversarial training engine and sends the data obtained from self-play adversarial training to the deep reinforcement learning module;
the deep reinforcement learning module uses the received data to train an agent;
in the inference mode, the workstation module is connected to the simulator through the adaptation module so that the simulator-operated object and the agent can engage in adversarial simulation; the adaptation module captures on-screen display information from the simulator for recognition, and sends the recognized situation information of the simulator-operated object to the workstation module;
the workstation module sends the situation information of the simulator-operated object and the situation information of the agent to the deep reinforcement learning module; the deep reinforcement learning module generates decision information and returns it to the workstation module; after executing the decision, the workstation module sends the agent's situation information to the adaptation module;
and the adaptation module generates an image containing the agent's situation from the agent's situation information and sends it to the simulator for on-screen display.
This embodiment is explained using pilot training as an example, but the method is equally applicable to reinforcing artificial-intelligence opponents in other simulators, for example in games; such cases are not described separately below. In this embodiment, the artificial-intelligence opponent is referred to as the agent, and the object manually operated on the simulator is referred to as the simulator-operated object. In pilot training, the agent takes part in adversarial drills as the Blue force in the air-combat confrontation, while the simulated aircraft operated by the trainee pilot acts as the Red force (the simulator-operated object).
In this embodiment the self-play adversarial training engine is built into the workstation module, though an external engine may also be connected. For pilot training the engine is a simulated air-combat engine that realizes the functions of air-combat simulation, covering elements such as the air-combat scenario and the aircraft, and allows the scenario and the aircraft type to be configured individually. The workstation module is the hub of the device and the bridge between all modules; it supports scenario selection, aircraft parameter configuration and so on, including scenario definition, scenario parameter editing, scenario loading, aircraft model definition, aircraft model parameter editing, aircraft model loading, and configuration of the win/lose rule parameters. A sketch of such a configuration follows.
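As an illustration only, the scenario, aircraft and rule parameters described above might be expressed as a parameter set like the following Python sketch; every key and value here is an assumption for illustration, not taken from the patent.

```python
# Hypothetical scenario / aircraft / rules configuration for the workstation
# module; all names and values are illustrative assumptions.
scenario_config = {
    "scenario": {"name": "2v2_air_combat", "red_aircraft": 2, "blue_aircraft": 2},
    "aircraft": {"max_speed_mps": 600.0, "radar_range_km": 120.0,
                 "missiles": 4, "decoy_flares": 2},
    "rules": {"max_duration_s": 1200, "win_condition": "all_opponents_destroyed"},
}
```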
Taking pilot training as the example, the deep reinforcement learning module integrates the algorithms relevant to air combat, including reinforcement learning and expert rules, and can realize classical air-combat tactics such as cross attack, unilateral attack and bilateral attack.
The deep reinforcement learning module of this embodiment trains the agent with the DDPG (Deep Deterministic Policy Gradient) algorithm; DDPG is a relatively mature algorithm and is not described in detail here. Let s_t be the combat situation observed by the agent in the current state, a_t the decision action executed by the agent under the current situation, r_t the immediate reward the agent obtains after executing the decision action, and s_{t+1} the combat situation returned by the environment in the next state after the action is executed. The agent interacts with the air-combat environment through a sequence of states, actions and rewards, with the goal of maximizing the cumulative reward. In the current state s_t, the agent computes a decision action a_t according to the policy μ, a = μ(s), and executes it in the simulation environment until the engagement ends.
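For concreteness, the following is a minimal sketch of the DDPG actor-critic update described above, written in PyTorch. The network sizes, names and hyperparameters are illustrative assumptions, not the patent's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):  # deterministic policy mu: s -> a
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(s_dim, 64), nn.ReLU(),
                                 nn.Linear(64, a_dim), nn.Tanh())
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):  # action-value function Q(s, a)
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(s_dim + a_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    s, a, r, s_next, done = batch  # tensors sampled from a replay buffer
    # Critic step: regress Q(s_t, a_t) toward r_t + gamma * Q'(s_{t+1}, mu'(s_{t+1})).
    with torch.no_grad():
        q_target = r + gamma * (1 - done) * target_critic(s_next, target_actor(s_next))
    critic_loss = F.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor step: ascend the critic's estimate of Q(s, mu(s)).
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    # Soft-update the target networks toward the online networks.
    for net, target in ((actor, target_actor), (critic, target_critic)):
        for p, tp in zip(net.parameters(), target.parameters()):
            tp.data.mul_(1 - tau).add_(tau * p.data)
```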
In the training mode, training can start from scratch or continue from a loaded existing model. The self-play adversarial training engine built into the workstation module provides the simulation environment for air combat, including scenario modeling, aircraft modeling and reward-function modeling. Scenario modeling establishes the organizational relationships among aircraft, such as 2v2 and 4v2 scenario assumptions and the design of engagement rules. Aircraft modeling comprises basic-element modeling and maneuverability modeling: basic-element modeling is an important part of constructing the simulation environment and covers the detection radar, electro-optical radar, missiles, decoy flares, jamming pods and so on; maneuverability modeling is the basis of building the air-combat simulation environment and designing tactics, and for the intelligent air-combat engine the state information of the virtual aircraft, such as position, angle, speed, radar state and missile state, is the basis on which decisions are made. The reward function is the standard by which the agent judges its own performance, and establishing a reasonable reward function R(s) is the key to reinforcement learning. The reward function should weigh attainability against sparsity and strike a balance between implementing tactics and encouraging exploration. Its shape is not unique, and for value optimization the distribution of positive and negative rewards should preferably be balanced over the range of its argument; a sketch follows.
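A minimal sketch of such a shaped reward function R(s) is shown below; the state keys, terms and weights are illustrative assumptions chosen so that a sparse terminal reward dominates while small dense terms ease exploration.

```python
def reward(state):
    """Illustrative shaped air-combat reward; the `state` keys are assumptions."""
    r = 0.0
    if state.get("enemy_destroyed"):
        r += 1.0            # sparse terminal reward: win
    if state.get("self_destroyed"):
        r -= 1.0            # sparse terminal penalty: loss
    # Small dense shaping terms, balancing attainability against sparsity.
    r += 0.01 * state.get("angle_advantage", 0.0)       # nose-on advantage in [-1, 1]
    r -= 0.001 * abs(state.get("range_error_km", 0.0))  # deviation from desired range
    return r
```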
The workstation module returns the data of self-play adversarial training, such as states and rewards, to the deep reinforcement learning module; the deep reinforcement learning module returns decision actions to the workstation module; and the workstation module executes the decisions, so that an ever stronger opponent agent is trained.
In the inference mode, the workstation module of this embodiment is connected to the simulator through the adaptation module; in the pilot-training example the simulator is a flight simulator. The adaptation module adapts to different simulators and mainly comprises an image recognition module and an image processing module. The pilot operates the flight simulator and usually observes, in real time through its vision system, his own aircraft's information together with the azimuth, distance and other information of enemy aircraft within radar range, and then performs the next flight operation. Likewise, the agent must obtain the situation information of the current air-combat environment in real time and output its next flight operation. However, because flight simulators are diverse and do not necessarily expose an interface for transmitting flight data to the outside, how to obtain the flight data from the flight simulator is a key problem.
In this embodiment, situation information such as the position and heading angle of the piloted aircraft is obtained by continuous high-speed recognition of the instrument-panel information in the pilot's vision system. During air combat, to give the pilot the feeling of real flight, the vision system renders the picture according to the current environment, control-stick movements and the aircraft's current motion state; as a result, strong light may make the instrument panel hard to read, and moving the control stick may occlude useful information. There are different countermeasures to this problem:
a) Split the main instrument panel onto its own screen, capture the screen image in real time, and use image recognition to obtain the information shown by the panel. Recognizing panel information is simple and accurate, and the acquired situation information is stable. The disadvantage is that the flight simulator must be able to display the instrument panel on a separate screen.
b) Recognize the vision system directly, i.e. the picture from the pilot's viewpoint. When useful information is occluded, it is recovered by analyzing the continuous data provided by continuous image recognition, using algorithms such as interpolation and prediction (see the sketch after this list). The advantage is that this does not depend on whether the flight simulator provides a corresponding external interface, so it is more universal. The disadvantages are that the situation information obtained is less stable, dedicated algorithms must be devised for each kind of picture anomaly, and a deep learning method with strong generalization would require a large number of training samples.
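The following is a minimal sketch of the interpolation idea from approach b): readings that recognition failed to produce (occluded frames) are filled in from the surrounding valid samples. It is numpy-only and purely illustrative.

```python
import numpy as np

def fill_occluded(readings):
    """readings: 1-D sequence of values, with np.nan where recognition failed."""
    values = np.asarray(readings, dtype=float)
    ok = ~np.isnan(values)
    # Linearly interpolate the missing samples from the valid neighbours.
    return np.interp(np.arange(len(values)), np.flatnonzero(ok), values[ok])

# e.g. fill_occluded([100.0, np.nan, np.nan, 130.0]) -> [100., 110., 120., 130.]
```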
The image recognition module of this embodiment recognizes images through OCR and extracts the situation information. OCR (optical character recognition) determines the approximate shapes in the image content by detecting patterns of light and dark, and then converts those shapes into computer text by character recognition. OCR typically performs the following operations on an image: 1) image preprocessing, such as binarization, to enhance the readability of the image; 2) detection, locating the regions of the picture that contain characters; 3) recognition of the characters within those regions. A sketch of this pipeline follows.
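As one possible realization of these three steps, the sketch below binarizes an assumed instrument-panel region with OpenCV and delegates detection and recognition to the Tesseract engine via pytesseract; the region coordinates and character whitelist are illustrative assumptions, not the patent's layout.

```python
import cv2
import pytesseract

def read_panel(frame_bgr, region=(50, 400, 300, 60)):
    x, y, w, h = region  # assumed location of one instrument readout
    gray = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    # 1) Preprocessing: Otsu binarization to enhance readability.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # 2) + 3) Detection and recognition, restricted to characters that
    # typically appear on instrument readouts.
    text = pytesseract.image_to_string(
        binary, config="--psm 7 -c tessedit_char_whitelist=0123456789.-")
    return text.strip()
```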
Commonly used detection methods include: a) connected-component-based methods; b) sliding-window-based methods; c) deep-learning-based methods. Connected-component-based methods assume that characters appear in the image as connected components, so they first extract all connected components in the image as candidates and then select the character components according to a classifier or rules. Sliding-window-based methods extract features from the image area covered by the sliding window and feed them into a pre-trained classifier to judge whether the current region contains characters. Deep-learning-based detection is generally used for more complex scenes: given a large amount of data in which text regions are labeled with boxes, a multi-layer learnable network fits the mapping from the data to the labels (boxes) by minimizing a predefined loss function.
The detected target regions are then recognized. In principle the methods divide roughly into two types: template matching, and feature extraction plus a classifier. Traditional feature extraction usually extracts edge information, grayscale features and other hand-crafted features of the target region, and commonly used classifiers include the support vector machine and nearest neighbor. The recently popular deep learning is another kind of feature-extraction-plus-classifier model: through an end-to-end design, a high-dimensional fitting space, a predefined loss function and a large amount of labeled data, it automatically extracts features that are easy to classify. Because it extracts features by itself, achieves higher recognition accuracy, and generalizes strongly, it is widely used.
In this embodiment, the image recognition module in the adaptation module recognizes the situation information of the simulator-operated object (the Red force) in the simulator and passes it to the workstation module, while the image processing module overlays the agent's (Blue force) situation information onto the radar situation picture and sends it to the simulator, thereby realizing human-machine confrontation.
The workstation module sends the situation information of the simulator-operated object and the situation information of the agent to the deep reinforcement learning module; the deep reinforcement learning module outputs a decision action for the current situation, generating decision information that it sends to the workstation module; after executing the decision, the workstation module updates the agent's state in the simulation environment and sends the agent's situation information to the adaptation module.
The adaptation module generates an image containing the agent's situation from the agent's situation information and sends it to the simulator for on-screen display; seeing the Blue force's situation on the screen, the pilot then makes his own human-machine game moves. One full inference cycle is sketched below.
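Putting the inference-mode data flow together, one cycle might look like the following sketch; the module interfaces (recognize_screen, decide, execute, overlay) are assumptions, since the text does not specify the device's APIs.

```python
def inference_step(adapter, workstation, drl_module):
    red = adapter.recognize_screen()       # OCR: situation of the simulator-operated object
    blue = workstation.agent_situation()   # current situation of the agent
    action = drl_module.decide(red, blue)  # decision from the trained agent
    blue = workstation.execute(action)     # update the agent in the simulation environment
    adapter.overlay(blue)                  # render the agent onto the simulator's screen
```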
In one embodiment, the device further comprises a data management module, which stores the sample data generated in the training mode and the inference mode and saves the trained agent models; the data management module is connected to the workstation module and to the deep reinforcement learning module respectively. In the training mode, the workstation module generates adversarial sample data through the built-in self-play adversarial training engine; the generated sample data is sent to the deep reinforcement learning module for training, while the same data is sent to the data management module for persistent storage. During training, the deep reinforcement learning module also sends the agent model to the data management module from time to time, according to how the model's training is converging, and the data management module stores and manages the models uniformly according to each model's description. In the inference mode, the workstation module receives the real-time adversarial data sent by the adaptation module, and while making decisions sends the data to the data management module for persistent storage.
The sample data, agent models and other artifacts generated during training and inference are managed uniformly by the data management module and can be stored, loaded, selected, copied and so on through the workstation module. A sketch of this persistence follows.
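A hedged sketch of these persistence duties is shown below; the file layout and formats are assumptions for illustration only.

```python
import json
import pathlib
import pickle
import time

class DataManager:
    """Stores adversarial samples from both modes plus agent-model checkpoints."""
    def __init__(self, root="runs"):
        self.root = pathlib.Path(root)
        self.root.mkdir(exist_ok=True)

    def save_samples(self, samples, mode):  # mode: "train" or "infer"
        path = self.root / f"{mode}_{int(time.time())}.jsonl"
        with path.open("w") as f:
            for sample in samples:          # one JSON record per line
                f.write(json.dumps(sample) + "\n")

    def save_model(self, model, description):  # checkpoint keyed by its description
        with (self.root / f"{description}.pkl").open("wb") as f:
            pickle.dump(model, f)
```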
In another embodiment, the device further comprises a performance evaluation module connected to the workstation module. During training and inference, the workstation module sends the situation information of both sides and the agent's decision information to the performance evaluation module, which evaluates the effectiveness of the agent's decisions and the situation and advantage of both sides from the received information, and visualizes the evaluation results.
The performance evaluation module helps the user better understand training progress and assess how the agent's training is converging; by modifying simulation-environment parameters and comparing the training results and the agent's win rate, the decision maker can see how much each parameter influences the final outcome and remedy the agent's weaknesses. The module can evaluate the combat effectiveness of each aircraft and judge the strength of the agent from the evaluation results, as sketched below.
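One concrete metric of the kind described, win rate compared across two environment-parameter settings, could be computed as in the sketch below (the names are illustrative assumptions).

```python
def win_rate(outcomes):
    """outcomes: list with 1 for an agent win, 0 for a loss or draw."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def compare_settings(outcomes_a, outcomes_b):
    """Compare how two environment-parameter settings affect the win rate."""
    return {"setting_a": win_rate(outcomes_a),
            "setting_b": win_rate(outcomes_b),
            "delta": win_rate(outcomes_a) - win_rate(outcomes_b)}
```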
The above embodiments express only several implementations of the present application; their description is comparatively specific and detailed, but should not therefore be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within its scope of protection. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (5)

1. An artificial-intelligence-based simulator opponent reinforcing device, characterized in that it comprises a deep reinforcement learning module, a workstation module and an adaptation module, and has two working modes, a training mode and an inference mode, wherein:
in the training mode, the workstation module runs a self-play adversarial training engine and sends the data obtained from self-play adversarial training to the deep reinforcement learning module;
the deep reinforcement learning module uses the received data to train an agent;
in the inference mode, the workstation module is connected to the simulator through the adaptation module so that the simulator-operated object and the agent can engage in adversarial simulation; the adaptation module captures on-screen display information from the simulator for recognition, and sends the recognized situation information of the simulator-operated object to the workstation module;
the workstation module sends the situation information of the simulator-operated object and the situation information of the agent to the deep reinforcement learning module; the deep reinforcement learning module generates decision information and returns it to the workstation module; after executing the decision, the workstation module sends the agent's situation information to the adaptation module;
and the adaptation module generates an image containing the agent's situation from the agent's situation information and sends it to the simulator for on-screen display.
2. The artificial-intelligence-based simulator opponent reinforcing device of claim 1, further comprising a data management module for saving the sample data generated in the training mode and the inference mode and for saving the trained agent model; the data management module is connected to the workstation module and to the deep reinforcement learning module respectively.
3. The artificial-intelligence-based simulator opponent reinforcing device of claim 1, further comprising a performance evaluation module connected to the workstation module for evaluating training results.
4. The artificial-intelligence-based simulator opponent reinforcing device of claim 1, wherein the deep reinforcement learning module trains the agent using the DDPG algorithm.
5. The artificial-intelligence-based simulator opponent reinforcing device of claim 1, wherein the adaptation module captures on-screen display information from the simulator for recognition, using OCR to recognize images obtained from the on-screen display information and extract the situation information of the simulator-operated object.
CN202010140651.6A 2020-03-03 2020-03-03 Simulator adversary reinforcing device based on artificial intelligence Pending CN111488992A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010140651.6A CN111488992A (en) 2020-03-03 2020-03-03 Simulator adversary reinforcing device based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010140651.6A CN111488992A (en) 2020-03-03 2020-03-03 Simulator adversary reinforcing device based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN111488992A true CN111488992A (en) 2020-08-04

Family

ID=71798144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010140651.6A Pending CN111488992A (en) 2020-03-03 2020-03-03 Simulator adversary reinforcing device based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN111488992A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560332A (en) * 2020-11-30 2021-03-26 北京航空航天大学 Aviation soldier system intelligent behavior modeling method based on global situation information
CN112836036A (en) * 2021-03-18 2021-05-25 中国平安人寿保险股份有限公司 Interactive training method, device, terminal and storage medium for intelligent agent
CN113298260A (en) * 2021-06-11 2021-08-24 中国人民解放军国防科技大学 Confrontation simulation deduction method based on deep reinforcement learning
CN115470710A (en) * 2022-09-26 2022-12-13 北京鼎成智造科技有限公司 Air game simulation method and device
KR20240048839A (en) * 2022-10-07 2024-04-16 엘아이지넥스원 주식회사 Virtual training apparatus capable of adaptive simulation of virtual target and training method using the same

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021754A (en) * 2017-12-06 2018-05-11 北京航空航天大学 A kind of unmanned plane Autonomous Air Combat Decision frame and method
CN109496318A (en) * 2018-07-30 2019-03-19 东莞理工学院 Adaptive game playing algorithm based on deeply study
CN109636699A (en) * 2018-11-06 2019-04-16 中国电子科技集团公司第五十二研究所 A kind of unsupervised intellectualized battle deduction system based on deeply study
CN109670596A (en) * 2018-12-14 2019-04-23 启元世界(北京)信息技术服务有限公司 Non-fully game decision-making method, system and the intelligent body under information environment
CN110428057A (en) * 2019-05-06 2019-11-08 南京大学 A kind of intelligent game playing system based on multiple agent deeply learning algorithm

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021754A (en) * 2017-12-06 2018-05-11 北京航空航天大学 A kind of unmanned plane Autonomous Air Combat Decision frame and method
CN109496318A (en) * 2018-07-30 2019-03-19 东莞理工学院 Adaptive game playing algorithm based on deeply study
CN109636699A (en) * 2018-11-06 2019-04-16 中国电子科技集团公司第五十二研究所 A kind of unsupervised intellectualized battle deduction system based on deeply study
CN109670596A (en) * 2018-12-14 2019-04-23 启元世界(北京)信息技术服务有限公司 Non-fully game decision-making method, system and the intelligent body under information environment
CN110428057A (en) * 2019-05-06 2019-11-08 南京大学 A kind of intelligent game playing system based on multiple agent deeply learning algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG Chunming (ed.), "Simulation and Test Technology for Air-Defense Missile Flight Control Systems" (《防空导弹飞行控制系统仿真测试技术》), China Astronautic Publishing House, 30 June 2014 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560332A (en) * 2020-11-30 2021-03-26 北京航空航天大学 Aviation soldier system intelligent behavior modeling method based on global situation information
CN112836036A (en) * 2021-03-18 2021-05-25 中国平安人寿保险股份有限公司 Interactive training method, device, terminal and storage medium for intelligent agent
CN112836036B (en) * 2021-03-18 2023-09-08 中国平安人寿保险股份有限公司 Interactive training method and device for intelligent agent, terminal and storage medium
CN113298260A (en) * 2021-06-11 2021-08-24 中国人民解放军国防科技大学 Confrontation simulation deduction method based on deep reinforcement learning
CN115470710A (en) * 2022-09-26 2022-12-13 北京鼎成智造科技有限公司 Air game simulation method and device
KR20240048839A (en) * 2022-10-07 2024-04-16 엘아이지넥스원 주식회사 Virtual training apparatus capable of adaptive simulation of virtual target and training method using the same
KR102718368B1 (en) * 2022-10-07 2024-10-16 엘아이지넥스원 주식회사 Virtual training apparatus capable of adaptive simulation of virtual target and training method using the same

Similar Documents

Publication Publication Date Title
CN111488992A (en) Simulator adversary reinforcing device based on artificial intelligence
CN110929394B (en) Combined combat system modeling method based on super network theory and storage medium
CN113705102B (en) Deduction simulation system, deduction simulation method, deduction simulation equipment and deduction simulation storage medium for sea-air cluster countermeasure
CN112131786A (en) Target detection and distribution method and device based on multi-agent reinforcement learning
CN105678030B (en) Divide the air-combat tactics team emulation mode of shape based on expert system and tactics tactics
CN110109653B (en) A kind of land warfare wargame intelligent engine and its operation method
KR102560798B1 (en) unmanned vehicle simulator
CN109597839B (en) Data mining method based on avionic combat situation
Zacharias et al. SAMPLE: Situation awareness model for pilot in-the-loop evaluation
CN113625569A (en) Small unmanned aerial vehicle prevention and control hybrid decision method and system based on deep reinforcement learning and rule driving
CN118194691A (en) Human experience guided unmanned aerial vehicle air combat method based on deep reinforcement learning
CN116861779A (en) Intelligent anti-unmanned aerial vehicle simulation system and method based on digital twinning
CN115185294B (en) QMIX-based aviation soldier multi-formation collaborative autonomous behavior decision modeling method
Madni et al. Augmenting MBSE with Digital Twin Technology: Implementation, Analysis, Preliminary Results, and Findings
CN113469853A (en) Method for accelerating command control of fighting and artificial intelligence device
US20240320551A1 (en) Autonomous virtual entities continuously learning from experience
KR101345645B1 (en) Simulation System And Method for War Game
Liang et al. A conception of flight test mode for future intelligent cockpit
Wang et al. Research on naval air defense intelligent operations on deep reinforcement learning
CN115909027A (en) Situation estimation method and device
Xiaochao et al. A cgf behavior decision-making model based on fuzzy bdi framework
Bisantz et al. Validating methods in cognitive engineering: a comparison of two work domain models
Dimitriu et al. A Reinforcement Learning Approach to Military Simulations in Command: Modern Operations
Kushnier et al. Situation Assessment Through Collaborative Human‐Computer Interaction
He et al. Knowledge Graph Construction of System Capability in the Simulation Training Commanded with Electronic Countermeasures

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200804