CN116700020B - Control method and system for unmanned aerial vehicle with variable sweepback wings, unmanned aerial vehicle and storage medium - Google Patents
- Publication number
- CN116700020B (application CN202311003145.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- aerial vehicle
- unmanned aerial
- sweepback
- wing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 103
- 238000012549 training Methods 0.000 claims abstract description 43
- 230000009471 action Effects 0.000 claims description 92
- 238000011156 evaluation Methods 0.000 claims description 56
- 230000008569 process Effects 0.000 claims description 37
- 238000001514 detection method Methods 0.000 claims description 35
- 230000006870 function Effects 0.000 claims description 24
- 238000004422 calculation algorithm Methods 0.000 claims description 23
- 230000033001 locomotion Effects 0.000 claims description 23
- 238000013528 artificial neural network Methods 0.000 claims description 12
- 230000005484 gravity Effects 0.000 claims description 10
- 238000012545 processing Methods 0.000 claims description 9
- 230000002159 abnormal effect Effects 0.000 claims description 5
- 238000010276 construction Methods 0.000 claims description 4
- 238000005516 engineering process Methods 0.000 abstract description 2
- 230000001133 acceleration Effects 0.000 description 17
- 238000004364 calculation method Methods 0.000 description 12
- 238000010586 diagram Methods 0.000 description 12
- 230000002787 reinforcement Effects 0.000 description 10
- 230000008859 change Effects 0.000 description 5
- 238000004891 communication Methods 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 3
- 238000004590 computer program Methods 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 230000006978 adaptation Effects 0.000 description 2
- 238000011217 control strategy Methods 0.000 description 2
- 230000001186 cumulative effect Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000009795 derivation Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000009189 diving Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 230000007306 turnover Effects 0.000 description 1
- 238000004804 winding Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T90/00—Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
Abstract
The application discloses a control method and a control system for a variable-sweep-wing unmanned aerial vehicle, an unmanned aerial vehicle and a storage medium, and belongs to the technical field of unmanned aerial vehicle control. The control method for the variable-sweep-wing unmanned aerial vehicle comprises the following steps: establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle; acquiring initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into a sweep-angle decision model; training the sweep-angle decision model so as to update network parameters of a policy network and an evaluation network; and if a flight mission is received, acquiring current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission, inputting the current flight state parameters into the sweep-angle decision model to obtain a target sweep angle, and adjusting the wings of the variable-sweep-wing unmanned aerial vehicle according to the target sweep angle. The application can automatically adjust the sweep angle of the variable-sweep-wing unmanned aerial vehicle according to the external environment, and improves mission execution efficiency and flight performance.
Description
Technical Field
The application relates to the technical field of unmanned aerial vehicle control, in particular to a control method and system for a variable-sweep-wing unmanned aerial vehicle, an unmanned aerial vehicle and a storage medium.
Background
The variable-sweep-wing unmanned aerial vehicle is a widely used type of morphing unmanned aerial vehicle. A rotation unit provides rotating power to the wings, so that the sweep angle of the wings can be controlled during flight and the airfoil structure and aerodynamic profile required in different flight stages can be realized, thereby improving mission execution efficiency and flight performance.
In the related art, the process of executing a flight mission by the variable-sweep-wing unmanned aerial vehicle is divided into several flight phases, and a corresponding sweep angle is set for each phase. However, because of uncertainty in the external environment (wind direction, wind speed, air pressure), adjusting the sweep angle only according to the flight phase cannot keep the variable-sweep-wing unmanned aerial vehicle at excellent aerodynamic and flight performance.
Therefore, how to automatically adjust the sweep angle of the variable-sweep-wing unmanned aerial vehicle according to the external environment and thereby improve mission execution efficiency and flight performance is a technical problem that persons skilled in the art currently need to solve.
Disclosure of Invention
The aim of the application is to provide a control method and system for a variable-sweep-wing unmanned aerial vehicle, an unmanned aerial vehicle and a storage medium, which can automatically adjust the sweep angle of the variable-sweep-wing unmanned aerial vehicle according to the external environment and improve mission execution efficiency and flight performance.
In order to solve the above technical problem, the application provides a control method for a variable-sweep-wing unmanned aerial vehicle, comprising the following steps:
establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle; the unmanned aerial vehicle kinematic model is used for describing the influence of the sweep angle on flight speed, flight altitude and flight attitude;
acquiring initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into a sweep-angle decision model to obtain a target action, a reward value and new flight state parameters; the sweep-angle decision model comprises a policy network and an evaluation network;
constructing a data set containing the initial flight state parameters, the target action, the reward value and the new flight state parameters;
training the sweep-angle decision model with the data set so as to update network parameters of the policy network and the evaluation network;
and if a flight mission is received, acquiring current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission, inputting the current flight state parameters into the sweep-angle decision model to obtain a target sweep angle, and adjusting the wings of the variable-sweep-wing unmanned aerial vehicle according to the target sweep angle.
Optionally, updating network parameters of the policy network and the evaluation network comprises:
updating the current network parameters of the policy network and the evaluation network through a gradient back-propagation algorithm for neural networks.
Optionally, inputting the initial flight state parameters into the sweep-angle decision model to obtain a target action, a reward value and new flight state parameters comprises:
inputting the initial flight state parameters into the policy network of the sweep-angle decision model, so that the policy network selects the target action according to the current policy;
and calculating the reward value and the new flight state parameters corresponding to the target action by using the evaluation network of the sweep-angle decision model.
Optionally, the method further comprises:
constructing an action set for the policy network; the action set describes the correspondence between the target action selected by the policy network and the sweep angle, the target action selected by the policy network is a control voltage, and the control voltage is used for adjusting the sweep angle of the wings of the variable-sweep-wing unmanned aerial vehicle.
Optionally, training the sweep-angle decision model with the data set comprises:
training the sweep-angle decision model with the data set in an off-policy learning manner.
Optionally, training the sweep-angle decision model with the data set so as to update network parameters of the policy network and the evaluation network comprises:
selecting a group of parameter quadruples from the data set; each parameter quadruple comprises initial flight state parameters and the target action, reward value and new flight state parameters corresponding to those initial flight state parameters;
training the sweep-angle decision model with the selected parameter quadruples so as to update network parameters of the policy network and the evaluation network;
judging whether the total model reward of the sweep-angle decision model meets an iteration termination condition;
and if not, returning to the step of selecting a group of parameter quadruples from the data set.
Optionally, before inputting the initial flight state parameters into the sweep-angle decision model, the method further comprises:
randomly initializing network parameters of the evaluation network and the policy network in the sweep-angle decision model.
The application also provides a control system for a variable-sweep-wing unmanned aerial vehicle, comprising:
a kinematic modeling module, configured to establish an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle, the unmanned aerial vehicle kinematic model being used for describing the influence of the sweep angle on flight speed, flight altitude and flight attitude;
a parameter processing module, configured to acquire initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and to input the initial flight state parameters into a sweep-angle decision model to obtain a target action, a reward value and new flight state parameters, the sweep-angle decision model comprising a policy network and an evaluation network;
a data set construction module, configured to construct a data set containing the initial flight state parameters, the target action, the reward value and the new flight state parameters;
a training module, configured to train the sweep-angle decision model with the data set so as to update network parameters of the policy network and the evaluation network;
and a wing control module, configured to, if a flight mission is received, acquire current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission, input the current flight state parameters into the sweep-angle decision model to obtain a target sweep angle, and adjust the wings of the variable-sweep-wing unmanned aerial vehicle according to the target sweep angle.
The application also provides an unmanned aerial vehicle, comprising a controller, a wing sweep-rotation unit and wings;
the controller is configured to acquire current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes a flight mission, input the current flight state parameters into a sweep-angle decision model to obtain a target sweep angle, and control the wing sweep-rotation unit to adjust the wings according to the target sweep angle;
the training process of the sweep-angle decision model comprises:
establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle; the unmanned aerial vehicle kinematic model is used for describing the influence of the sweep angle on flight speed, flight altitude and flight attitude;
acquiring initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into the sweep-angle decision model to obtain a target action, a reward value and new flight state parameters; the sweep-angle decision model comprises a policy network and an evaluation network;
constructing a data set containing the initial flight state parameters, the target action, the reward value and the new flight state parameters;
and training the sweep-angle decision model with the data set to update network parameters of the policy network and the evaluation network.
Further, the unmanned aerial vehicle also comprises an altitude detection module, a speed detection module and a flight-range detection module that are respectively connected to the controller, and the controller is configured to determine the current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission according to detection data uploaded by the altitude detection module, the speed detection module and the flight-range detection module.
The application also provides a storage medium on which a computer program is stored, and the control method for the variable-sweep-wing unmanned aerial vehicle is implemented when the computer program is executed.
The application provides a control method for a variable-sweep-wing unmanned aerial vehicle, comprising: establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle, the unmanned aerial vehicle kinematic model being used for describing the influence of the sweep angle on flight speed, flight altitude and flight attitude; acquiring initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into a sweep-angle decision model to obtain a target action, a reward value and new flight state parameters, the sweep-angle decision model comprising a policy network and an evaluation network; constructing a data set containing the initial flight state parameters, the target action, the reward value and the new flight state parameters; training the sweep-angle decision model with the data set so as to update network parameters of the policy network and the evaluation network; and if a flight mission is received, acquiring current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission, inputting the current flight state parameters into the sweep-angle decision model to obtain a target sweep angle, and adjusting the wings of the variable-sweep-wing unmanned aerial vehicle according to the target sweep angle.
According to this method, an unmanned aerial vehicle kinematic model is established from the structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle, and initial flight state parameters are then obtained based on the kinematic model. The application also introduces a sweep-angle decision model comprising a policy network and an evaluation network, uses it to process the initial flight state parameters to obtain a target action, a reward value and new flight state parameters, and trains the sweep-angle decision model with a data set containing these quantities, so that the sweep-angle decision model can automatically adjust the wing sweep angle according to the flight state of the variable-sweep-wing unmanned aerial vehicle. The application can thus automatically adjust the sweep angle of the variable-sweep-wing unmanned aerial vehicle according to the external environment and improve mission execution efficiency and flight performance. The application also provides a control system for the variable-sweep-wing unmanned aerial vehicle, a storage medium and the unmanned aerial vehicle, which have the same beneficial effects and are not described here again.
Drawings
For a clearer description of embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
fig. 1 is a flowchart of a control method for a variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a variable-sweep-wing unmanned aerial vehicle at a sweep angle of 0° according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a variable-sweep-wing unmanned aerial vehicle at a sweep angle of 45° according to an embodiment of the present application;
fig. 4 is a schematic plan view of a variable-sweep-wing unmanned aerial vehicle before and after sweep-angle conversion according to an embodiment of the present application;
fig. 5 is a partial schematic view of a variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application;
fig. 6 is a partial schematic view of another variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application;
fig. 7 is a schematic diagram of a reinforcement learning control framework according to an embodiment of the present application;
fig. 8 is a schematic diagram of a sweep-angle decision model based on the MADDPG algorithm according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a control system for a variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application.
In the above figures, 1 denotes a wing, 2 denotes a connecting rod-wing connecting shaft, 3 denotes a connecting rod, 4 denotes a driver rocker arm, 5 denotes a driver, 6 denotes a driver mounting seat, 7 denotes a wing rotating shaft, and 8 denotes a fuselage and a wing support.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, fig. 1 is a flowchart of a control method for a variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application.
The specific steps may include:
s101: establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable swept-back unmanned aerial vehicle;
the embodiment can be applied to a controller of the variable sweep wing unmanned aerial vehicle, and can also be applied to electronic equipment connected with the variable sweep wing unmanned aerial vehicle. The structural parameters and aerodynamic parameters of the unmanned aerial vehicle with the variable swept wings can be obtained before the step, wherein the structural parameters can comprise parameters such as the size, the weight, the shape and the like of each part such as the wings, the engine body and the like, and the aerodynamic parameters comprise parameters such as aerodynamic force, wind resistance, lift force, engine thrust, wing acceleration, wing rotation speed, flight height, flight attitude, sweep angle and the like.
The aerodynamic parameters can be obtained according to the flight state log of the variable sweep unmanned aerial vehicle and are used for describing actual parameters of the variable sweep unmanned aerial vehicle in the flight process.
Based on the obtained structural parameters and aerodynamic parameters, a mass center motion equation, a translational dynamics equation, a rotational dynamics equation, a longitudinal dynamics equation and a sweepback angle state equation of the variable sweepback unmanned aerial vehicle can be generated, and an unmanned aerial vehicle kinematic model can be established based on the equations, wherein the unmanned aerial vehicle kinematic model describes the corresponding relation of the flight speed, the flight height, the flight attitude and the sweepback angle of the variable sweepback unmanned aerial vehicle in the flight process, namely the influence of the sweepback angle on the flight speed, the flight height and the flight attitude respectively.
As a possible implementation manner, the embodiment may perform longitudinal dynamics modeling according to the structural parameters and the aerodynamic parameters to obtain the unmanned aerial vehicle kinematic model.
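By way of illustration only, a minimal Python sketch of such a longitudinal kinematic model is given below; the state layout, coefficient values and the sweep-angle dependence of the aerodynamic coefficients are assumptions made for illustration and are not the model actually used by the application.

```python
import math

RHO = 1.225     # air density [kg/m^3]
S_REF = 0.5     # reference wing area [m^2] (assumed)
MASS = 8.0      # aircraft mass [kg] (assumed)
G = 9.81        # gravitational acceleration [m/s^2]

def aero_coeffs(sweep_deg: float, alpha_rad: float):
    """Toy lift/drag coefficients that vary with sweep angle and angle of attack."""
    sweep_rad = math.radians(sweep_deg)
    cl = 2.0 * math.pi * alpha_rad * math.cos(sweep_rad)     # lift falls as sweep grows
    cd = 0.02 + 0.05 * cl ** 2 + 0.01 * math.sin(sweep_rad)  # parasitic + induced + sweep term
    return cl, cd

def step(state: dict, sweep_deg: float, thrust: float, dt: float) -> dict:
    """Advance speed v, altitude h and flight-path angle gamma by one step (point-mass model)."""
    v, h, gamma, alpha = state["v"], state["h"], state["gamma"], state["alpha"]
    q = 0.5 * RHO * v ** 2                       # dynamic pressure
    cl, cd = aero_coeffs(sweep_deg, alpha)
    lift, drag = q * S_REF * cl, q * S_REF * cd
    v_dot = (thrust - drag) / MASS - G * math.sin(gamma)
    gamma_dot = lift / (MASS * v) - (G / v) * math.cos(gamma)   # assumes v > 0
    return {"v": v + v_dot * dt,
            "h": h + v * math.sin(gamma) * dt,
            "gamma": gamma + gamma_dot * dt,
            "alpha": alpha}

state = {"v": 30.0, "h": 500.0, "gamma": 0.0, "alpha": math.radians(3.0)}
state = step(state, sweep_deg=30.0, thrust=20.0, dt=0.02)
```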
S102: acquiring initial flight state parameters of the variable-sweep-wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into a sweep-angle decision model to obtain a target action, a reward value and new flight state parameters;
This embodiment can determine the flight state parameters of the variable-sweep-wing unmanned aerial vehicle at a plurality of moments based on the unmanned aerial vehicle kinematic model, and take the flight state parameters determined by the unmanned aerial vehicle kinematic model as the initial flight state parameters to be input into the sweep-angle decision model, where the initial flight state parameters include flight speed, flight altitude and flight attitude. This embodiment also uses the unmanned aerial vehicle kinematic model to calculate the preset sweep angle (for example 16° or 60°) corresponding to the initial flight state parameters.
A sweep-angle decision model can also be constructed before this step. The sweep-angle decision model can be a reinforcement learning model based on the MADDPG (Multi-Agent Deep Deterministic Policy Gradient) algorithm. The reinforcement learning model can include a policy network (actor) and an evaluation network (critic).
After the sweep-angle decision model is constructed, this embodiment can randomly initialize the network parameters of the evaluation network and the policy network in the sweep-angle decision model, and then input the initial flight state parameters into the sweep-angle decision model.
The sweep-angle decision model can process the initial flight state parameters according to the policy corresponding to the current network parameters to obtain a target action, a reward value and new flight state parameters. The target action is a control voltage of the variable-sweep-wing unmanned aerial vehicle, and the control voltage is used for adjusting the sweep angle of the wings, that is, each target action has a corresponding sweep angle. The reward value describes the degree of difference between the new sweep angle reached after the target action is executed and the preset sweep angle; the smaller the difference, the larger the reward value. The new flight state parameters are the flight state parameters after the target action is executed, and can be calculated by the sweep-angle decision model by substituting the new sweep angle into the unmanned aerial vehicle kinematic model.
S103: constructing a data set containing the initial flight state parameters, the target action, the reward value and the new flight state parameters;
In this step, a data set can be constructed based on the acquired initial flight state parameters and the calculated target action, reward value and new flight state parameters.
As a possible implementation, this embodiment can obtain a plurality of initial flight state parameters from the unmanned aerial vehicle kinematic model, and then calculate the target action, reward value and new flight state parameters corresponding to each set of initial flight state parameters. Each set of initial flight state parameters and its corresponding target action, reward value and new flight state parameters are taken as a parameter quadruple, and the data set is constructed at the granularity of parameter quadruples, that is, the data set contains a plurality of parameter quadruples.
S104: training the sweep-angle decision model with the data set so as to update network parameters of the policy network and the evaluation network;
After the data set is obtained, it can be used to train the sweep-angle decision model so as to update the network parameters of the policy network and the evaluation network and optimize the policy of the sweep-angle decision model. Specifically, this embodiment can train the sweep-angle decision model with the data set in an off-policy learning manner, and can train the model iteratively until a preset iteration stopping condition is reached.
As a possible implementation, this embodiment can update the current network parameters of the policy network and the evaluation network through the gradient back-propagation algorithm for neural networks.
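By way of illustration only, the following Python sketch shows a single-agent DDPG-style actor-critic update by gradient back-propagation; the MADDPG algorithm used in this application extends this scheme to multiple agents whose evaluation networks observe joint states and actions, and the network sizes, learning rates and the omission of target networks here are simplifying assumptions.

```python
import torch
import torch.nn as nn

# Minimal single-agent actor-critic update; batch tensors come from the quadruple
# data set, with the action a and reward r shaped as column tensors.
state_dim, action_dim, gamma = 3, 1, 0.99

actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(batch):
    s, a, r, s2 = batch
    # In full DDPG/MADDPG the bootstrap target below uses the softly updated
    # target-network copies (see formula (17) later); they are omitted for brevity.
    with torch.no_grad():
        q_next = critic(torch.cat([s2, actor(s2)], dim=1))
        q_target = r + gamma * q_next
    # Evaluation network (critic): minimise the TD error by back-propagation.
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Policy network (actor): maximise the critic's value of its own actions.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```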
S105: if a flight mission is received, acquiring current flight state parameters while the variable-sweep-wing unmanned aerial vehicle executes the flight mission, inputting the current flight state parameters into the sweep-angle decision model to obtain a target sweep angle, and adjusting the wings of the variable-sweep-wing unmanned aerial vehicle according to the target sweep angle.
After receiving a flight mission, the variable-sweep-wing unmanned aerial vehicle can fly according to the flight mission and acquire the current flight state parameters at a preset period during flight, so that the current flight state parameters are input into the trained sweep-angle decision model to obtain the target sweep angle. After the target sweep angle is obtained, the variable-sweep-wing unmanned aerial vehicle adjusts its wings so that the current sweep angle reaches the target sweep angle.
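By way of illustration only, the online control loop during mission execution can be sketched as follows; the function names read_flight_state, policy and set_sweep_voltage and the sampling period are placeholders, not part of the application.

```python
import time

CONTROL_PERIOD_S = 0.1   # preset acquisition period for flight state parameters (assumed)

def control_loop(policy, read_flight_state, set_sweep_voltage, mission_active):
    """Run the trained sweep-angle decision model online during a flight mission."""
    while mission_active():
        state = read_flight_state()      # current flight speed, altitude and attitude
        voltage = policy(state)          # trained policy network outputs the control voltage
        set_sweep_voltage(voltage)       # the wing sweep-rotation unit drives the wings
        time.sleep(CONTROL_PERIOD_S)
```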
In this embodiment, an unmanned aerial vehicle kinematic model is established according to the structural parameters and aerodynamic parameters of the variable-sweep-wing unmanned aerial vehicle, and initial flight state parameters are then obtained based on the kinematic model. This embodiment also introduces a sweep-angle decision model comprising a policy network and an evaluation network, uses it to process the initial flight state parameters to obtain a target action, a reward value and new flight state parameters, and trains the sweep-angle decision model with a data set containing these quantities, so that the sweep-angle decision model can autonomously adjust the wing sweep angle according to the flight state of the variable-sweep-wing unmanned aerial vehicle. This embodiment can therefore automatically adjust the sweep angle of the variable-sweep-wing unmanned aerial vehicle according to the external environment, improving mission execution efficiency and flight performance.
As a further introduction to the embodiment corresponding to fig. 1, the sweep-angle decision model can process the initial flight state parameters as follows: the initial flight state parameters are input into the policy network of the sweep-angle decision model, so that the policy network selects the target action according to the current policy; the reward value and the new flight state parameters corresponding to the target action are then calculated by the evaluation network of the sweep-angle decision model.
Because the current policy in the policy network selects the target action to be output from an action set, this embodiment can also construct an action set for the policy network. The action set describes the correspondence between the target action selected by the policy network and the sweep angle; the target action selected by the policy network is a control voltage, and the control voltage is used for adjusting the sweep angle of the wings of the variable-sweep-wing unmanned aerial vehicle.
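By way of illustration only, the correspondence between the control-voltage action and the wing sweep angle can be sketched as a simple proportional mapping, in line with the relation u = k·χ used later in the description; the coefficient value and the sweep-angle range are assumptions.

```python
K_VOLT_PER_DEG = 0.1           # proportionality coefficient k between voltage and sweep angle (assumed)
SWEEP_RANGE_DEG = (0.0, 60.0)  # admissible sweep-angle range (assumed)

def action_to_sweep(voltage: float) -> float:
    """Map a control-voltage action from the action set to the commanded wing sweep angle."""
    sweep = voltage / K_VOLT_PER_DEG
    return min(max(sweep, SWEEP_RANGE_DEG[0]), SWEEP_RANGE_DEG[1])

def sweep_to_action(sweep_deg: float) -> float:
    """Inverse mapping, used to build the action set from preset sweep angles."""
    return K_VOLT_PER_DEG * sweep_deg
```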
As a further introduction to the embodiment corresponding to fig. 1, the process of iteratively training the sweep-angle decision model includes:
Step 1: selecting a group of parameter quadruples from the data set;
each parameter quadruple comprises initial flight state parameters and the target action, reward value and new flight state parameters corresponding to those initial flight state parameters;
Step 2: training the sweep-angle decision model with the selected parameter quadruples so as to update the network parameters of the policy network and the evaluation network;
Step 3: judging whether the total model reward of the sweep-angle decision model meets the iteration termination condition; if so, the sweep-angle decision model is judged to be fully trained and the process ends; if not, return to Step 1. A sketch of this loop is given below.
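By way of illustration only, Steps 1 to 3 can be organised as the following Python loop; the batch size, reward target and iteration limit are assumptions.

```python
BATCH_SIZE = 64               # number of quadruples selected per iteration (assumed)
TOTAL_REWARD_TARGET = 200.0   # iteration-termination condition on the total model reward (assumed)
MAX_ITERATIONS = 10_000

def train(dataset, update, evaluate_total_reward):
    for _ in range(MAX_ITERATIONS):
        batch = dataset.sample(BATCH_SIZE)   # Step 1: select a group of parameter quadruples
        update(batch)                        # Step 2: update the policy and evaluation networks
        if evaluate_total_reward() >= TOTAL_REWARD_TARGET:
            break                            # Step 3: termination condition met, training ends
```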
The flow described in the above embodiment is explained below by way of an embodiment in practical application.
Referring to fig. 2 and fig. 3, fig. 2 is a schematic structural diagram of the variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application at a sweep angle of 0°, and fig. 3 is a schematic structural diagram at a sweep angle of 45°. Referring to fig. 4, fig. 4 is a schematic plan view of the variable-sweep-wing unmanned aerial vehicle before and after sweep-angle conversion, showing the process of changing the sweep angle from 0° to 45°. In this embodiment, the number of wings is 2, the number of connecting rod-wing connecting shafts is 2, and the number of connecting rods is 2. The two ends of the fuselage and wing support are respectively connected with the two wings through wing rotating shafts; the driver, supplied with power, drives the driver rocker arm to rotate, and the wings are driven to rotate around the wing rotating shafts through the cooperation of the driver rocker arm, the connecting rods and the connecting rod-wing connecting shafts.
The connecting rod-wing connecting shaft is fixed at the wing end and connected with the connecting rod, realizing mechanical control of the wing sweep change; one end of the connecting rod is connected with the driver rocker arm and the other end with the wing, realizing mechanical control of the wing sweep change; the driver rocker arm is mounted on the driver and is driven by the driver to rotate, realizing mechanical control of the wing sweep change; the driver is the mechanical power source for the wing sweep change; the driver mounting seat is fixedly connected with the fuselage and the wing support and is used for fixing the driver; the wing rotating shaft is fixed at the wing end and connected with the fuselage and wing support; the fuselage and wing support may be mounted inside the fuselage.
Referring to fig. 5 and fig. 6, fig. 5 is a partial schematic view of a variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application, and fig. 6 is a partial schematic view of another variable-sweep-wing unmanned aerial vehicle according to an embodiment of the present application; fig. 5 shows the positional relationship between the connecting rod-wing connecting shaft and the wing rotating shaft, and fig. 6 shows the positional relationship between the driver mounting seat and the fuselage and wing support.
The wing control device of the variable-sweep-wing unmanned aerial vehicle comprises a wing sweep-rotation unit, a limit sensing unit, a communication unit, an auxiliary navigation unit and a controller. The wing sweep-rotation unit comprises a rotating shaft, a fuselage and wing support, connecting rods and a driver. The rotating shaft is fixed at the wing end and connected with the fuselage support; one end of each connecting rod is connected with the driver rocker arm and the other end with the wing. The driver comprises a motor, a transmission part and a rocker arm; the rocker arm of the driver is connected with the connecting rod, and the acting force of the driver is applied to the wing to produce the deformation action. The driver is connected with the wing through the connecting rod and the rotating shaft; the connecting rod and the driver are both mounted on the wing sweep-rotation unit, the two power-input ends of the driver are electrically connected with the unmanned aerial vehicle power supply system, and the control-signal input end of the driver is electrically connected with the controller. The limit sensing unit is fixed to the fuselage, its two power-input ends are electrically connected with the unmanned aerial vehicle power supply system, and its feedback-signal output end is electrically connected with the controller. The communication unit comprises a data processing module and a communication control module; its two power-input ends are electrically connected with the unmanned aerial vehicle power supply system, and its signal ends are electrically connected with the controller. The auxiliary navigation unit comprises an altitude-and-speed calculation module, a navigation attitude detection module and a navigation route module; its two power-input ends are electrically connected with the unmanned aerial vehicle power supply system, and its signal ends are electrically connected with the controller. The power input end of the controller is electrically connected with the unmanned aerial vehicle power supply system, and its signal end is electrically connected with the unmanned aerial vehicle mission control system.
The driver is a driving module that converts a wing-sweep control instruction into a wing rotation force and outputs it, and the limit sensing unit is a limit-sensing-signal calculation and feedback module for acquiring the wing sweep angle. The data processing module is a component that receives control instructions from the controller and, after completing data processing, stores and transmits the driver data through an SPI (Serial Peripheral Interface) interface. The altitude-and-speed calculation module processes the data of the acceleration sensor, the air pressure sensor and the inertial measurement unit, together with the aircraft angular velocity, the radio altimeter and the GPS (Global Positioning System) receiving module, and applies the corresponding correction method to obtain accurately updated acceleration, altitude and speed. The navigation attitude detection module is a solving module that processes the unmanned aerial vehicle acceleration, angular velocity and altitude and the position information acquired by the GPS receiving module according to a Kalman filtering algorithm to obtain the navigation attitude information of the unmanned aerial vehicle.
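By way of rough illustration of this kind of Kalman-based fusion (the application does not specify the exact filter), a minimal one-dimensional Kalman update for altitude is sketched below; the process and measurement variances and the measurement model are assumptions.

```python
def kalman_altitude(h_est, p_est, v_z, dt, h_meas, q_process=0.05, r_meas=1.0):
    """Fuse a predicted altitude (integrated vertical speed) with a barometric/radio measurement."""
    # Prediction: propagate the altitude estimate and inflate its uncertainty.
    h_pred = h_est + v_z * dt
    p_pred = p_est + q_process
    # Update: blend prediction and measurement with the Kalman gain.
    k_gain = p_pred / (p_pred + r_meas)
    h_new = h_pred + k_gain * (h_meas - h_pred)
    p_new = (1.0 - k_gain) * p_pred
    return h_new, p_new

h, p = 500.0, 1.0
h, p = kalman_altitude(h, p, v_z=-2.0, dt=0.1, h_meas=499.7)
```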
The variable-sweep-wing unmanned aerial vehicle is further provided with an altitude-and-speed calculation module for confirming the spatial position of the controlled unmanned aerial vehicle. The altitude-and-speed calculation module comprises an acceleration sensor, a gyroscope assembly (i.e. an inertial measurement unit) and an air pressure sensor; different sensors and components can be integrated according to the module functions and optimally designed. The module is driven through the IIC (Inter-Integrated Circuit) protocol and is connected with the SDA (data line) port of the air pressure sensor through the IIC interface. The flight state data provided by the altitude-and-speed calculation module are used to help confirm the changes in flight speed, flight altitude and so on caused by the deformation control command.
The variable-sweep-wing unmanned aerial vehicle is further provided with a navigation attitude detection module, which is an inertial measurement device that can output three-dimensional attitude data, three-dimensional angular velocity, three-dimensional acceleration and the three-dimensional magnetic field. The navigation attitude detection module is used to judge the heading and azimuth information of the controlled unmanned aerial vehicle, to provide real-time heading and attitude information, and to assist detection so that the controlled unmanned aerial vehicle provides the necessary attitude feedback for the inner loop of the control system during wing sweep changes. The navigation attitude detection module comprises sensors such as a gyroscope, an accelerometer and a magnetometer, a radio altitude detection element, a navigation attitude information display element, a navigation attitude solving module and a positioning board unit (i.e. the GPS receiving module). The navigation attitude detection module takes a main control chip as its core, communicates with the MCU (Microcontroller Unit) through an SPI communication interface, and communicates with the positioning module through a UART (Universal Asynchronous Receiver/Transmitter) communication interface. The navigation attitude detection module detects attitude information in the dynamic state of the unmanned aerial vehicle, including data such as roll and pitch angles, deflection angle, angular velocity and acceleration. The navigation attitude detection module can be powered at 3.3 V; because of temperature effects, devices such as the gyroscope, accelerometer and magnetometer in the module can be calibrated and tested one by one over the temperature range and temperature-compensated, thereby improving the detection accuracy. Designing and integrating the navigation attitude detection as an independent module improves the reliability and stability with which the unmanned aerial vehicle system acquires navigation attitude state information.
The variable-sweep-wing unmanned aerial vehicle not only can perform wing-morphing flight with a variable sweep angle, but also can fly in a straight-wing state with a fixed sweep angle. Without changing other structures such as the fuselage and the wings, this embodiment combines learning of the flight control of the unmanned aerial vehicle in the variable-sweep flight state with learning of altitude and attitude flight control, realizing flight control of the unmanned aerial vehicle in different flight phases and improving flight performance. In this embodiment, the fuselage and wings of the variable-sweep-wing unmanned aerial vehicle are geometrically symmetric about the longitudinal plane, the mass and density distribution is uniform and unchanged, and the wings on the left and right sides deform synchronously.
In this embodiment, a wing sweep-rotation unit is added at the wing-fuselage junction on the basis of a fixed-wing unmanned aerial vehicle, realizing control of the wing sweep angle during flight. Meanwhile, the change in wing chord length caused by the wing deformation process, the resulting change in drag and the change in angle of attack are handled cooperatively by the wing ailerons, which counteract the rolling moment caused by the asymmetry of the left and right lift during deformation, so that the longitudinal control of the unmanned aerial vehicle is more convenient and the flight performance is improved.
This embodiment adopts a reinforcement-learning-based control method for the variable-sweep-wing unmanned aerial vehicle to solve the problem of autonomous deformation decision-making in the dive and maneuvering phases, so that the unmanned aerial vehicle can autonomously select the corresponding preset sweep angle according to its flight state, effectively realize variable-sweep-wing control, and obtain optimal flight performance in different flight environments. The reinforcement-learning-based control method comprises a modeling process for the unmanned aerial vehicle kinematic model and a reinforcement learning process.
The process of obtaining the unmanned aerial vehicle kinematic model by carrying out longitudinal dynamics modeling on the variable swept wing unmanned aerial vehicle is as follows:
step one, establishing a coordinate system.
In the ground coordinate system, the centroid motion equation of each rigid body is

m_i · (dV_i/dt) = F_i,  i = 0, 1, 2    (1)

where m_i denotes the mass of the i-th rigid body, V_i denotes the velocity vector of the centroid of the i-th rigid body, F_i denotes the resultant force vector acting on the i-th rigid body and t denotes time; i = 0 denotes the fuselage, i = 1 the left swept wing and i = 2 the right swept wing.
In addition, this embodiment takes the fuselage centroid as the origin O_b of the body coordinate system (the X-axis points forward, the Y-axis upward and the Z-axis to the right). The position vector from O_b to the centroid of each deformable wing is defined, and the body coordinate system is kept unchanged during the wing sweep change. The deviation of this position vector is then defined as shown in formula (2).
Step two, deriving the equations of motion based on translational dynamics.
The equations of motion of the fuselage and of each moving wing are written separately as formula (3), in which F_10 and F_20 denote the internal forces exerted on the fuselage by the left and right swept wings respectively, F_01 and F_02 denote the reaction forces of the fuselage on the left and right swept wings, G_i denotes the weight of each part, A_i denotes the aerodynamic force on the i-th part, T denotes the engine thrust, and m_i denotes the mass of the i-th rigid body, with i = 0 the fuselage, i = 1 the left swept wing and i = 2 the right swept wing.
Adding the equations of motion of the fuselage and the wings in formula (3) and cancelling the internal forces gives

m · (dV/dt) = A + G + T + F_S    (4)

where A denotes the aerodynamic force on the variable-sweep-wing unmanned aerial vehicle as a whole, G denotes the overall weight, T denotes the engine thrust and F_S denotes the additional force caused by wing deformation, which is given by formula (5); F_S,i denotes the additional force produced during the deformation of the i-th wing.
Expanding the expression of the additional force gives formula (6), whose three components in the body coordinate system are written in terms of the accelerations, rotational speeds, rotational distances, angular velocities and angular accelerations in the X, Y and Z directions.
In addition, considering that the swept wing has a certain influence on the longitudinal stability of the whole aircraft, and assuming that the centroid of the swept wing lies in the symmetry plane of the body coordinate system (only longitudinal motion is considered), the additional-force expression can be simplified as shown in formula (7). Its three components are written in the body coordinate system in terms of the acceleration of the left swept wing, the rotation speed of the left swept wing, the rotation radius of the left swept wing, and the angular velocity and angular acceleration of the circular motion of the deformable-wing centroid around the origin O_b of the body coordinate system.
Step three, deriving the equations of motion based on rotational dynamics.
The moment of momentum of a wing about the origin O_b of the body coordinate system is

H = ∫ r × V dm    (8)

where r denotes the position (radius) vector of a mass element on the wing and dm denotes that mass element.
Differentiating formula (8) with respect to time gives formula (9), in which M_i denotes the external moment applied to the i-th wing and the remaining term involves the rate of change of the radius vector.
The rotation equations of the fuselage of the variable-sweep-wing unmanned aerial vehicle and of the left and right swept wings about the centroid are given by formula (10), which involves the aerodynamic moment acting on the body, the turning moments of the left and right swept wings acting on the body, the body motion distance, the moments of the body acting on the left and right swept wings, the moments of gravity acting on the left and right swept wings, and the aerodynamic moments acting on the left and right swept wings.
Adding the rotation equations of the three parts (the body and the left and right swept wings) about the centroid yields formula (11), in which M_A denotes the total aerodynamic moment and M_G denotes the sum of the gravity moments of the left and right swept wings about O_b.
Step four, longitudinal dynamics modeling of the variable-sweep-wing unmanned aerial vehicle.
According to the symmetry of the shape and centroid distribution of the variable-sweep-wing unmanned aerial vehicle, when the forces are balanced the external forces and moments are in equilibrium, and the longitudinal dynamics equation of the variable-sweep-wing unmanned aerial vehicle is given by formula (12), in which I_i denotes the longitudinal moment of inertia of the i-th rigid body, D denotes the drag and L the lift, with

D = C_D · q · S,  L = C_L · q · S,  q = ½ · ρ · V²

where C_D denotes the drag coefficient, C_L the lift coefficient, C_T the thrust coefficient, q the dynamic pressure, θ the angle with the Y-axis, α the angle of attack, ρ the air density, S the maximum cross-sectional area of the variable-sweep-wing unmanned aerial vehicle, m the overall weight and ω_y the angular velocity about the Y direction. Formula (12) also involves the moment of the gravity component in the longitudinal (sagittal) plane multiplied by its moment arm, the drag moment, and the gravitational acceleration g.
The forces and moments in formula (12) are expressed by formula (13), in which a denotes the motion acceleration of the unmanned aerial vehicle, ω_w denotes the rotation angular velocity of the swept wing, and dI/dt denotes the derivative of the moment of inertia.
Step five, establishing the sweep-angle model.
A position vector from the origin of the body coordinate system of the variable-sweep-wing unmanned aerial vehicle to the centroid of the wing is established as a function of the sweep angle, as given by formula (14), where the constants in formula (14) are values related to the shape of the aircraft and χ denotes the sweep angle.
According to the above equations and the aerodynamic parameters, the wing-sweep control method combines the reinforcement learning MADDPG algorithm with control technology. For each flight state (such as cruise, maneuver and dive), the policy output for the environment can be governed by a corresponding neural network through self-organized learning, and the optimal value is found through repeated iterative learning, so that multiple agents can cope with high-dimensional, dynamic environments and select the corresponding optimal sweep angle in different flight phases.
Based on the longitudinal dynamics model of the variable-sweep-wing unmanned aerial vehicle, this embodiment can realize variable-sweep-wing control with a reinforcement learning algorithm. The specific process is as follows:
step one, obtaining a control model with a variable sweepback angle and the flying speed of the unmanned aerial vehicle: because the influence of the wing deformation process on the attack angle is small, namely, the wing is changed into the longitudinal dynamics equation of the sweepback unmanned plane The method can be omitted, and the longitudinal dynamics equation is used for carrying out input and output linearization, so that the speed control structure is obtained as follows:
(15)
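Equation (15) is given as an image in the original. A minimal sketch of one possible input-output-linearizing speed controller, built on the point-mass speed dynamics above with an illustrative gain k_V and speed command V_ref (both assumptions, not taken from the patent), is:

$$
\dot V = \frac{1}{m}\left(T\cos\alpha - D - G\sin\theta\right),\qquad
T = \frac{m\,u + D + G\sin\theta}{\cos\alpha},\qquad
u = -k_V\left(V - V_{\mathrm{ref}}\right),
$$

which reduces the closed-loop speed dynamics to $\dot V = -k_V\,(V - V_{\mathrm{ref}})$.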
Step two, obtaining the model of the control voltage for the variable sweep angle: the dynamics of the wing deformation process toward the commanded sweep angle are assumed to be a simple nonlinear differential equation; the action set describes the behavior of the agent, and the return is the accumulation of the reward function over time steps. The current wing sweepback angle and the real-time flight state of the unmanned aerial vehicle can be accurately obtained by the structural sensor of the variable-sweepback-wing actuating mechanism and the navigation attitude detection module.
Step 201: according to the proportional relation between the wing sweepback angle and its corresponding control voltage (with a proportionality coefficient relating sweep angle and control voltage), take the control voltage as the action set and establish the return function R:
(16)
Wherein: the symbols denote the discount factor, the reward function at each time step, and the maximum number of iterations, respectively. The return function R is established with a target-network method: a policy network and an evaluation network must be built, together with two neural-network copies serving as their respective target networks. The algorithm formula is as follows:
(17)
Wherein: the symbol denotes the soft-update step-size parameter. The remaining symbols denote the current state of the unmanned aerial vehicle (including flight speed, flight altitude and flight attitude parameters, etc.), the sweepback angle in the policy network, the action, the sweepback angle in the evaluation network, the sweepback angle in the target policy network, and the sweepback angle in the target evaluation network.
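As a minimal sketch of the soft target-network update of equation (17) — each target parameter is moved a small step τ toward its online counterpart; function and variable names are illustrative:

```python
def soft_update(target_params, online_params, tau=0.01):
    """theta_target <- tau * theta_online + (1 - tau) * theta_target  (cf. equation (17))."""
    return [tau * p + (1.0 - tau) * p_t for p, p_t in zip(online_params, target_params)]


# Toy usage: the target parameters drift slowly toward the online parameters,
# which keeps the bootstrapped targets stable during training.
online = [0.8, -0.3, 1.2]   # current policy/evaluation network weights (illustrative values)
target = [0.0, 0.0, 0.0]    # target-network copies
target = soft_update(target, online, tau=0.01)
```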
Step 202: according to the current state of the unmanned aerial vehicle, namely the speed vector of the unmanned aerial vehicle at the current moment, decide an action and establish its reward value:
(18)
Wherein: a threshold is set on the absolute value of the difference between the current sweepback angle and the preset sweepback angle; the sweepback reward factor provides a sparse reward: when the absolute difference lies within the threshold, the agent is given a positive sparse reward and the factor takes the value 1, otherwise it takes the value 0.
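A minimal sketch of the sparse sweep-angle reward of equation (18) follows; the function name, the reward magnitude and the way the threshold is supplied are assumptions for illustration, and the patent's full return additionally discounts and sums these step rewards as in equation (16):

```python
def sweep_reward(sweep_angle_deg, preset_sweep_deg, threshold_deg, sparse_bonus=1.0):
    """Return a positive sparse reward only when the current sweepback angle lies within
    `threshold_deg` of the preset sweepback angle; otherwise return 0 (cf. equation (18))."""
    within = abs(sweep_angle_deg - preset_sweep_deg) <= threshold_deg
    return sparse_bonus if within else 0.0


# Toy usage: a 2-degree tolerance around a 35-degree preset sweepback angle.
print(sweep_reward(34.2, 35.0, 2.0))   # 1.0 (within the threshold)
print(sweep_reward(20.0, 35.0, 2.0))   # 0.0 (outside the threshold)
```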
Thirdly, according to the wing sweepback angle, the control strategy for the variable swept-back wing of the unmanned aerial vehicle is acquired using the Actor-Critic learning framework in the MADDPG algorithm: the optimal deformation strategy is output for autonomous control, keeping the unmanned aerial vehicle at its optimal aerodynamic characteristics over the whole flight envelope. In addition, in order to improve the learning speed and learning efficiency of the algorithm, the Actor-Critic learning framework is used to realize an off-policy (abnormal strategy) learning mode, and the algorithm comprises the following steps:
Step 301: constructing a normal distribution N (b, c) with a mean value of b and a variance of c;
establishing a variance c:
(19)
Wherein: the symbol denotes the number of training rounds.
Step 302: establishing a threshold:
(20)
Step 303: to ensure adequate exploration by the algorithm, the cumulative reward is optimized by gradient computation; the action policy is a stochastic policy, the evaluation policy is set as a deterministic policy, and the action formula of the deterministic policy is as follows:
(21)
In the deterministic strategy, the gradient of the cumulative reward, taken as the objective function, with respect to the policy parameters is:
(22)
Wherein: E represents the expected value. The policy parameters are adjusted along the direction in which the policy gradient of the objective function increases. In this method, by acquiring the cooperative control strategy for the variable sweep angle of the unmanned aerial vehicle with the Actor-Critic learning framework according to the wing sweepback angle, the variable sweep angle of the unmanned aerial vehicle can be optimized for different tasks.
The remaining symbols denote the sweep-angle policy gradient, the policy of the network, the state at time t, and the action gradient, respectively.
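A minimal PyTorch-style sketch of the deterministic policy-gradient step of equation (22): backpropagating the critic's Q value through to the actor parameters realizes the chain-rule product of the action gradient of Q and the parameter gradient of the policy. The actor/critic interfaces, names and optimizer are assumptions for illustration, not the patent's implementation; the critic is assumed to be a module whose forward pass takes (states, actions).

```python
import torch


def actor_update(actor, critic, actor_optimizer, states):
    """One deterministic policy-gradient step: adjust the actor parameters in the
    direction that increases Q(s, mu(s)), i.e. minimize -E[Q(s, mu(s))] (cf. equation (22))."""
    actions = actor(states)                       # a = mu_theta(s), kept differentiable
    actor_loss = -critic(states, actions).mean()  # negative expected action value
    actor_optimizer.zero_grad()
    actor_loss.backward()                         # chain rule: grad_a Q * grad_theta mu
    actor_optimizer.step()
```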
The MADDPG algorithm is a series of improvements built on the Actor-Critic (policy-evaluation) framework and DDPG (deep deterministic policy gradient). It aggregates several different sub-policies; each agent is trained in a way similar to DDPG, but differs in the input of the evaluation network Critic. When an individual agent is trained, its Critic input must receive data from the other agents in addition to its own state-action data. Because the inputs of the agents differ when several agents are trained, no agent sees a stationary environment from its own perspective, and each agent may adopt a different, continuously changing strategy; nevertheless, all agents update their own networks in the same manner, and the remaining update steps are identical. The central evaluation network Critic of the MADDPG algorithm collects the sub-policies of agent 1 and agent 2 to form the Critic Net, and can also learn using the policy information of the other agents.
The MADDPG algorithm training process is specifically expressed as follows (a minimal code sketch of this loop is given after the list):
(1) Randomly initialize the parameters of the evaluation network and the parameters of the policy network.
(2) Copy the parameters of the current networks to the corresponding target-network parameters.
(3) Initialize the experience-replay buffer.
(4) for episode = 1 : MaxEpisode (loop from episode 1 to the maximum number of episodes).
(5) Initialize the observed state.
(6) for each time step, up to the maximum number of iterations.
(7) Select an action according to the current policy.
(8) Execute the action, thereby obtaining the reward and the new state.
(9) Store the resulting transition in the replay buffer as a data set for training the current networks.
(10) Update the threshold and the variance c (both change with the number of training rounds).
(11) Randomly sample a group of M transitions from the buffer as mini-batch training data for the current policy network and the current evaluation network; each element of the mini-batch is one training transition.
(12) Update the current network parameters of the evaluation network Critic by gradient back-propagation of the neural network.
(13) Update the current network parameters of the policy network Actor by gradient back-propagation of the neural network, using the policy gradient of the selected samples; then update the Critic target-network and Actor target-network parameters by the soft update of equation (17).
(14) end for (end of the time-step loop).
(15) end for (end of the episode loop).
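The pseudocode above maps onto a compact implementation. The following PyTorch sketch is illustrative only: the environment interface (reset/step), network sizes, hyper-parameter values and all names are assumptions rather than the patent's implementation, and exploration noise, the variance schedule of equations (19)-(20) and the sweep-specific reward shaping are omitted for brevity.

```python
import random
from collections import deque

import torch
import torch.nn as nn


def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))


class MADDPGAgent:
    """One agent: decentralized actor, centralized critic plus target copies (steps (1)-(2))."""
    def __init__(self, obs_dim, act_dim, n_agents, lr=1e-3):
        joint_dim = n_agents * (obs_dim + act_dim)          # the critic sees all observations and actions
        self.actor, self.critic = mlp(obs_dim, act_dim), mlp(joint_dim, 1)
        self.actor_t, self.critic_t = mlp(obs_dim, act_dim), mlp(joint_dim, 1)
        self.actor_t.load_state_dict(self.actor.state_dict())
        self.critic_t.load_state_dict(self.critic.state_dict())
        self.a_opt = torch.optim.Adam(self.actor.parameters(), lr)
        self.c_opt = torch.optim.Adam(self.critic.parameters(), lr)


def soft_update(target, online, tau=0.01):                  # equation (17)
    for p_t, p in zip(target.parameters(), online.parameters()):
        p_t.data.mul_(1.0 - tau).add_(tau * p.data)


def maddpg_update(agents, batch, gamma=0.99, tau=0.01):
    """Steps (11)-(13): mini-batch update of every agent's critic and actor."""
    obs, acts, rews, next_obs = (torch.as_tensor(x, dtype=torch.float32) for x in batch)
    # expected shapes: obs/next_obs [B, n, obs_dim], acts [B, n, act_dim], rews [B, n]
    n = len(agents)
    with torch.no_grad():
        next_acts = [ag.actor_t(next_obs[:, i]) for i, ag in enumerate(agents)]
        joint_next = torch.cat([next_obs.flatten(1)] + next_acts, dim=1)
    joint = torch.cat([obs.flatten(1), acts.flatten(1)], dim=1)
    for i, ag in enumerate(agents):
        # step (12): regress the critic toward the bootstrapped target y = r + gamma * Q'(s', a')
        with torch.no_grad():
            y = rews[:, i:i + 1] + gamma * ag.critic_t(joint_next)
        critic_loss = ((ag.critic(joint) - y) ** 2).mean()
        ag.c_opt.zero_grad(); critic_loss.backward(); ag.c_opt.step()
        # step (13): deterministic policy gradient through the centralized critic
        own_acts = [acts[:, j] if j != i else ag.actor(obs[:, j]) for j in range(n)]
        actor_loss = -ag.critic(torch.cat([obs.flatten(1)] + own_acts, dim=1)).mean()
        ag.a_opt.zero_grad(); actor_loss.backward(); ag.a_opt.step()
        soft_update(ag.actor_t, ag.actor, tau)               # target-network soft updates
        soft_update(ag.critic_t, ag.critic, tau)


def train(env, agents, episodes=500, steps=200, batch_size=64):
    buffer = deque(maxlen=100_000)                           # step (3): replay buffer
    for _ in range(episodes):                                # step (4)
        obs = env.reset()                                    # step (5): per-agent observations
        for _ in range(steps):                               # step (6)
            with torch.no_grad():                            # step (7): act with the current policies
                acts = [ag.actor(torch.as_tensor(o, dtype=torch.float32)).numpy()
                        for ag, o in zip(agents, obs)]
            next_obs, rews, done = env.step(acts)            # step (8)
            buffer.append((obs, acts, rews, next_obs))       # step (9)
            obs = next_obs
            if len(buffer) >= batch_size:                    # step (11): sample a mini-batch
                sample = random.sample(buffer, batch_size)
                batch = tuple(map(list, zip(*sample)))       # regroup into obs/acts/rews/next_obs
                maddpg_update(agents, batch)
            if done:
                break
```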
Referring to fig. 7, fig. 7 is a schematic diagram of a reinforcement-learning control framework according to an embodiment of the application. In the reinforcement-learning process, a sample database can be obtained by calculating the thrust cost of the actuator; an iterative learning loop is run on this sample database to obtain an action-value function, and the policy controller corresponding to that action-value function executes the corresponding morphing action. The control system of the variable swept-back wing unmanned aerial vehicle inputs the error with respect to the planned flight path into the flight controller, and the flight controller drives the actuator to execute the morphing action and output the sweepback angle. The flight controller in the control system controls the dynamics affected by the morphing and outputs the sweepback angle based on the dynamics model and the adaptive control law.
Referring to fig. 8, fig. 8 is a schematic diagram of a sweepback angle decision model based on a madppg algorithm according to an embodiment of the present application.
The sweepback angle decision model comprises an MADDPG agent 1, an MADDPG agent 2 and a central evaluation network Critic Net.
MADDPG agent 1 is an Actor-Critic algorithm model; its input parameters are the initial flight state parameter, the action, the reward value r, and the new flight state parameter.
The MADDPG agent 1 comprises a Policy neural network (serving as the Main network, Main Net), a Value neural network (serving as the Target network, Target Net), a Policy Gradient module (Policy Gradient) and a loss-function calculation part. The Policy neural network comprises a policy network Actor Net and an evaluation network Critic Net; the Actor Net passes its action to the Critic Net, and the Critic Net passes the Q value of that action to the Policy Gradient module so as to update the parameters of the policy network Actor Net. The Critic Net also passes the Q value of the i-th agent's action to the loss-function calculation module Loss Function, whose loss is computed from that Q value. The Value neural network likewise comprises a policy network Actor Net and an evaluation network Critic Net; its Actor Net passes the next action to its Critic Net, and that Critic Net passes the Q value of the next action to the Loss Function module so as to update the parameters of the evaluation network Critic Net. The Q value is used to describe the value of an action node. The structure of MADDPG agent 2 is the same as that of MADDPG agent 1 and will not be described here again.
There is interaction between the policy network Actor Net of MADDPG agent 1 and the evaluation network Critic Net of MADDPG agent 2, and between the policy network Actor Net of MADDPG agent 2 and the evaluation network Critic Net of MADDPG agent 1.
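The loss function of the evaluation network referred to above is shown only as an image in the original. In DDPG-style actor-critic methods it is typically the mean-squared error between the Main-network Q value and the target Q value produced with the Target Net; under that assumption it reads:

$$
y_i = r_i + \gamma\, Q'\!\bigl(s_{i+1}, \mu'(s_{i+1})\bigr), \qquad
L(\theta_Q) = \mathbb{E}\Bigl[\bigl(Q(s_i, a_i; \theta_Q) - y_i\bigr)^{2}\Bigr],
$$

where Q' and μ' denote the target evaluation and target policy networks and γ is the discount factor of equation (16).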
Referring to fig. 9, fig. 9 is a schematic structural diagram of a control system of a swept-back unmanned aerial vehicle according to an embodiment of the present application, including:
the kinematic modeling module 901 is used for building an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the swept-back unmanned aerial vehicle; the unmanned plane kinematic model is used for describing the influence of a sweepback angle on the flight speed, the flight height and the flight attitude;
the parameter processing module 902 is configured to obtain an initial flight state parameter of the swept-back wing changing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and input the initial flight state parameter into a swept-back angle decision model to obtain a target action, a reward value and a new flight state parameter; the sweepback angle decision model comprises a strategy network and a judgment network;
a data set construction module 903 for constructing a data set comprising the initial flight status parameter, the target action, the reward value and the new flight status parameter;
A training module 904 for training the sweepback decision model with the dataset to update network parameters of the policy network and the evaluation network;
the wing control module 905 is configured to obtain a current flight state parameter when the variable sweep wing unmanned aerial vehicle executes the flight task if the flight task is received, input the current flight state parameter into the sweep angle decision model to obtain a target sweep angle, and adjust the wing of the variable sweep wing unmanned aerial vehicle according to the target sweep angle.
According to the method, an unmanned aerial vehicle kinematic model is built according to structural parameters and aerodynamic parameters of the variable sweep unmanned aerial vehicle, and then initial flight state parameters are obtained based on the unmanned aerial vehicle kinematic model. The application also introduces a sweepback angle decision model comprising a strategy network and a judging network, and utilizes the sweepback angle decision model to process the initial flight state parameters to obtain target actions, rewarding values and new flight state parameters so as to train the sweepback angle decision model by utilizing a data set comprising the initial flight state parameters, the target actions, the rewarding values and the new flight state parameters, thereby enabling the sweepback angle decision model to automatically adjust the sweepback angle corresponding to the wing according to the flight state of the sweepback wing unmanned plane. The application can automatically adjust the sweepback angle of the sweepback wing-changing unmanned aerial vehicle according to the external environment, and improves the task execution efficiency and the flight performance.
Further, the process of the training module 904 updating the network parameters of the policy network and the evaluation network includes: and updating the current network parameters of the strategy network and the evaluation network through a gradient back propagation algorithm of the neural network.
Further, the process of inputting the initial flight state parameter into the sweepback angle decision model by the parameter processing module 902 to obtain the target action, the reward value and the new flight state parameter includes: inputting the initial flight state parameters into a strategy network of the sweepback angle decision model so that the strategy network selects the target action according to the current strategy; and calculating the reward value and the new flight state parameter corresponding to the target action by utilizing a judgment network of the sweepback angle decision model.
Further, the method further comprises the following steps:
the action set construction module is used for constructing an action set for the strategy network; the action set is used for describing the corresponding relation between the target action selected by the strategy network and the sweepback angle, the target action selected by the strategy network is used as control voltage, and the control voltage is used for adjusting the sweepback angle of the wing of the variable sweepback unmanned plane.
Further, the training module 904 training the sweepback decision model using the data set includes: and training the sweepback angle decision model by utilizing the data set according to an abnormal strategy learning mode.
Further, the training module 904 trains the sweepback angle decision model using the data set to update network parameters of the policy network and the evaluation network includes: selecting a set of parameter quaternions from the dataset; the parameter quadruple comprises initial flight state parameters, and target actions, rewarding values and new flight state parameters corresponding to the same initial flight state parameters; training the sweepback angle decision model by using the selected parameter quadruple so as to update network parameters of the strategy network and the evaluation network; judging whether the model total rewards of the sweepback angle decision model accord with iteration termination conditions or not; if not, the step of selecting a set of parameter quaternion from the data set is entered.
Further, the method further comprises the following steps:
and the initialization module is used for randomly initializing network parameters of the evaluation network and the strategy network in the sweepback angle decision model before the initial flight state parameters are input into the sweepback angle decision model.
Since the embodiments of the system portion and the embodiments of the method portion correspond to each other, the embodiments of the system portion refer to the description of the embodiments of the method portion, which is not repeated herein.
The present application also provides a storage medium having stored thereon a computer program which, when executed, performs the steps provided by the above embodiments. The storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The application also provides an unmanned aerial vehicle, which comprises a controller, a wing sweepback turning unit and a wing;
the controller is used for acquiring current flight state parameters when the variable sweep wing unmanned aerial vehicle executes the flight task, inputting the current flight state parameters into the sweep angle decision model to obtain a target sweep angle, and controlling the wing variable sweep rotation unit to adjust the wing according to the target sweep angle;
the training process of the sweepback angle decision model comprises the following steps:
establishing an unmanned aerial vehicle kinematic model according to the structural parameters and the aerodynamic parameters of the variable sweep wing unmanned aerial vehicle; the unmanned plane kinematic model is used for describing the influence of a sweepback angle on the flight speed, the flight height and the flight attitude;
Acquiring initial flight state parameters of the variable sweep wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into the sweep angle decision model to obtain target actions, rewarding values and new flight state parameters; the sweepback angle decision model comprises a strategy network and a judgment network;
constructing a data set containing the initial flight status parameter, the target action, the reward value, and the new flight status parameter;
training the sweepback decision model with the dataset to update network parameters of the policy network and the evaluation network.
The unmanned aerial vehicle further comprises a height detection module, a speed detection module and a voyage detection module which are respectively connected with the controller, wherein the controller is used for determining current flight state parameters when the variable sweep wing unmanned aerial vehicle executes the flight task according to detection data uploaded by the height detection module, the speed detection module and the voyage detection module.
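As an illustration of how the controller can assemble the current flight-state parameters from these detection modules and query the trained sweepback-angle decision model, the following sketch is given; all module interfaces and names are assumptions, not the patent's API, and actor_net is assumed to be any callable mapping a state vector to a sweepback angle.

```python
def decide_target_sweep(actor_net, altitude_module, speed_module, voyage_module):
    """Build the current flight-state vector from the detection modules and ask the
    trained policy (Actor) network for the target sweepback angle."""
    state = [
        float(altitude_module.read()),   # flight altitude
        float(speed_module.read()),      # flight speed
        float(voyage_module.read()),     # flown range (voyage)
    ]
    return float(actor_net(state))       # target sweepback angle commanded to the wing unit
```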
In the description, each embodiment is described in a progressive manner, and each embodiment is mainly described by the differences from other embodiments, so that the same similar parts among the embodiments are mutually referred. For the system disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Claims (10)
1. A control method of a variable swept-back wing unmanned aerial vehicle, characterized by comprising the following steps:
establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable swept-back unmanned aerial vehicle; the unmanned plane kinematic model is used for describing the influence of a sweepback angle on the flight speed, the flight height and the flight attitude;
Acquiring initial flight state parameters of the variable sweepback wing unmanned aerial vehicle by utilizing the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into a sweepback angle decision model to obtain target actions, rewarding values and new flight state parameters; the sweepback angle decision model comprises a strategy network and a judgment network; the rewarding value is calculated by using a rewarding function, and the rewarding function is established by adopting a target network method;
constructing a data set containing the initial flight status parameter, the target action, the reward value, and the new flight status parameter;
training the sweepback angle decision model with the dataset so as to update network parameters of the strategy network and the evaluation network;
if a flight task is received, acquiring current flight state parameters of the variable swept-back wing unmanned aerial vehicle when the variable swept-back wing unmanned aerial vehicle executes the flight task, inputting the current flight state parameters into the swept-back angle decision model to obtain a target swept-back angle, and adjusting the wing of the variable swept-back wing unmanned aerial vehicle according to the target swept-back angle;
wherein establishing the unmanned aerial vehicle kinematic model according to the structural parameters and aerodynamic parameters of the variable swept-back wing unmanned aerial vehicle comprises:
Generating a mass center motion equation, a translational dynamics equation, a rotational dynamics equation, a longitudinal dynamics equation and a sweepback angle state equation of the variable sweepback unmanned aerial vehicle according to the structural parameters and the aerodynamic parameters of the variable sweepback unmanned aerial vehicle, and establishing a unmanned aerial vehicle kinematic model according to the mass center motion equation, the translational dynamics equation, the rotational dynamics equation, the longitudinal dynamics equation and the sweepback angle state equation;
wherein the longitudinal dynamics equation is:
;
wherein the symbols denote, respectively: the engine thrust; the angle of attack; the drag; the gravity of the whole variable swept-back wing unmanned aerial vehicle; the angle with the Y-axis; the X-direction and Z-direction components, in the body coordinate system, of the additional force arising during wing deformation; m1, the mass of the left swept-back wing; v0, the movement speed of the fuselage; t, time; the lift; the longitudinal moment of inertia of the left swept-back wing; the longitudinal moment of inertia of the right swept-back wing; the angular velocity in the Y direction; the force of gravity in the sagittal plane multiplied by its moment arm; and the moment of the drag;
wherein prior to inputting the initial flight state parameters into the sweepback decision model, further comprising:
Randomly initializing network parameters of the evaluation network and the strategy network in the sweepback angle decision model, and copying the network parameters of the evaluation network and the strategy network to corresponding target networks;
the algorithm formula of the target network corresponding to the evaluation network and the strategy network is as follows:
; wherein the symbols denote, respectively: the sweepback angle of the policy network of the target network; the sweepback angle of the policy network in the sweepback angle decision model; the sweepback angle of the evaluation network of the target network; the sweepback angle of the evaluation network in the sweepback angle decision model; and the soft update step size parameter.
2. The method of claim 1, wherein updating network parameters of the policy network and the evaluation network comprises:
and updating the current network parameters of the strategy network and the evaluation network through a gradient back propagation algorithm of the neural network.
3. The method of claim 1, wherein inputting the initial flight state parameters into a sweepback angle decision model to obtain target actions, reward values, and new flight state parameters comprises:
Inputting the initial flight state parameters into a strategy network of the sweepback angle decision model so that the strategy network selects the target action according to the current strategy;
and calculating the reward value and the new flight state parameter corresponding to the target action by utilizing a judgment network of the sweepback angle decision model.
4. The method of controlling a swept-back unmanned aerial vehicle of claim 3, further comprising:
constructing an action set for the strategy network; the action set is used for describing the corresponding relation between the target action selected by the strategy network and the sweepback angle, the target action selected by the strategy network is used as control voltage, and the control voltage is used for adjusting the sweepback angle of the wing of the variable sweepback unmanned plane.
5. The method of claim 1, wherein training the swept angle decision model using the dataset comprises:
and training the sweepback angle decision model by utilizing the data set according to an abnormal strategy learning mode.
6. The method of claim 1, wherein training the swept angle decision model with the dataset to update network parameters of the policy network and the evaluation network comprises:
Selecting a set of parameter quaternions from the dataset; the parameter quadruple comprises initial flight state parameters, and target actions, rewarding values and new flight state parameters corresponding to the same initial flight state parameters;
training the sweepback angle decision model by using the selected parameter quadruple so as to update network parameters of the strategy network and the evaluation network;
judging whether the model total rewards of the sweepback angle decision model accord with iteration termination conditions or not;
if not, the step of selecting a set of parameter quaternion from the data set is entered.
7. A control system for a swept-back unmanned aerial vehicle, comprising:
the kinematic modeling module is used for establishing an unmanned aerial vehicle kinematic model according to structural parameters and aerodynamic parameters of the variable sweep unmanned aerial vehicle; the unmanned plane kinematic model is used for describing the influence of a sweepback angle on the flight speed, the flight height and the flight attitude;
the parameter processing module is used for acquiring initial flight state parameters of the variable sweep wing unmanned aerial vehicle by utilizing the unmanned aerial vehicle kinematics model, inputting the initial flight state parameters into a sweep angle decision model, and obtaining target actions, rewarding values and new flight state parameters; the sweepback angle decision model comprises a strategy network and a judgment network; the rewarding value is calculated by using a rewarding function, and the rewarding function is established by adopting a target network method;
A data set construction module for constructing a data set comprising the initial flight status parameter, the target action, the reward value and the new flight status parameter;
the training module is used for training the sweepback angle decision model by utilizing the data set so as to update network parameters of the strategy network and the evaluation network;
the wing control module is used for acquiring current flight state parameters when the variable sweep wing unmanned aerial vehicle executes the flight task if the flight task is received, inputting the current flight state parameters into the sweep angle decision model to obtain a target sweep angle, and adjusting the wing of the variable sweep wing unmanned aerial vehicle according to the target sweep angle;
the kinematic modeling module is specifically configured to: generating a mass center motion equation, a translational dynamics equation, a rotational dynamics equation, a longitudinal dynamics equation and a sweepback angle state equation of the variable sweepback unmanned aerial vehicle according to the structural parameters and the aerodynamic parameters of the variable sweepback unmanned aerial vehicle, and establishing a unmanned aerial vehicle kinematic model according to the mass center motion equation, the translational dynamics equation, the rotational dynamics equation, the longitudinal dynamics equation and the sweepback angle state equation;
Wherein the longitudinal dynamics equation is:
;
wherein the symbols denote, respectively: the engine thrust; the angle of attack; the drag; the gravity of the whole variable swept-back wing unmanned aerial vehicle; the angle with the Y-axis; the X-direction and Z-direction components, in the body coordinate system, of the additional force arising during wing deformation; m1, the mass of the left swept-back wing; v0, the movement speed of the fuselage; t, time; the lift; the longitudinal moment of inertia of the left swept-back wing; the longitudinal moment of inertia of the right swept-back wing; the angular velocity in the Y direction; the force of gravity in the sagittal plane multiplied by its moment arm; and the moment of the drag;
wherein the control system further comprises:
the initialization module is used for randomly initializing network parameters of the evaluation network and the strategy network in the sweepback angle decision model;
the control system is further used for copying network parameters of the evaluation network and the strategy network to corresponding target networks;
the algorithm formula of the target network corresponding to the evaluation network and the strategy network is as follows:
; wherein the symbols denote, respectively: the sweepback angle of the policy network of the target network; the sweepback angle of the policy network in the sweepback angle decision model; the sweepback angle of the evaluation network of the target network; the sweepback angle of the evaluation network in the sweepback angle decision model; and the soft update step size parameter.
8. An unmanned aerial vehicle, characterized by comprising a controller, a wing sweepback turning unit and a wing;
the controller is used for acquiring current flight state parameters when the variable sweep wing unmanned aerial vehicle executes a flight task, inputting the current flight state parameters into a sweep angle decision model to obtain a target sweep angle, and controlling the wing variable sweep rotating unit to adjust the wing according to the target sweep angle;
the training process of the sweepback angle decision model comprises the following steps:
establishing an unmanned aerial vehicle kinematic model according to the structural parameters and the aerodynamic parameters of the variable sweep wing unmanned aerial vehicle; the unmanned plane kinematic model is used for describing the influence of a sweepback angle on the flight speed, the flight height and the flight attitude;
acquiring initial flight state parameters of the variable sweep wing unmanned aerial vehicle by using the unmanned aerial vehicle kinematic model, and inputting the initial flight state parameters into the sweep angle decision model to obtain target actions, rewarding values and new flight state parameters; the sweepback angle decision model comprises a strategy network and a judgment network; the rewarding value is calculated by using a rewarding function, and the rewarding function is established by adopting a target network method;
Constructing a data set containing the initial flight status parameter, the target action, the reward value, and the new flight status parameter;
training the sweepback angle decision model with the dataset so as to update network parameters of the strategy network and the evaluation network;
wherein establishing the unmanned aerial vehicle kinematic model according to the structural parameters and aerodynamic parameters of the variable swept-back wing unmanned aerial vehicle comprises:
generating a mass center motion equation, a translational dynamics equation, a rotational dynamics equation, a longitudinal dynamics equation and a sweepback angle state equation of the variable sweepback unmanned aerial vehicle according to the structural parameters and the aerodynamic parameters of the variable sweepback unmanned aerial vehicle, and establishing a unmanned aerial vehicle kinematic model according to the mass center motion equation, the translational dynamics equation, the rotational dynamics equation, the longitudinal dynamics equation and the sweepback angle state equation;
wherein the longitudinal dynamics equation is:
;
wherein the symbols denote, respectively: the engine thrust; the angle of attack; the drag; the gravity of the whole variable swept-back wing unmanned aerial vehicle; the angle with the Y-axis; the X-direction and Z-direction components, in the body coordinate system, of the additional force arising during wing deformation; m1, the mass of the left swept-back wing; v0, the movement speed of the fuselage; t, time; the lift; the longitudinal moment of inertia of the left swept-back wing; the longitudinal moment of inertia of the right swept-back wing; the angular velocity in the Y direction; the force of gravity in the sagittal plane multiplied by its moment arm; and the moment of the drag;
wherein prior to inputting the initial flight state parameters into the sweepback decision model, further comprising:
randomly initializing network parameters of the evaluation network and the strategy network in the sweepback angle decision model, and copying the network parameters of the evaluation network and the strategy network to corresponding target networks;
the algorithm formula of the target network corresponding to the evaluation network and the strategy network is as follows:
; wherein the symbols denote, respectively: the sweepback angle of the policy network of the target network; the sweepback angle of the policy network in the sweepback angle decision model; the sweepback angle of the evaluation network of the target network; the sweepback angle of the evaluation network in the sweepback angle decision model; and the soft update step size parameter.
9. The unmanned aerial vehicle of claim 8, further comprising a altitude detection module, a speed detection module, and a voyage detection module respectively coupled to the controller, wherein the controller is configured to determine a current flight state parameter of the swept-back wing unmanned aerial vehicle when performing the flight mission based on detection data uploaded by the altitude detection module, the speed detection module, and the voyage detection module.
10. A storage medium having stored therein computer executable instructions which, when loaded and executed by a processor, implement the steps of the method of controlling a swept-back unmanned aerial vehicle according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311003145.2A CN116700020B (en) | 2023-08-10 | 2023-08-10 | Control method and system for unmanned aerial vehicle with variable sweepback wings, unmanned aerial vehicle and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311003145.2A CN116700020B (en) | 2023-08-10 | 2023-08-10 | Control method and system for unmanned aerial vehicle with variable sweepback wings, unmanned aerial vehicle and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116700020A CN116700020A (en) | 2023-09-05 |
CN116700020B true CN116700020B (en) | 2023-11-24 |
Family
ID=87839705
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311003145.2A Active CN116700020B (en) | 2023-08-10 | 2023-08-10 | Control method and system for unmanned aerial vehicle with variable sweepback wings, unmanned aerial vehicle and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116700020B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118410585B (en) * | 2024-06-27 | 2024-09-13 | 西北工业大学 | High-speed aircraft game deformation method based on fuzzy threat judgment |
CN118551484B (en) * | 2024-07-24 | 2024-09-20 | 中国人民解放军军事科学院战争研究院 | Rotor unmanned aerial vehicle simulation model generation method based on cascade neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109358634A (en) * | 2018-11-20 | 2019-02-19 | 南京航空航天大学 | A kind of hypersonic aircraft Robust Adaptive Control method |
JP2021034050A (en) * | 2019-08-21 | 2021-03-01 | 哈爾浜工程大学 | Auv action plan and operation control method based on reinforcement learning |
US11242134B1 (en) * | 2017-05-23 | 2022-02-08 | United States Of America As Represented By The Administrator Of Nasa | Real-time drag optimization control framework |
CN114637312A (en) * | 2022-03-18 | 2022-06-17 | 北京航空航天大学杭州创新研究院 | Unmanned aerial vehicle energy-saving flight control method and system based on intelligent deformation decision |
CN116400729A (en) * | 2023-04-10 | 2023-07-07 | 清华大学深圳国际研究生院 | Method for avoiding missiles by split airplane based on deep reinforcement learning and split airplane |
CN116560384A (en) * | 2023-03-21 | 2023-08-08 | 清华大学深圳国际研究生院 | Variant aircraft robust control method based on deep reinforcement learning |
-
2023
- 2023-08-10 CN CN202311003145.2A patent/CN116700020B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11242134B1 (en) * | 2017-05-23 | 2022-02-08 | United States Of America As Represented By The Administrator Of Nasa | Real-time drag optimization control framework |
CN109358634A (en) * | 2018-11-20 | 2019-02-19 | 南京航空航天大学 | A kind of hypersonic aircraft Robust Adaptive Control method |
JP2021034050A (en) * | 2019-08-21 | 2021-03-01 | 哈爾浜工程大学 | Auv action plan and operation control method based on reinforcement learning |
CN114637312A (en) * | 2022-03-18 | 2022-06-17 | 北京航空航天大学杭州创新研究院 | Unmanned aerial vehicle energy-saving flight control method and system based on intelligent deformation decision |
CN116560384A (en) * | 2023-03-21 | 2023-08-08 | 清华大学深圳国际研究生院 | Variant aircraft robust control method based on deep reinforcement learning |
CN116400729A (en) * | 2023-04-10 | 2023-07-07 | 清华大学深圳国际研究生院 | Method for avoiding missiles by split airplane based on deep reinforcement learning and split airplane |
Non-Patent Citations (3)
Title |
---|
Dynamic response calculation and parameter analysis of variable swept-wing aircraft; Liu Dehua; Flight Dynamics (No. 04); full text *
Research on intelligent re-entry guidance law for morphing aircraft; Li Xun; Wanfang dissertation; full text *
Application of deep reinforcement learning in autonomous shape optimization of morphing aircraft; Wen Nuan, Liu Zhenghua, Zhu Lingpu, Sun Yang; Journal of Astronautics (No. 11); full text *
Also Published As
Publication number | Publication date |
---|---|
CN116700020A (en) | 2023-09-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116700020B (en) | Control method and system for unmanned aerial vehicle with variable sweepback wings, unmanned aerial vehicle and storage medium | |
US11693373B2 (en) | Systems and methods for robust learning-based control during forward and landing flight under uncertain conditions | |
Abas et al. | Parameter identification of an autonomous quadrotor | |
CN108803639A (en) | A kind of quadrotor flight control method based on Backstepping | |
CN102809377A (en) | Aircraft inertia/pneumatic model integrated navigation method | |
Hoff et al. | Trajectory planning for a bat-like flapping wing robot | |
CN112558621A (en) | Decoupling control-based flying mechanical arm system | |
CN109683628B (en) | Spacecraft relative position control method based on finite time distributed speed observer | |
Lai et al. | Adaptive learning-based observer with dynamic inversion for the autonomous flight of an unmanned helicopter | |
EP3396487B1 (en) | Computer-implemented method and system for modelling performance of a fixed-wing aerial vehicle with six degrees of freedom | |
Stingu et al. | Design and implementation of a structured flight controller for a 6DoF quadrotor using quaternions | |
Lee et al. | Design of integrated navigation system using IMU and multiple ranges from in-flight rotating hexacopter system | |
Wierema | Design, implementation and flight test of indoor navigation and control system for a quadrotor UAV | |
Hua | Contributions to the automatic control of aerial vehicles | |
CN114180027A (en) | Control method and controller of morphing aircraft and application of controller | |
CN114779649B (en) | Four-rotor unmanned aerial vehicle suspension load transportation control method | |
Kemper et al. | Impact of center of gravity in quadrotor helicopter controller design | |
Baranek et al. | Model-based attitude estimation for multicopters | |
Müller et al. | Probabilistic velocity estimation for autonomous miniature airships using thermal air flow sensors | |
Kawamura et al. | Hierarchical mixture of experts for autonomous unmanned aerial vehicles utilizing thrust models and acoustics | |
Alsayed | Pitch and Altitude Control of an Unmanned Airship with Sliding Gondola | |
CN113359473B (en) | Microminiature unmanned helicopter nonlinear control method based on iterative learning | |
Grauer et al. | Development of a Sensor Suite for a Flapping-Wing UAV Platform | |
Erol et al. | Improvement of Filter Estimates Based on Data from Unmanned Underwater Vehicle with Machine Learning Insansiz Sualti Aracindan Alinan Verilere Baǧli Filtre Kestirimlerinin Makine Öǧrenmesi ile Iyileştirilmesi | |
Mitikiri | Rigid Body Attitude Estimation Using Inertial Sensors and Low-Level Visual Feedback |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |