
Neural Network Design and Training for Longitudinal Flight Control of a Tilt-Rotor Hybrid Vertical Takeoff and Landing Unmanned Aerial Vehicle

Guillaume Ducard 1,2,* and Gregorio Carughi

1 Laboratoire d’Informatique, Signaux et Systèmes de Sophia Antipolis, Université Côte d’Azur, 06903 Nice, France

2 Institute for Dynamics, Systems and Control (IDSC), ETH Zürich, 8092 Zürich, Switzerland

* Author to whom correspondence should be addressed.

Drones 2024, 8(12), 727; https://doi.org/10.3390/drones8120727

Submission received: 28 September 2024 / Revised: 29 October 2024 / Accepted: 18 November 2024 / Published: 2 December 2024

(This article belongs to the Special Issue AI-Assisted Control Strategies and Their Applications to the Stabilization, Guidance and Navigation of Drones)


Abstract

This paper considers a hybrid vertical take-off and landing (VTOL) unmanned aerial vehicle (UAV). By tilting its propellers, the aircraft can transition from rotary-wing (RW) multirotor mode to fixed-wing (FW) mode and vice versa. A novel architecture of a neural network-based controller (NNC) is presented. An “imitative learning” approach is employed to train the NNC to mimic the response of an expert but computationally expensive model predictive controller (MPC). The resulting NNC approximates the MPC’s solution while significantly decreasing the computational cost. The NNC is trained on the longitudinal axis. Simulations and real flight tests show that the NNC is suitable for the longitudinal axis control of a complex nonlinear system such as the tilt-rotor VTOL UAV through a sequence of transitions from the RW mode to the FW mode, and vice versa, in forward flight.

Keywords:

machine learning; imitative learning; neural network-based control; unified flight control; tilt-rotor VTOL UAV; convertible VTOL UAV; hybrid VTOL UAV; longitudinal axis control

1. Introduction

1.1. Context

In the past few years, convertible vertical takeoff and landing (VTOL) UAVs have become popular because they can switch between a fixed-wing (FW) mode and a rotary-wing (RW) mode typical of multirotor helicopters. They thus combine the advantages of the two flight modes. In the FW mode, the wing-lift forces generated at high forward speeds enable fast and energy-efficient flights. In the RW mode, the vehicle can land and take off vertically, thus removing the need for a runway, and it can also hover and perform agile maneuvers on the spot.

This paper is based on the hybrid tilt-rotor VTOL UAV shown in Figure 1. This aircraft is equipped with tilt mechanisms that can rotate pairs of propellers mounted on the wings. The goal is to keep the fuselage as horizontal as possible during the whole flight, including transitions during which the propellers are rotated from the vertical direction (RW mode) to the horizontal direction (FW mode), and vice versa. Such transition maneuvering is the most challenging part of the flight, especially if it is to be controlled by a single, unified control approach that (1) must be valid throughout the whole flight envelope, and (2) must accommodate two flight modes with drastically different properties and dynamics. As shown in Figure 2, the transition from RW to FW mode requires the tilt-rotor VTOL UAV to pass smoothly through phases (a), (b), and (c). In phase (a), the vehicle pitches downward, using differential propeller thrust, to accelerate forward and to gain positive airspeed $V_{a}$. In phase (b), the propellers tilt forward while, at the same time, the fuselage pitches up and levels. In phase (c), a positive angle of attack allows some wing-lift force $F_{L}$ to be created in order to compensate for the vehicle’s weight, while the total thrust $T$ compensates for the drag force $F_{D}$ in level cruise flight.

1.2. Related Work

A comprehensive literature review of flight control for hybrid VTOL UAVs was recently written by the main author of this paper and can be found in [2]. As summarized in Table 1, the most frequently adopted techniques for hybrid VTOL vehicles can be divided into two main families, namely (1) combined flight mode-dependent controllers, and (2) unified control approaches, consisting of a single control law applicable throughout the whole flight envelope.

1.2.1. Combined Flight Mode-Dependent Controllers

This class of controllers includes techniques such as divide and conquer and control authority weighting, as listed in Table 1 and shown in Figure 3.

In the divide and conquer approach, a switching logic switches discretely between the different control laws, each tuned for a predefined operating point, such that only the appropriate controller is executed at any given time.
In the control authority weighting approach, the outputs of two different controllers (one for the RW mode and one for the FW mode) are blended continuously by applying a weight $γ (v (t))$, itself dependent on a scheduling variable such as the aircraft airspeed $v (t)$.

These control architectures usually provide satisfactory performance close to the operating points for which each controller has been tuned. However, during the transitions between controllers, be it through a switching logic or authority weighting, the aircraft usually experiences a deterioration in reference tracking [19]. To overcome this issue, another class of control architectures has been developed that (a) removes the need for control switching or weighting, and (b) is valid throughout the whole flight envelope as a single control approach, usually referred to as a “unified control approach”.
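To make the control authority weighting idea concrete, the following minimal Python sketch blends the outputs of an RW controller and an FW controller with an airspeed-dependent weight. The linear ramp used for $γ(v)$ and the thresholds `v_low` and `v_high` are illustrative assumptions, not values taken from the cited works.

```python
import numpy as np


def blend_weight(airspeed, v_low=8.0, v_high=16.0):
    """Hypothetical weight gamma(v): 0 below v_low (pure RW), 1 above v_high (pure FW)."""
    return float(np.clip((airspeed - v_low) / (v_high - v_low), 0.0, 1.0))


def blended_command(u_rw, u_fw, airspeed):
    """Convex combination of the RW and FW controller outputs."""
    gamma = blend_weight(airspeed)
    return (1.0 - gamma) * np.asarray(u_rw, dtype=float) + gamma * np.asarray(u_fw, dtype=float)
```

Halfway through the assumed ramp (12 m/s here), both controllers contribute equally. A divide and conquer scheme would correspond to replacing the ramp by a step.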

1.2.2. Unified Control Approaches

Unified control approaches employ a single controller that is capable of handling the different flight configurations without switching between several algorithms. As listed in Table 1, various approaches have been developed, where the prominent ones (a) continuously modify the controller behavior in a linear parameter varying (LPV) framework or (b) attempt to linearize the system online through a dynamic inversion process. More recently, nonlinear model predictive controllers (MPC) have been successfully applied to tilt-rotor VTOL UAVs. In particular, in [1], a nonlinear MPC controller was applied to the same hybrid tilt-rotor VTOL UAV as the one considered in this paper. The MPC loop delivers optimal reference tracking while respecting the system’s physical constraints.

1.2.3. Imitative Learning Approach

The imitative learning approach involves designing a student neural network controller trained by a teacher controller that generates control output signals given a collection of inputs. For example, the work in [50] considers a tail-sitter VTOL UAV controlled with a recurrent neural network (RNN) trained to mimic a nonlinear sequential convex programming (SCP) solver. Methods are also being developed to assess the stability of NN controllers that have been trained to mimic some teacher controllers [51].

1.3. The Control Approach of This Paper and Contributions of This Research

For the tilt-rotor hybrid VTOL UAV shown in Figure 1, two unified controllers have been designed using two different implementations of MPC flight control in [1,49]. The performance of such controllers is remarkable, although they only run at 20 Hz on dedicated companion computers: an Intel Up2 board in [49] and a Raspberry Pi 3B in [1], respectively, at their maximum processing load. What makes such MPC controllers so computationally expensive is their need to solve a nonlinear optimization problem at each sampling time. Thus, in this paper, the MPC controller is replaced by a neural network designed via an imitative learning approach. This means that the neural network is the student, trained to directly map the control input–output pairs generated by the teacher: the MPC controller. Once trained, the NN controller approximates the MPC controller with a significantly reduced computational cost.

To the best of our knowledge, this paper presents the first standalone NN control architecture trained with the imitative learning approach and successfully applied to a hybrid tilt-rotor VTOL UAV, shown in Figure 1, with real flight validation. The main practical motivations of this research are:

to replace a computationally expensive MPC controller that runs slowly onboard with a faster neural network-based controller that has been trained to imitate the MPC.
to provide a methodology for deploying a unified flight controller for hybrid VTOL UAVs, i.e., one that needs no gain scheduling, controller switching, etc., as the UAV transitions from helicopter mode to airplane mode, and vice versa. This NN controller should transition seamlessly between these modes, even if it has been trained from a “teacher” controller that itself uses gain scheduling, controller switching, or computationally expensive methods such as MPC.
to keep the NN architecture and learning procedure unaffected when the dimensions and physical properties of the UAV change. Only the teacher controller needs to be adapted, and in turn, the NN controller needs to be retrained.

The main contributions of the paper are summarized as follows:

1. Development of a novel NN-based controller that is capable of imitating an MPC controller in the longitudinal axis. This includes:

(a) A novel flight control architecture for tilt-rotor VTOL UAVs with two neural networks, namely the main NN, which is mostly responsible for attitude and altitude control, and the tilt NN, which is mostly responsible for controlling the tilt angle of the four tilting propellers.
(b) Construction of a dataset of input–output pairs generated with the expert MPC to train the neural networks.
(c) A standardized series of NN trainings to obtain the NN architecture that delivers the smallest velocity tracking error in the longitudinal-vertical plane in simulation.

2. Successful validation of the longitudinal flight control approach through simulations and real-world flight experiments.

This paper is further organized as follows:

Section 2 describes the hybrid tilt-rotor UAV considered in this paper, its dynamics equations and corresponding notations, and the conventions of this work.
Section 3, Section 4 and Section 5 present the MPC formulation [1], analyze the NN learning setup, and detail the “imitative learning” framework.
Section 6 discusses the simulation and real flight results and benefits of this imitative learning approach for a hybrid VTOL UAV.
Finally, Section 7 concludes with the limitations of the approach and possible future research work.

2. Aerial Vehicle Description

This paper considers the hybrid tilt-rotor VTOL UAV shown in Figure 1. Its nonlinear dynamics equations are based on the previous work in [52]. Table 2 presents the relevant parameters and coefficients, while Section 2.1 provides an overview of the notation adopted in this work. Section 2.2 presents the center-of-mass (CoM) dynamics equation considered throughout this paper. Section 2.3 describes the vehicle and the hardware setup.

2.1. Conventions and Nomenclature

2.1.1. Coordinate Frames

The inertial frame $I$ is a North–East–Down (NED) frame, as shown in Figure 4. The body frame $B$ is defined as a Forward–Right–Down (FRD) frame and is attached to the CoM of the VTOL UAV. Figure 5 shows that the tilt angles of the left propeller pair, $χ_{L}$, and of the right propeller pair, $χ_{R}$, are measured from the longitudinal direction $e_{x}$ of the body frame. The origin $O_{R, i}$ of each rotor frame $R_{i}$, with $i = 1, \dots, 4$, is placed at the corresponding tilting mechanism joint.

2.1.2. Notation

The right-hand superscript used for vectors identifies the frame in which the vectors are expressed, e.g., $p^{I} \in \mathbb{R}^{3}$ represents the position vector of the CoM in the inertial frame, while $v^{B} \in \mathbb{R}^{3}$ identifies the velocity of the CoM in the body frame. The rotation matrix $R_{B}^{I} \in SO(3)$ transforms the coordinates of a vector $v$ from the body frame to the inertial frame according to $v^{I} = R_{B}^{I} v^{B}$. The rotation matrix $R_{B}^{I}$ also serves to parameterize the attitude of the vehicle, i.e., the orientation of the aircraft body frame $B$ with respect to the inertial frame $I$. Table 2 summarizes the notations and variables used in this research.

2.1.3. Actuators

This hybrid VTOL UAV is equipped with 11 actuators. There are five control surfaces, namely the left and right ailerons ($δ_{a, l}$, $δ_{a, r}$), the left and right rudders ($δ_{r, l}$, $δ_{r, r}$), and the elevator $δ_{e}$. There are four propellers with rotation speeds $Ω_{i}$, $i = 1, \dots, 4$. They are grouped in two pairs, each pair sharing the same tilt angle: $χ_{3} = χ_{4} = χ_{l}$ and $χ_{1} = χ_{2} = χ_{r}$. The actuators of the aircraft have the following constraints: $δ \in [- 30^{\circ}, + 30^{\circ}]$, $Ω_{i} \in [0, + Ω_{max}]$ for $i = 1, \dots, 4$, and $χ_{l, r} \in [- \frac{π}{18}, + \frac{π}{2}]$ rad. In particular, the tilt angles $χ_{l, r}$ take continuous values in the range of $[- \frac{π}{18}, + \frac{π}{2}]$ rad, allowing for the standard RW mode ($χ_{l, r} = 0$), the standard FW mode ($χ_{l, r} = \frac{π}{2}$), and a combination of the two modes, e.g., when $0 < χ_{i} < \frac{π}{2}$.
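As an illustration of the actuator constraints listed above, the following Python sketch saturates a set of commands to the stated ranges. The function name and the argument layout are hypothetical, and `omega_max` is left as a parameter because its numerical value is not given in this section.

```python
import numpy as np

DELTA_MAX = np.deg2rad(30.0)                # control surface limit: +/- 30 degrees
CHI_MIN, CHI_MAX = -np.pi / 18, np.pi / 2   # tilt angle range in rad


def saturate_actuators(deltas, omegas, chis, omega_max):
    """Clip surface deflections (rad), rotor speeds, and tilt angles (rad) to their ranges."""
    deltas = np.clip(deltas, -DELTA_MAX, DELTA_MAX)
    omegas = np.clip(omegas, 0.0, omega_max)
    chis = np.clip(chis, CHI_MIN, CHI_MAX)
    return deltas, omegas, chis
```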

2.2. Center of Mass Dynamics

The 6-DoF equations of motion of the aircraft’s CoM are derived from the Newton–Euler equations [52]:

$$\dot{p}^{I} = v^{I}, \tag{1a}$$

$$\dot{v}^{I} = g e_{3} + \frac{1}{m} R_{B}^{I} (F_{a}^{B} + F_{r}^{B}), \tag{1b}$$

$$\dot{q}_{B}^{I, n} = \frac{1}{2} \begin{bmatrix} 0 & - (ω_{B / I}^{B})^{⊤} \\ ω_{B / I}^{B} & - [ω_{B / I}^{B}]_{\times} \end{bmatrix} q_{B}^{I, n}, \tag{1c}$$

$$I^{B} \dot{ω}_{B / I}^{B} = M_{r}^{B} + M_{δ}^{B} + M_{a}^{B} - [ω_{B / I}^{B}]_{\times} I^{B} ω_{B / I}^{B}, \tag{1d}$$

where the aerodynamic torque vector generated by the control surfaces is $M_{δ}^{B}$, whereas the passive aerodynamic forces $F_{a}^{B}$ and torques $M_{a}^{B}$ are generated by the non-actuated components of the aircraft, such as the wings and the fuselage. In addition, the rotor forces $F_{r}^{B}$ and the rotor moments $M_{r}^{B}$ depend on the propeller rotational speeds $Ω_{1}, Ω_{2}, Ω_{3}, Ω_{4}$ and on the tilt angles $χ_{l}, χ_{r}$.
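The quaternion kinematics (1c) can be sketched numerically as follows. This is a minimal Python illustration that assumes a scalar-first quaternion $[q_w, q_x, q_y, q_z]$ and a simple explicit Euler step followed by renormalization; neither choice is specified in the text.

```python
import numpy as np


def skew(w):
    """Cross-product matrix [w]_x such that skew(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def quat_derivative(q, omega):
    """Right-hand side of (1c) for a scalar-first quaternion q and body rates omega."""
    big_omega = np.zeros((4, 4))
    big_omega[0, 1:] = -omega
    big_omega[1:, 0] = omega
    big_omega[1:, 1:] = -skew(omega)
    return 0.5 * big_omega @ q


def integrate_quat(q, omega, dt):
    """One explicit Euler step, renormalized to keep a unit quaternion (assumed scheme)."""
    q_next = q + dt * quat_derivative(q, omega)
    return q_next / np.linalg.norm(q_next)
```

Because the 4 × 4 matrix in (1c) is skew-symmetric, the derivative is always orthogonal to $q$, so the continuous-time dynamics preserve the quaternion norm; the renormalization only removes the drift introduced by the discrete step.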

2.2.1. Rotor Forces and Moments

In the following, a simplification is introduced by having the two tilt mechanisms command the same tilt angle on both the left and right sides of the vehicle, implying that $χ_{l} = χ_{r} = χ$. The formulation in the body frame of the propellers’ forces $F_{r}^{B}$ and torques $M_{r}^{B}$ is obtained as follows:

$$F_{r}^{B} = \sum_{i = 1}^{4} T_{i}^{B}, \tag{2}$$

$$M_{r}^{B} = \sum_{i = 1}^{4} \left( Q_{r, i}^{B} + d_{r, i}^{B} \times T_{i}^{B} \right), \tag{3}$$

where the thrust vector and the resistive moment vector of each rotor $i$ are given, respectively, by

$$T_{i}^{B} = - C_{T} Ω_{i}^{2} \begin{bmatrix} - s_{χ} \\ 0 \\ c_{χ} \end{bmatrix}, \quad i = 1, 2, 3, 4, \tag{4}$$

$$Q_{r, i}^{B} = (- 1)^{i} C_{Q} Ω_{i}^{2} \begin{bmatrix} - s_{χ} \\ 0 \\ c_{χ} \end{bmatrix}, \quad i = 1, 2, 3, 4, \tag{5}$$

with $c_{χ} = \cos (χ)$ and $s_{χ} = \sin (χ)$. Furthermore, $C_{T}$ and $C_{Q}$ are the propeller thrust and moment coefficients, while $d_{r, i}^{B}$ is defined as the vector between the CoM of the UAV and the point of application of each thrust force $T_{i}^{B}$, $\forall i$.
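Equations (2)–(5) translate directly into a few lines of Python. The sketch below is illustrative only: the function name, the rotor arm vectors, and the coefficient values used in any call are placeholders rather than the parameters of the actual vehicle.

```python
import numpy as np


def rotor_wrench(omegas, chi, arms, c_t, c_q):
    """Total rotor force (2) and moment (3) in the body frame for a common tilt angle chi.

    omegas: four rotor speeds; arms: four vectors d_{r,i} from the CoM to each rotor.
    """
    axis = np.array([-np.sin(chi), 0.0, np.cos(chi)])
    force = np.zeros(3)
    moment = np.zeros(3)
    for i, (omega, arm) in enumerate(zip(omegas, arms), start=1):
        thrust = -c_t * omega**2 * axis          # thrust vector, Equation (4)
        drag = (-1) ** i * c_q * omega**2 * axis  # resistive moment, Equation (5)
        force += thrust
        moment += drag + np.cross(arm, thrust)
    return force, moment
```

With $χ = 0$ the total force points along the negative body z-axis (upward in an FRD frame), and with $χ = π/2$ it points along the positive body x-axis (forward), which matches the RW and FW configurations described above.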

2.2.2. Aerodynamic Forces and Moments

In the following, the deflections of the control surfaces are grouped as follows:

$$\begin{aligned} \text{ailerons:} \quad δ_{a} &= δ_{a, l} = - δ_{a, r}, \\ \text{rudders:} \quad δ_{r} &= δ_{r, l} = δ_{r, r}, \end{aligned}$$

with the subscripts $l$ and $r$ denoting “left” and “right”, respectively.

The actuated control surfaces $δ$ generate the active aerodynamic torque $M_{δ}^{B}$. On the other hand, the aerodynamic force $F_{a}^{B}$ and torque $M_{a}^{B}$ are generated by passive components of the vehicle, i.e., the right and left wings $W_{r}, W_{l}$, the fuselage $F$, and the horizontal and vertical sections of the tail $T_{h}, T_{v}$. Thus, these terms are defined as follows:

$$F_{a}^{B} = \sum_{j \in \{W_{r}, W_{l}, F, T_{h}, T_{v}\}} F_{a, j}^{B}, \tag{6}$$

$$M_{a}^{B} = \sum_{j \in \{W_{r}, W_{l}, F, T_{h}, T_{v}\}} r_{j}^{B} \times F_{a, j}^{B}, \tag{7}$$

with $r_{j}^{B}$ being the vector between the CoM and the application point of $F_{a, j}^{B}$, $\forall j$.

The aerodynamic forces are expressed as follows:

$$F_{a, j}^{B} = \frac{ρ}{2} S_{j} | v_{a, j}^{B} | \left( C_{L, j} (α_{j}) v_{a, j}^{⊥, B} - C_{D, j} (α_{j}) v_{a, j}^{B} \right), \tag{8}$$

where the air density is $ρ$, the surface area of the $j$-th component is $S_{j}$, and the airspeed in the body frame is $v_{a, j}^{B}$:

$$v_{a, j}^{B} = v^{B} - R_{I}^{B} v_{wind}^{I} - r_{j}^{B} \times ω_{B / I}^{B}. \tag{9}$$

The superscript “⊥” appearing in (8) denotes the normal component of the airspeed vector associated with the lift force generated by the $j$-th component, with $j \in \{W_{r}, W_{l}, F, T_{h}, T_{v}\}$. Finally, the lift and drag coefficients are $C_{L, j} (α_{j})$ and $C_{D, j} (α_{j})$, respectively, and depend on the local angle of attack (AoA) $α_{j}$.

2.2.3. Control Surface Aerodynamic Torques

The control surface aerodynamic torque vector is expressed as follows:

$$M_{δ}^{B} = \frac{ρ}{2} | v_{a}^{B} |^{2} \begin{bmatrix} 2 C_{A} & 0 & 0 \\ 0 & C_{E} & 0 \\ 0 & 0 & 2 C_{R} \end{bmatrix} \begin{bmatrix} δ_{a} \\ δ_{e} \\ δ_{r} \end{bmatrix}, \tag{10}$$

where the airspeed $v_{a}^{B}$ is computed with the formula in (9), assuming the same airspeed for all the control surfaces. The aerodynamic efficiency coefficients $C_{A}$, $C_{E}$, and $C_{R}$ depend on the area of the control surfaces and are summarized in Table 3. The control surface aerodynamic torques $M_{δ}^{B}$ depend on the square of the airspeed ($v_{a}^{2}$). Therefore,

at a low airspeed (RW mode), the aerodynamic torque vector $M_{δ}^{B}$ is considered negligible compared to the propeller-induced torque vector $M_{r}^{B}$ in (1d).
at a high airspeed (FW mode), the generation of torques via control surfaces is preferred over generating torques via differential propeller thrust.
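The quadratic dependence of the control surface torque on airspeed can be checked with a short Python sketch of (10). The coefficient values passed to the function in any call are placeholders (the actual values are listed in Table 3), and the default sea-level air density is an assumption.

```python
import numpy as np


def control_surface_torque(v_air, deltas, c_a, c_e, c_r, rho=1.225):
    """Equation (10): torque from [delta_a, delta_e, delta_r] at body-frame airspeed v_air."""
    gains = np.diag([2.0 * c_a, c_e, 2.0 * c_r])
    dynamic_term = 0.5 * rho * np.dot(v_air, v_air)  # (rho / 2) * |v_a|^2
    return dynamic_term * gains @ np.asarray(deltas, dtype=float)
```

Doubling the airspeed quadruples the available torque, while at zero airspeed the control surfaces produce no torque at all, which is why the rotors provide the torque authority in RW mode.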

2.3. Vehicle and Hardware Description

Table 3 summarizes the main characteristics of the vehicle shown in Figure 1. A Pixhawk 4 autopilot (https://px4.io/) is mounted on the UAV. It is a standard flight control unit (FCU) running the PX4 Firmware, a popular open-source framework. The Pixhawk 4 manages the communication between the actuators, the sensor suite, and the telemetry module, as well as the interface with the additional companion computer: an Intel UpBoard featuring an Intel Atom x5-Z8350 (4 × 1.44 GHz) processor, which runs the more computationally expensive algorithms such as MPC, the neural network-based controllers of this article, or additional guidance laws.

3. NN-Based Flight Controller Design Methodology

The design of the NN-based flight controller is carried out in three steps:

Step 1: Design of an MPC-based teacher controller and practical validation on the real system, as reported in [1]. Figure 6 shows the control architecture for this hybrid VTOL UAV. It consists of two main blocks:

the MPC controller (orange dashed rectangle), evaluated on an Intel UpBoard at a rate of 20 Hz. It outputs high-level commands that can be regarded as reference or feedforward terms, which are further handled by the inner loop;
the inner loop (blue dashed rectangle, running on a Pixhawk autopilot at a frequency of 250 Hz), which includes
1. a quaternion attitude controller, which computes a corrective term for the torque;
2. a control allocation block, which calculates the surface deflections, the tilt angles, and the rotor speeds.

Step 2: The MPC-based teacher controller generates several trajectories in simulation, which are used in the next step.

Step 3: The NN-based flight controller presented in Section 5 is trained on these trajectories.

4. Step 1: Design of an MPC-Based Teacher Controller

The MPC algorithm is used as a baseline controller or “teacher controller” in the simulation to generate several trajectories that are useful for later training of the NN-based flight controller. The reasons for choosing an MPC controller are as follows:

it is able to accommodate the strong nonlinearities of a VTOL aircraft, especially during transition maneuvers, where the wing lift generation and control surface authority vary nonlinearly with airspeed.
it computes feasible trajectories, respecting actuator constraints and aerodynamic properties, to smoothly follow desired waypoints under the predictions provided by the equations of motion (EoM).

4.1. MPC State and Input Vectors and Constraints

The inner state of the MPC is defined as follows:

$$x_{mpc}^{p} = [(v_{mpc}^{I, p})^{⊤}, χ_{mpc}^{p}, (q_{B, mpc}^{I, p})^{⊤}, (ω_{B / I, mpc}^{B, p})^{⊤}]^{⊤} \in \mathbb{R}^{11}. \tag{11}$$

The inner control input computed by the MPC is as follows:

$$u_{mpc}^{p} = [T_{mpc}^{p}, \dot{χ}_{mpc}^{p}, (M_{mpc}^{B, p})^{⊤}]^{⊤} \in \mathbb{R}^{5}, \tag{12}$$

with the following constraints:

$$T \in [0, 4 c_{T} ω_{max}^{2}], \tag{13a}$$

$$\dot{χ} \in [- \dot{χ}_{max}, \dot{χ}_{max}], \tag{13b}$$

$$M^{B} \in [- M_{max}, M_{max}], \tag{13c}$$

which must be satisfied at all optimization steps through the MPC horizon with running index $j \in \{1, \dots, N\}$, and where the maximum propeller speed is $ω_{max}$. The maximum torque that the MPC can request is $M_{max} = 2$ Nm, which still leaves some torque reserve so that the inner-loop attitude controller can compensate for fast disturbances in any flight condition.

Regarding the constraints on the state vector $x_{mpc}^{p}$, only the tilt angle $χ_{mpc}$ needs to be limited to the actual physical range, $χ_{mpc} \in [- \frac{π}{18}, \frac{π}{2}]$, which must be satisfied at all optimization steps $j \in \{1, \dots, N + 1\}$.

Note that:

the subscript “$mpc$” distinguishes the inner-state vector $x_{mpc}$ of the MPC controller from the measured-state vector $x$, which is defined as follows:

$$x = [(v^{I})^{⊤}, χ, (q_{B}^{I})^{⊤}, (ω_{B / I}^{B})^{⊤}]^{⊤} \in \mathbb{R}^{11}. \tag{14}$$
the subscript “p” indicates that both $x_{m p c}^{p}$ and $u_{m p c}^{p}$ are the predicted state vector and control input vector over the whole prediction time horizon, respectively.
the rotor-tilt angle $χ$ is treated as a state element, whereas its rate $\dot{χ}$ is used as a control input. This allows for easier handling of the maximum propeller tilt rate ${\dot{χ}}_{m a x}$ (see Table 3), which limits the minimum duration of the transition maneuver [1].
the desired thrust $T_{m p c}^{p}$ is the magnitude of the sum of the four thrust vectors associated with each rotor, as defined in (4).
the commanded torque $M_{m p c}^{B, p} = {(M_{δ}^{B} + M_{r}^{B})}_{m p c}^{p}$ includes the components of (1d) that are directly controlled by the rotors and by the control surfaces of the VTOL UAV.

In addition, a vector including all the signals that are fed to the MPC controller is defined as follows:

$$X_{mpc}^{in} = [x^{⊤}, x_{ref}^{⊤}, u_{ref}^{⊤}]^{⊤} \in \mathbb{R}^{27}, \tag{15}$$

where the reference state vector $x_{ref}$ and the reference control input vector $u_{ref}$ are generated by the user, as described in Section 4.3.

4.2. MPC Output Definition

As shown in Figure 6, the output signals of the MPC controller are chosen as follows:

$$X_{mpc}^{out} = [T_{mpc}, χ_{mpc}, (M_{mpc}^{B})^{⊤}, (q_{B, mpc}^{I})^{⊤}]^{⊤} \in \mathbb{R}^{9}. \tag{16}$$

At each time step, the nonlinear MPC controller is evaluated with a solver generated from the ACADO Toolkit [53], which exploits the inherent real-time iteration (RTI) algorithm [1,54]. The ACADO solver solves the following MPC loop:

$$\begin{aligned} \min_{x_{mpc}^{p}, u_{mpc}^{p}} \quad & \sum_{j = 1}^{N} l ({}^{j}Δx_{mpc}^{p}, {}^{j}Δu_{mpc}^{p}) + l_{f} ({}^{N + 1}Δx_{mpc}^{p}), \\ \text{subject to} \quad & \dot{x}_{mpc}^{p} = ϕ (x_{mpc}^{p}, u_{mpc}^{p}), \\ & {}^{j}x_{mpc}^{p} \in [x_{min}, x_{max}], \quad \forall j \in \{1, \dots, N + 1\}, \\ & {}^{j}u_{mpc}^{p} \in [u_{min}, u_{max}], \quad \forall j \in \{1, \dots, N\}, \\ & {}^{1}x_{mpc}^{p} = x_{0}, \end{aligned} \tag{17}$$

where:

each stage cost $l$ and the terminal cost $l_{f}$ are quadratic functions: $l = {}^{j}Δx_{mpc}^{p, ⊤} \, Q \, {}^{j}Δx_{mpc}^{p} + {}^{j}Δu_{mpc}^{p, ⊤} \, R \, {}^{j}Δu_{mpc}^{p}$, $l_{f} = {}^{N + 1}Δx_{mpc}^{p, ⊤} \, Q \, {}^{N + 1}Δx_{mpc}^{p}$, with the positive definite weight matrices $Q \in \mathbb{R}^{9 \times 9}$ and $R \in \mathbb{R}^{5 \times 5}$,
the measured state vector at time step $t_{k}$ initializes the optimization loop: $x_{0} = x (t_{k})$,
${}^{k}Δx_{mpc}^{p}$ and ${}^{k}Δu_{mpc}^{p}$ identify the difference at time step $k$ between the states and control inputs $x_{mpc}^{p}$ and $u_{mpc}^{p}$ predicted by the MPC and the reference values $x_{ref}$ and $u_{ref}$ set by the user. Section 4.3 describes how the reference signals $x_{ref}$ and $u_{ref}$ are generated.
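The structure of the cost in (17) can be illustrated with a short Python sketch that evaluates the sum of quadratic stage costs plus the terminal cost for a given predicted trajectory. The array shapes are generic (`n` states, `m` inputs), the function names are hypothetical, and the sketch only evaluates the objective; it does not solve the constrained optimization that the ACADO-generated solver performs.

```python
import numpy as np


def stage_cost(dx, du, Q, R):
    """Quadratic stage cost l = dx^T Q dx + du^T R du."""
    return float(dx @ Q @ dx + du @ R @ du)


def terminal_cost(dx, Q):
    """Quadratic terminal cost l_f = dx^T Q dx."""
    return float(dx @ Q @ dx)


def horizon_cost(x_pred, u_pred, x_ref, u_ref, Q, R):
    """Objective of (17) for x_pred of shape (N + 1, n) and u_pred of shape (N, m)."""
    running = sum(stage_cost(x - x_ref, u - u_ref, Q, R)
                  for x, u in zip(x_pred[:-1], u_pred))
    return running + terminal_cost(x_pred[-1] - x_ref, Q)
```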

The solver calculates the optimal feasible solution:

$$\{{}^{1}x_{mpc}^{p}, \dots, {}^{N + 1}x_{mpc}^{p}, {}^{1}u_{mpc}^{p}, \dots, {}^{N}u_{mpc}^{p}\}. \tag{18}$$

Finally, from the optimal solution, the following variables are extracted and employed in the control architecture:

$$T_{mpc} = {}^{1}u_{mpc, 1}^{p}, \tag{19a}$$

$$χ_{mpc} = {}^{2}x_{mpc, 4}^{p}, \tag{19b}$$

$$M_{mpc}^{B} = [{}^{1}u_{mpc, 3}^{p}, {}^{1}u_{mpc, 4}^{p}, {}^{1}u_{mpc, 5}^{p}]^{⊤}, \tag{19c}$$

$$q_{B, mpc}^{I} = [{}^{2}x_{mpc, 5}^{p}, {}^{2}x_{mpc, 6}^{p}, {}^{2}x_{mpc, 7}^{p}, {}^{2}x_{mpc, 8}^{p}]^{⊤}, \tag{19d}$$

where, for the tilt angle $χ_{mpc}$ and for the attitude quaternion $q_{B, mpc}^{I}$, the predicted states for the following time step are considered. The commanded control signals of the MPC controller are defined as follows:

$$X_{mpc}^{out} = [T_{mpc}, χ_{mpc}, (M_{mpc}^{B})^{⊤}, (q_{B, mpc}^{I})^{⊤}]^{⊤} \in \mathbb{R}^{9}, \tag{20}$$

which are then converted into the actuator commands $δ$, $Ω$, and $χ$ by the control allocation algorithm described in Section 4.5. A thorough description of the output synthesis is provided in [1].
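The extraction of the commanded signals from the optimal solution can be sketched in Python as follows. The sketch assumes that the predicted states and inputs are stored row-wise in arrays of shape `(N + 1, 11)` and `(N, 5)`, ordered as in (11) and (12), so that the attitude quaternion occupies state entries 5 to 8; note that Python indices are 0-based, whereas the equations count from 1. The function name is hypothetical.

```python
import numpy as np


def extract_mpc_outputs(x_pred, u_pred):
    """Build the 9-element output vector from the predicted trajectory.

    x_pred rows: [v (3), chi (1), q (4), omega (3)]; u_pred rows: [T, chi_dot, M (3)].
    """
    thrust = u_pred[0, 0]      # first input of the first step
    chi = x_pred[1, 3]         # fourth state of the second step
    torque = u_pred[0, 2:5]    # inputs 3 to 5 of the first step
    quat = x_pred[1, 4:8]      # states 5 to 8 of the second step
    return np.concatenate(([thrust, chi], torque, quat))
```

Thrust and torque are taken from the first predicted input, while the tilt angle and quaternion are taken from the second predicted state, i.e., the state that the first input is expected to produce.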

4.3. Reference Generation

Figure 6 shows that the reference state $x_{ref}$, the reference control input $u_{ref}$, and the measured state $x$ are provided to the MPC controller. In practice, through a remote controller (RC), the user sends a reference for the longitudinal velocity $v_{lon, ref}$, the lateral velocity $v_{lat, ref}$, the vertical velocity $v_{ver, ref}$, and the yaw rate $\dot{ψ}_{ref}$. Then, the reference velocity vector in the inertial frame $I$ is automatically computed, and the reference yaw angle $ψ_{ref}$ is obtained by integrating the commanded yaw rate $\dot{ψ}_{ref}$. Furthermore, the reference roll angle $ϕ_{ref}$, roll rate $\dot{ϕ}_{ref}$, pitch angle $θ_{ref}$, and pitch rate $\dot{θ}_{ref}$ are set to zero. Finally, the reference control input $u_{ref}$ is set to zero to minimize the vehicle’s power consumption. In contrast, the propeller-tilt angle does not have any reference, since the controller can freely choose it based on the current status of the flight [1].
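One possible way to build the velocity and yaw references from the RC commands is sketched below in Python. The text does not state how the inertial-frame reference velocity is computed, so the sketch assumes that the commanded velocities are expressed in a heading-aligned frame (forward, right, down) and rotated by the integrated yaw reference; the function name and the explicit Euler integration are also assumptions.

```python
import numpy as np


def update_reference(v_lon, v_lat, v_ver, yaw_rate, yaw_prev, dt):
    """Integrate the yaw-rate command and rotate the velocity command into the NED frame."""
    yaw = yaw_prev + yaw_rate * dt                 # explicit Euler integration (assumed)
    c, s = np.cos(yaw), np.sin(yaw)
    v_ref_inertial = np.array([c * v_lon - s * v_lat,
                               s * v_lon + c * v_lat,
                               v_ver])
    return v_ref_inertial, yaw
```

With a yaw reference of $π/2$, a purely longitudinal command maps onto the East axis of the NED frame under these assumptions.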

4.4. Quaternion Attitude Controller

The MPC controller is executed at a frequency of 20 Hz, which is the maximum frequency achievable on the onboard Raspberry Pi 3B with the ACADO solver [1]. This is too slow to track the fast attitude dynamics of the tilt-rotor UAV. Thus, a high-frequency PID attitude controller running at 250 Hz is added to the control scheme. Starting from the attitude $q_{B, mpc}^{I}$ predicted by the MPC and from the measured state $x$, the attitude controller computes the corrective feedforward term $M_{corr}^{B}$, shown in Figure 6. This algorithm is a modified version of the attitude controller designed in [55].

4.5. Control Allocation

The desired thrust $T_{mpc}$, the desired tilt angle $χ_{mpc}$, and the desired total torque $M_{tot}^{B} = (M_{corr}^{B} + M_{mpc}^{B})$ are fed into the control allocation algorithm, which adopts a two-step daisy-chaining approach [1,56,57]. This algorithm maps the previously mentioned desired quantities to the actuator commands, namely the propeller speeds $Ω$, the control surface deflections $δ$, and the front-rotor tilt angle $χ$.

5. Step 2: Imitative Learning Neural Network Controller

5.1. Motivation

The primary goal of this paper is to develop a neural network-based controller that is trained on a dataset generated by the MPC controller (the teacher) presented in Section 4, which is capable of generating high-level commands to transition this tilt-rotor VTOL UAV from helicopter mode to airplane mode, and vice versa. This “imitative learning” strategy consists of providing experts’ actions to a machine learning algorithm [58]. When the learning error becomes low enough, the trained neural network (a) entirely replaces the “expert” but slow MPC controller and (b) runs at a much higher pace directly in the UAV onboard computer.
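The imitative learning workflow (collect input–output pairs from the teacher, fit the student, check the imitation error) can be illustrated with a deliberately simplified Python sketch. Everything in it is a stand-in: the teacher is a hypothetical affine feedback law rather than an MPC, and the student is a linear least-squares model rather than a neural network, so the example only shows the data flow of the approach, not the method used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def teacher(x):
    """Hypothetical stand-in for the expert controller: an affine feedback law."""
    gain = np.array([[1.0, 0.5], [0.0, 2.0]])
    return -x @ gain.T + np.array([0.1, -0.2])


# 1) Collect input-output pairs by querying the teacher on sampled inputs.
X_train = rng.uniform(-1.0, 1.0, size=(500, 2))
U_train = teacher(X_train)

# 2) Fit the student by least squares on the features [x, 1].
features = np.hstack([X_train, np.ones((len(X_train), 1))])
weights, *_ = np.linalg.lstsq(features, U_train, rcond=None)


def student(x):
    """Student controller: cheap to evaluate once the weights are fitted."""
    x = np.atleast_2d(x)
    return np.hstack([x, np.ones((len(x), 1))]) @ weights


# 3) Measure the imitation error on inputs not seen during fitting.
X_test = rng.uniform(-1.0, 1.0, size=(100, 2))
imitation_error = np.max(np.abs(student(X_test) - teacher(X_test)))
```

For a nonlinear teacher such as an MPC, the student would need a more expressive model, and the imitation error would not vanish as it does in this idealized case.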

5.2. Architecture of the NN-Based Flight Controller

Figure 7 shows the control architecture with the NN-based flight controller. Two separate neural networks are employed:

the first one, called the main NN, provides the commanded thrust vector $T_{nns}^{I}$ and the commanded yaw angle $ψ_{nns}$,
the second one, called the tilt NN, outputs only the commanded tilt angle $χ_{nns}$.

Section 5.4 presents a thorough description and analysis of the two NNs, clarifying the choice of the inputs and outputs of both NNs.

5.2.1. Conversion Block

The thrust vector $T_{nns}^{I}$ computed by the main NN encodes the commanded roll angle $ϕ_{nns}$ and the commanded pitch angle $θ_{nns}$ of the UAV, but it does not encode the commanded yaw angle. Thus, the commanded yaw angle $ψ_{nns}$ is output separately by the main NN.

The thrust magnitude $T_{nns}$ and the unit thrust direction $η_{nns}^{I}$ are computed from the thrust vector $T_{nns}^{I}$ as follows:

$$T_{nns} = || T_{nns}^{I} ||, \tag{21}$$

$$η_{nns}^{I} = \frac{T_{nns}^{I}}{T_{nns}}. \tag{22}$$

The vertical upwards direction in the inertial frame is defined as

η_{u p}^{I} = [0, 0, - 1]

. The commanded rotation matrix from the body frame

B

to the inertial frame

I

for the yaw angle only is denoted as

{R_{B}^{I}}^{ψ n n s}

and writes as follows:

\begin{matrix} {R_{B}^{I}}^{ψ n n s} & = [\begin{matrix} c_{ψ_{n n s}} & - s_{ψ_{n n s}} & 0 \\ s_{ψ_{n n s}} & c_{ψ_{n n s}} & 0 \\ 0 & 0 & 1 \end{matrix}], \end{matrix}

(23)

whereas the rotation matrix between the rotor frame

R

and body frame

B

is denoted as

{R_{R}^{B}}^{χ_{n n s}}

, where the commanded propeller-tilt angle around the

y -

body axis is denoted as

χ_{n n s}

\begin{matrix} {R_{R}^{B}}^{χ_{n n s}} & = [\begin{matrix} c_{χ_{n n s}} & 0 & - s_{χ_{n n s}} \\ 0 & 1 & 0 \\ s_{χ_{n n s}} & 0 & c_{χ_{n n s}} \end{matrix}] . \end{matrix}

(24)

The commanded rotation matrix

{R_{R}^{I}}^{{ϕ θ}_{n n s}}

transforms the coordinates of a vector from the rotor frame to the inertial frame through the commanded roll angle

ϕ_{n n s}

and pitch angle

θ_{n n s}

only. It can be calculated using the following version of the Rodrigues’ rotation formula between the two unit vectors

η_{n n s}^{I}

and

η_{u p}^{I}

{R_{R}^{I}}^{{ϕ θ}_{n n s}} = I_{3} + {[ξ]}_{\times} + {[ξ]}_{\times}^{2} \frac{1 - c}{s^{2}},

(25)

with the vector $ξ$ perpendicular to the plane spanned by the vectors $η_{u p}^{I}$ and $η_{n n s}^{I}$. This vector defines the rotation axis about which to turn in order to bring the vector $η_{u p}^{I}$ onto $η_{n n s}^{I}$:

$\begin{matrix} ξ & = η_{u p}^{I} \times η_{n n s}^{I}, \end{matrix}$

(26)
The sine s and the cosine c of the angle between the direction $η_{u p}^{I}$ and the commanded direction $η_{n n s}^{I}$ are calculated as follows:

$\begin{matrix} s & = | | ξ | |, \end{matrix}$

(27)

$\begin{matrix} c & = η_{u p}^{I} \cdot η_{n n s}^{I}, \end{matrix}$

(28)

Finally, the commanded rotation matrix ${R_{I}^{B}}_{n n s}$ between the inertial and the body frame is computed using Equations (23)–(25) with the following steps:

1.: first, compute the commanded rotation matrix for the roll and pitch angles in the body frame $B$ as follows:

${R_{I}^{B}}^{{ϕ θ}_{n n s}} = {R_{R}^{B}}^{χ_{n n s}} {R_{I}^{R}}^{{ϕ θ}_{n n s}},$

(29)
2.: then, combine the two rotation matrices describing the rotation from the inertial frame $I$ to the body frame $B$ :

$\begin{matrix} {R_{I}^{B}}_{n n s} & = {R_{I}^{B}}^{{ϕ θ}_{n n s}} {R_{I}^{B}}^{ψ_{n n s}}, \end{matrix}$

(30)

$\begin{matrix} = \underset{{R_{I}^{B}}^{{ϕ θ}_{n n s}}}{\underset{︸}{[\begin{matrix} 1 & 0 & 0 \\ 0 & c ϕ_{n n s} & s ϕ_{n n s} \\ 0 & - s ϕ_{n n s} & c ϕ_{n n s} \end{matrix}] [\begin{matrix} c θ_{n n s} & 0 & - s θ_{n n s} \\ 0 & 1 & 0 \\ s θ_{n n s} & 0 & c θ_{n n s} \end{matrix}]}} [\begin{matrix} c ψ_{n n s} & s ψ_{n n s} & 0 \\ - s ψ_{n n s} & c ψ_{n n s} & 0 \\ 0 & 0 & 1 \end{matrix}] \end{matrix}$

(31)
3.: from which the commanded attitude quaternion ${q_{I}^{B}}_{n n s}$ is calculated.
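As a hedged illustration of Equations (21), (22), and (25)–(28), the following minimal Python sketch computes the thrust magnitude, the unit thrust direction, and the Rodrigues rotation that brings $η_{u p}^{I}$ onto $η_{n n s}^{I}$. The function names, the plain-list vector representation, and the guard for (anti)parallel vectors are assumptions made for this example and are not taken from the authors' implementation.

```python
import math

ETA_UP = [0.0, 0.0, -1.0]  # vertical upwards direction in the inertial frame


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def skew(v):
    """Cross-product matrix: matvec(skew(v), w) equals cross(v, w)."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    return [dot(row, v) for row in A]


def thrust_magnitude_and_direction(T_vec):
    """Eqs. (21)-(22): magnitude and unit direction of the thrust vector."""
    T = math.sqrt(dot(T_vec, T_vec))
    if T == 0.0:
        raise ValueError("zero thrust vector has no direction")
    return T, [x / T for x in T_vec]


def rotation_between(a, b, eps=1e-9):
    """Eqs. (25)-(28): rotation matrix that maps unit vector a onto unit vector b."""
    xi = cross(a, b)              # Eq. (26): rotation axis (not normalized)
    s = math.sqrt(dot(xi, xi))    # Eq. (27): sine of the angle
    c = dot(a, b)                 # Eq. (28): cosine of the angle
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if s < eps:
        # Assumption of this sketch: handle the degenerate cases explicitly.
        if c > 0.0:
            return identity       # vectors already aligned
        raise ValueError("opposite directions: rotation axis is undefined")
    K = skew(xi)
    K2 = matmul(K, K)
    f = (1.0 - c) / (s * s)
    return [[identity[i][j] + K[i][j] + f * K2[i][j] for j in range(3)]
            for i in range(3)]


T_nns, eta_nns = thrust_magnitude_and_direction([3.0, 0.0, -4.0])
R = rotation_between(ETA_UP, eta_nns)
```

In this sketch, `matvec(R, ETA_UP)` reproduces `eta_nns` up to rounding, which is the defining property of the rotation in Equation (25).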

The overall NN-based controller is highlighted with the orange rectangle in Figure 7. It corresponds to the replacement of the MPC controller (orange box in Figure 6) introduced in Section 4, where the input and the output signals of the orange boxes are the same in both architectures. Finally, the quaternion attitude controller and the control allocation blocks are the same as in Section 4.4 and Section 4.5.

5.2.2. Thrust Vector Attitude Controller

The thrust vector attitude controller is designed to steer the measured thrust direction $η^{I}$ towards the NN-commanded thrust direction $η_{n n s}^{I}$. It outputs the desired torque $M_{n n s}^{B}$ in the body frame. This algorithm was developed in [59], and it is here adapted to a tilt-rotor VTOL UAV via the following steps:

1.: Compute the first two components of the commanded angular velocity ${ω_{B / I}}_{n n s, 1, 2}^{B}$ with:

${ω_{B / I}}_{n n s, 1, 2}^{B} = {(k_{η} R_{I}^{B} \frac{η^{I} \times η_{n n s}^{I}}{{(1 + {(η^{I})}^{⊤} \cdot η_{n n s}^{I})}^{2}})}_{1, 2},$

(32)

where $k_{η}$ is a positive gain, and where the measured thrust vector direction $η^{I}$ is easily computed from the measured attitude matrix $R_{IB}$ and from the measured tilt angle $χ$ .
2.: Since the thrust direction only involves the roll and the pitch angles of the UAV, the yaw component of the commanded angular velocity ${ω_{B / I}}_{n n s, 3}^{B}$ is defined with a simple P-controller:

${ω_{B / I}}_{n n s, 3}^{B} = k_{ψ} (ψ_{n n s} - ψ) .$

(33)

The work in [59] provides a complete proof.
3.: Then, the desired torque vector $M_{n n s}^{B}$ is computed from (1d) [60,61], yielding the following:

$\begin{matrix} M_{n n s}^{B} & = {(M_{δ}^{B} + M_{r}^{B})}_{n n s} \\ = - I^{B} K_{ω} (ω_{B / I}^{B} - ω_{B / I, n n s}^{B}) + {[ω_{B / I, n n s}^{B}]}_{\times} I^{B} ω_{B / I}^{B} + I^{B} {\dot{ω}}_{B / I, n n s}^{B} - M_{a}^{B}, \end{matrix}$

(34)

where the gain $K_{ω}$ is a diagonal positive definite matrix.
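The first two steps above can be sketched as follows. This is a minimal illustration of Equations (32) and (33) under stated assumptions: the function name, the list-based vectors, the gain values in the usage line, and the guard against a vanishing denominator are hypothetical and only serve the example; the torque computation of Equation (34) is not included.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def commanded_angular_velocity(R_I_B, eta, eta_cmd, psi, psi_cmd, k_eta, k_psi):
    """Return the commanded body angular velocity [w1, w2, w3].

    R_I_B   : 3x3 rotation matrix from the inertial to the body frame
    eta     : measured unit thrust direction (inertial frame)
    eta_cmd : commanded unit thrust direction (inertial frame)
    """
    denom = (1.0 + dot(eta, eta_cmd)) ** 2
    if denom < 1e-12:
        # Assumption of this sketch: Eq. (32) is singular for opposite directions.
        raise ValueError("measured and commanded thrust directions are opposite")
    axis = cross(eta, eta_cmd)
    scaled = [k_eta * a / denom for a in axis]
    w_body = [dot(row, scaled) for row in R_I_B]   # Eq. (32), all three components
    w3 = k_psi * (psi_cmd - psi)                   # Eq. (33): P-controller on yaw
    return [w_body[0], w_body[1], w3]              # keep components 1 and 2 only


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
omega_cmd = commanded_angular_velocity(
    identity, [0.0, 0.0, -1.0], [0.6, 0.0, -0.8], 0.0, 0.1, k_eta=1.5, k_psi=2.0)
```

When the measured and commanded thrust directions coincide, the cross product vanishes and only the yaw component remains active.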

Remark: In a first attempt, the main NN was trained to mimic the MPC-commanded torque $M_{m p c}^{B}$ in directly generating the commanded torque $M_{n n s}^{B}$. However, this was not successful. We believe that the reason is that the torque commanded by the MPC ($M_{m p c}^{B}$) is too noisy for adequate NN training, as artificial noise is introduced in the simulations on both sensor and actuator signals, as explained in Section 5.3. However, it has been observed that the NNs accurately learned the commanded Euler angles $Υ_{IB, m p c}$ and the commanded tilt angle $χ_{m p c}$, which are generally smoother than the MPC torque $M_{m p c}^{B}$. Therefore, a successful alternative solution has been to introduce the conversion block of the thrust vector attitude controller into the NN controller formulation, as shown in Figure 7, which takes the NN-generated commands in terms of $Υ_{IB, n n s}$ and the tilt angle $χ_{n n s}$ as inputs.

5.2.3. Thrust Correction Block

The role of this block is to refine the value of the commanded thrust $T_{n n s}$ to minimize the velocity tracking error. According to our experience, this block turns out to be useful when the main NN controller does not imitate the MPC controller perfectly, and when some residual velocity tracking errors remain. Such velocity errors are defined as $v_{e}^{I} = v_{r e f}^{I} - v^{I}$. Consequently, a corrective inertial acceleration can be calculated as follows:

{\dot{v}}_{c o r r}^{I} = k_{v} v_{e}^{I},

(35)

where $k_{v}$ is a positive proportional gain tuned in simulation. Such corrective acceleration can be generated by the rotor forces as follows:

{\dot{v}}_{c o r r}^{I} = \frac{1}{m} R_{IB} F_{r, c o r r}^{B} .

(36)

By combining (35) with (36), the following formulation for the corrective rotor forces $F_{r, c o r r}^{B}$ is obtained:

F_{r, c o r r}^{B} = m k_{v} R_{IB}^{⊤} v_{e}^{I} .

(37)

Finally, the corrective thrust term $T_{c o r r}$ is defined as the component of $F_{r, c o r r}^{B}$ in the current direction of the propellers, with the addition of a minus sign due to the convention of the z-axis of the body frame:

T_{c o r r} = - {(R_{BR}^{⊤} F_{r, c o r r}^{B})}_{z},

(38)

where $R_{BR}$ represents a rotation of angle $χ$ around the y-axis of the body frame. The total thrust magnitude is obtained as

T_{n n s, t o t} = T_{n n s} + T_{c o r r} .

(39)

Remark: the thrust direction is not modified; it remains $η_{n n s}^{I}$, generated by the main NN controller. Only the amplitude of the thrust is corrected by the term $T_{c o r r}$.
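A minimal sketch of the thrust correction chain in Equations (35)–(39) is given below. It assumes that $R_{BR}$ has the same form as the tilt rotation matrix of Equation (24); the function name, the argument layout, and the numerical values of the mass and of the gain $k_{v}$ are hypothetical and chosen only for illustration.

```python
import math


def thrust_correction(v_ref, v, R_IB, chi, mass, k_v):
    """Return the corrective thrust T_corr of Eq. (38).

    v_ref, v : reference and measured inertial velocities (3-element lists)
    R_IB     : 3x3 rotation matrix from the body to the inertial frame
    chi      : measured tilt angle in rad
    """
    # Velocity tracking error in the inertial frame.
    v_e = [r - m for r, m in zip(v_ref, v)]
    # Eq. (37): F_corr^B = m * k_v * R_IB^T * v_e (transpose applied via index order).
    F = [mass * k_v * sum(R_IB[i][j] * v_e[i] for i in range(3)) for j in range(3)]
    # Eq. (38): z-component of R_BR^T * F. With R_BR as in Eq. (24) (assumed),
    # the third row of R_BR^T is [-sin(chi), 0, cos(chi)].
    z_rotor = -math.sin(chi) * F[0] + math.cos(chi) * F[2]
    return -z_rotor


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T_nns = 20.0  # hypothetical NN-commanded thrust in N
T_corr = thrust_correction([0.0, 0.0, -1.0], [0.0, 0.0, 0.0], identity,
                           chi=0.0, mass=2.0, k_v=0.5)
T_nns_tot = T_nns + T_corr  # Eq. (39)
```

In hover with zero tilt, an upward velocity error (negative down component) yields a positive corrective thrust, which matches the sign convention described above.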

5.3. Data Generation

Data generation represents a crucial step in any machine learning application, as thorough state space coverage is required for the generated dataset. Since the neural networks are trained on the input–output pairs extracted from simulated trajectories, the objective is to include a wide range of possible flight paths in the dataset so that the neural networks can learn how to react to a broad variety of scenarios during a flight mission. For this reason, when collecting the data for the training, Gaussian noise is introduced in the feedback loop of the control architecture shown in Figure 6. At each time step, a random action affecting the actuator commands ${[δ^{⊤}, χ, Ω^{⊤}]}^{⊤}$ is inserted with a probability of 30% [62].

Specifically, in MATLAB®, we carry out one thousand simulations with the MPC controller architecture described in Section 4, and different classes of trajectories are considered. Also, a random reference velocity in the north and down directions of the inertial frame $I$ is set. Figure 8 presents an example of the simulations carried out during the data generation.

5.3.1. Limitations

When generating preliminary data, the reference trajectories in the east direction were also included for training, but we observed a higher validation error. The feedforward fully connected NN architecture employed in this research was unable to effectively learn the steering of the VTOL UAV in complex three-dimensional orientations and trajectories. Thus, only trajectories in the north–down plane will be considered, whereas extensions to three-dimensional flight paths will be investigated in a future work.

5.3.2. Chosen Trajectories for Training

In this paper, the reference trajectories used for NN training are chosen as follows:

Only trajectories in the vertical north–down plane are considered.
As explained in Section 4.3, the reference attitude $q_{B, r e f}^{I}$ and the reference angular velocity $ω_{B / I, r e f}^{B}$ are constant. For this reason, the only informative components of $X^{i n}$ in (15) are the measured state $x$ and the reference velocity $v_{r e f}^{I}$ . Thus, the expression of the signals fed to the MPC in (15) can be simplified and reformulated as follows:

$X^{i n} = {[x^{⊤}, {(v_{r e f}^{I})}^{⊤}]}^{⊤} \in R^{14} .$

(40)
For each simulation, the input–output pair of the MPC controller is collected at each time step, as defined in (40) and (16). As a result, a dataset with a total of 8,458,000 input–output pairs $\{X^{i n}, X_{m p c}^{o u t}\}$ is generated. Then, as described in Section 5.3.4, these MPC pairs are manipulated to obtain signals that are compatible with the input–output definition of the two neural networks deployed in the NN-based controller shown in Figure 7, namely the tilt NN and main NN.

5.3.3. Data Preprocessing: Smoothing

The MPC data input–output pairs are smoothed before being used for NN training, as follows:

The Savitzky–Golay filtering method is employed on a sliding temporal window of $2 m + 1$ time steps and fits the signals with a polynomial of degree $d$ in the least-squares sense [63].
The filter is applied over all of the simulations included in the dataset, and the parameters are set empirically to $d = 3$ and $m = 41$ .
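For reference, the standard Savitzky–Golay smoothing step with the stated parameters can be written out as below. This is the textbook formulation, given here only as an illustration; the index $i$ and the coefficients $a_{j}$ are notation introduced for this example, and the treatment of the window edges is not specified in the text.

```latex
% Window length for m = 41: 2m + 1 = 83 time steps; polynomial degree d = 3.
% Local least-squares fit around sample i of a signal y:
\min_{a_0, \dots, a_d} \; \sum_{k=-m}^{m} \Big( \sum_{j=0}^{d} a_j k^{j} - y_{i+k} \Big)^{2},
\qquad \hat{y}_i = a_0 .
```

The smoothed value $\hat{y}_i$ is the fitted polynomial evaluated at the window center $k = 0$.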

5.3.4. Data Preprocessing: NN Dataset Definition

Starting from the dataset with pairs $\{X^{i n}, X_{m p c}^{o u t}\}$, two distinct datasets are extracted to train the two neural networks as follows:

The dataset for training the main NN has the following form for the input–output pair for each time step of the simulations:

$\begin{matrix} X^{i n_{m a i n}} & = {[x^{⊤}, {(v_{r e f}^{I})}^{⊤}]}^{⊤} \in R^{14}, \end{matrix}$

(41)

$\begin{matrix} X_{m p c}^{o u t_{m a i n}} & = {[T_{m p c}, {(Υ_{IB, m p c})}^{⊤}]}^{⊤} \in R^{4}, \end{matrix}$

(42)

where the commanded Euler angles $Υ_{IB, m p c}$ are directly calculated a posteriori starting from the desired attitude quaternion $q_{B, m p c}^{I}$ .
The dataset for training the tilt NN is defined with the following input–output pair:

$\begin{matrix} X^{i n_{t i l t}} & = {[{(v^{I})}^{⊤}, θ, {(v_{r e f}^{I})}^{⊤}]}^{⊤} \in R^{7}, \end{matrix}$

(43)

$\begin{matrix} X_{m p c}^{o u t_{t i l t}} & = [χ_{m p c}] \in R^{1} . \end{matrix}$

(44)

It is worth noting that in the input vector $X^{i n_{t i l t}}$ , instead of considering the full attitude, only the measured pitch angle $θ$ is considered, which is directly coupled with the tilt angle when controlling the thrust direction. Moreover, for simplicity, the measured angular velocity $ω_{B / I}^{B}$ is also not considered.
Scaling and normalizing the datasets:
the inputs and outputs of the datasets of both neural networks are scaled and normalized, which is a common procedure in most machine learning applications in order to assign equal importance to all components and signals. In particular, the inputs and targets of the two above datasets are scaled so that all the components lie in the interval $[0, 1]$ .
Output layer activation function:
the output layer of both neural networks features a tanh activation function. Thus, the control policies for each time step of both datasets are further scaled so that the targets lie in the interval $[- 1, 1]$ , consistent with the output of the tanh function.
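The two scaling steps described above can be sketched as follows. The text does not state the exact scaling formula, so this example assumes per-component min–max scaling to $[0, 1]$ followed by the affine map $2u - 1$ onto $[-1, 1]$; the function names and the sample thrust values are hypothetical.

```python
def fit_min_max(values):
    """Return the (min, max) of one signal component over the dataset."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant component cannot be min-max scaled")
    return lo, hi


def to_unit_interval(x, lo, hi):
    """Scale a raw value to [0, 1] (used for inputs and targets)."""
    return (x - lo) / (hi - lo)


def to_tanh_range(u):
    """Map a value in [0, 1] to [-1, 1], the output range of tanh."""
    return 2.0 * u - 1.0


def from_tanh_range(y, lo, hi):
    """Invert both steps: recover the physical value from a network output."""
    return lo + (y + 1.0) / 2.0 * (hi - lo)


thrust_samples = [0.0, 12.0, 48.0]  # hypothetical thrust targets in N
lo, hi = fit_min_max(thrust_samples)
targets = [to_tanh_range(to_unit_interval(t, lo, hi)) for t in thrust_samples]
```

At inference time, the inverse map `from_tanh_range` would be applied to the network output to recover a command in physical units.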

5.4. Neural Network-Based Control Architecture

As visible in Figure 7, the control system is designed around two distinct neural networks for the following reasons: the MPC controller provides the tilt angle $χ_{m p c}$ predicted for the following time step as an output, as visible in the definition of $X_{m p c}^{o u t}$ in (16). However, the current measured tilt angle $χ$ appears in the input $X^{i n}$ of (15). As the dynamics of the tilt angle are usually slow, the values of the tilt appearing in the output $X_{m p c}^{o u t}$ are typically just slightly different from the ones included in the input $X^{i n}$. Thus, the training of the neural networks cannot effectively learn the evolution in time of the tilt angle. For this reason, directly learning the mapping between the full input $X^{i n}$ and the full MPC output $X_{m p c}^{o u t}$ was not successful. Thus, we introduced a second neural network (tilt NN) that only learns the commanded tilt angle $χ_{m p c}$, where the measured tilt angle is excluded from the input $X^{i n_{t i l t}}$ of the dataset.

5.5. Neural Network Layout and Training

A standardized investigation is carried out to obtain the best NN layout with the smallest root mean squared error (RMSE) on the test set. RMSE is defined as follows:

R M S E = \sqrt{\frac{\sum_{i = 1}^{n} {(y_{p r e d, i} - y_{a c t, i})}^{2}}{n}} .

(45)
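Equation (45) translates directly into a few lines of code. The sketch below is a plain-Python illustration for scalar sequences; the function name and the input checks are assumptions of this example.

```python
import math


def rmse(y_pred, y_act):
    """Root mean squared error between predicted and actual values, Eq. (45)."""
    if len(y_pred) != len(y_act) or not y_pred:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(y_pred, y_act)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


example = rmse([3.0, 4.0], [0.0, 0.0])  # sqrt((9 + 16) / 2)
```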

5.5.1. Feedforward vs. Recurrent NN Layout

The goal of this research is to replace the expert MPC controller with a less computationally expensive NN-based controller. Thus, the inference time of the neural networks must be minimized.

In preliminary tests, we considered recurrent neural networks (RNNs), where the temporal evolution of the system is taken into account by computing the commanded output based on a series of inputs from several previous time steps and not only on the latest available input. A popular scheme of RNNs is the long short-term memory (LSTM) structure, and the literature presents many examples of its applications in the imitative learning approach for UAVs [50]. However, in these preliminary tests, the inference time of an RNN was about 70% more than the inference time of a corresponding FF network with the same number of neurons and layers. Furthermore, there was no substantial reduction in the velocity tracking error when preferring an LSTM architecture over an FF scheme.

Moreover, the literature also presents several successful applications of the imitative learning approach that exploit feedforward architectures [64,65]. FF neural networks do not consider the time evolution, but this deficiency can reasonably be compensated with a large and thorough enough dataset [66].

For these reasons, this paper focuses on a simple NN architecture, i.e., feedforward fully connected neural nets with the smallest possible number of layers and neurons, as described in Section 5.5.3.

5.5.2. Choice of Activation Functions

In this work, FF fully connected NNs are employed, where rectified linear unit (ReLU) activation functions are used for all the layers except the output layer, which uses a hyperbolic tangent (tanh) activation function. The ReLU activation function is less computationally expensive than the tanh and sigmoid functions, as its evaluation involves simpler algebraic operations. Also, we decide to deploy a tanh function in the output layer only, since it strongly distinguishes between outputs with a negative sign and outputs with a positive sign.
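The layer structure described above (ReLU on the inner layers, tanh on the output layer) can be illustrated with a small forward pass in plain Python. This is a sketch only: the helper names are hypothetical, the weights are random placeholders drawn with the Xavier (Glorot) uniform rule rather than trained values, and the layer sizes in the usage line simply reuse the 14 inputs, 4 outputs, and [128]-[128]-[128] layout mentioned elsewhere in the paper.

```python
import math
import random


def dense(W, b, x):
    """Affine layer: W is a list of rows, b a bias list, x the input list."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(layers, x):
    """Feedforward pass: ReLU on all inner layers, tanh on the output layer."""
    *hidden, (W_out, b_out) = layers
    for W, b in hidden:
        x = [max(0.0, z) for z in dense(W, b, x)]       # ReLU
    return [math.tanh(z) for z in dense(W_out, b_out, x)]  # outputs in [-1, 1]


def xavier_layers(sizes, seed=0):
    """Placeholder weights using Xavier uniform initialization and zero biases."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        limit = math.sqrt(6.0 / (n_in + n_out))
        W = [[rng.uniform(-limit, limit) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers


main_nn = xavier_layers([14, 128, 128, 128, 4])
y = forward(main_nn, [0.5] * 14)  # four scaled outputs, each in [-1, 1]
```

Because the output layer is a tanh, every component of `y` lies in $[-1, 1]$ and has to be mapped back to physical units before being used as a command.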

5.5.3. Training Implementations

In this work, the weights of the neural network are initialized using the Xavier method, whereas the Adam optimizer in its standard configuration is used with a minibatch size of 128 and $L_{2}$ regularization with $λ = 0.0001$. The two training datasets are randomly partitioned with the standard 80%–10%–10% division for the training, validation, and test sets. We employ the early stopping technique to prevent overfitting. It has been observed that the validation RMSE usually decreases steadily in the first few epochs and then reaches a plateau from which it does not appreciably increase. This suggests that the training is not overfitting the training dataset and is still in the underfitting regime, indicating that the performance of the neural network training can be further improved [66].
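The random 80%–10%–10% partition can be sketched as follows. The function name, the fixed seed, and the use of index lists are assumptions of this example; the paper does not describe how the partition was implemented.

```python
import random


def split_indices(n_samples, seed=0):
    """Randomly partition sample indices into 80% / 10% / 10% subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 8 // 10
    n_val = n_samples // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

Applied to a dataset of 8,458,000 pairs, the same integer arithmetic gives 6,766,400 training pairs and 845,800 pairs each for validation and testing.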

The training was performed in MATLAB® with a GPU for parallelized and faster training. Specifically, an NVIDIA GeForce GTX 1650 Ti with Max-Q design was employed. In the first set of training, the learning rate was kept constant with the goal of optimizing the NN layout. NN architectures with one, two, three, or four inner layers were considered, where, for each layer, 16, 32, 64, 128, 256, or 512 neurons were evaluated. At first, training with 5 epochs was carried out.

First Training Series

Table 4 presents the results of the first series training (5 epochs) for the main NN and the tilt NN. In particular, the first column of both tables indicates the architecture adopted for a specific training. The architectures with the smallest RMSE among the schemes with the same number of inner layers are highlighted in green.

For both the main NN and tilt NN shown in Figure 7, the RMSE on the test set decreases more significantly when increasing the number of inner layers, while an increase in the number of neurons does not necessarily imply a reduction in the approximation error. In addition, a subsequent decrease in the number of neurons in the inner layers usually delivers a reduced RMSE.

Second Training Series

Subsequently, a second series of training was carried out. The learning rate was optimized for the architectures with the best RMSE score among the ones with the same number of layers. In particular, a first iteration was conducted in which the learning rate was reduced by a factor of 2. Then, another iteration was performed in which the learning rate was further reduced by a factor of 5. Furthermore, in this second series of training, the number of epochs was increased from 5 to 15 to allow the training steps with reduced learning rates to reach the validation RMSE plateau. Table 5 shows the results of the described second series of training for the main NN and tilt NN, respectively.

In general, the smallest values of the RMSE on the test set are obtained with the four-inner-layer architectures, suggesting that the approximation error could be further reduced with an increase in the number of layers. However, the primary goal of this work is to implement a NN-based controller with reduced computational complexity compared to the MPC-based controller. Thus, the investigation is limited to a maximum of four inner layers, also because the literature presents multiple successful deployments of the imitative learning approach when adopting similar architectures in terms of number of layers and neurons [62,67].

In Section 7, the architectures with the smallest RMSE obtained via this second series of training are evaluated. We use simulations to compare the tracking properties of these neural networks with respect to the performance of the expert MPC controller.

Finally, the training time presented in Table 4 and Table 5 highly depends on the CPU usage, which is influenced by the other programs running on the laptop at the same time. This explains the fluctuations in the training time witnessed in this section.

5.6. NN-Based Controller Output Definition

The main NN is trained on a dataset with input–output pairs defined as $\{X^{i n_{m a i n}}, X_{m p c}^{o u t_{m a i n}}\}$, while the tilt NN is trained on the pairs $\{X^{i n_{t i l t}}, X_{m p c}^{o u t_{t i l t}}\}$, as described in Section 5.3. Once the two neural networks are trained, they are deployed in the control architecture of Figure 7 and evaluated at each time step, providing the NN outputs $X_{n n s}^{o u t_{m a i n}}$ and $X_{n n s}^{o u t_{t i l t}}$, defined as follows:

\begin{matrix} X_{n n s}^{o u t_{m a i n}} & = {[T_{n n s}, {(Υ_{IB, n n s})}^{⊤}]}^{⊤} \in R^{4}, \end{matrix}

(46)

\begin{matrix} X_{n n s}^{o u t_{t i l t}} & = [χ_{n n s}] \in R^{1} . \end{matrix}

(47)

These NN outputs are approximations of the commands of the expert MPC controller presented in (42) and in (44), respectively. Then, the overall NN-based controller features the additional control loops presented in the previous subsections, as also visible in Figure 7. Thus, the overall output of the NN controller is defined as follows:

X_{n n s}^{o u t} = {[T_{n n s, t o t}, χ_{n n s}, {(M_{n n s}^{B})}^{⊤}, {(q_{B, n n s}^{I})}^{⊤}]}^{⊤} \in R^{9} .

(48)

The control signals provided by the NN-based controller of (48) are the same as the ones generated by the MPC loop in (16).

6. Results

This section starts from the NN architectures in Section 5 with the smallest RMSE on the test set. They are deployed in the NN-based controller of Figure 7. Specifically, in Section 6.1, MATLAB® simulations are carried out to compare the overall NN-based controller’s tracking performance with respect to the expert MPC’s performance. Then, for the main NN and for the tilt NN, the NN architecture providing the lowest tracking error in the MATLAB® simulations is selected, respectively. These simulation results are presented in Section 6.7, focusing on the approximation errors introduced by the imitative learning approach with the neural networks. Finally, Section 6.8 describes the experimental results of the NN-based controller in detail.

6.1. Tracking Performance and Control Architecture

In Section 5.5.3, two series of training for both the main NN and the tilt NN are carried out. In total, four training runs are selected for each neural network. Specifically, 16 different MATLAB® simulations are performed, in which the selected main and tilt NN schemes are employed with every possible combination. Thus, architectures with a slightly larger RMSE are also considered and tested. In fact, a smaller RMSE on the test set does not guarantee more effective tracking of the reference velocity trajectory [66]. This is because the propagation of the approximation errors through the control architecture might outweigh a slight reduction in the RMSE of the neural networks.

Furthermore, since only feedforward neural networks are considered in this study, the networks do not explicitly learn the time-transitory behavior of the MPC controller, making the imitation imperfect. For this reason, the architectures providing the smallest approximation errors on the test set do not necessarily imply better tracking of a trajectory, as visible in the following Section.

6.2. Controller Execution Frequencies

Table 6 shows the parameters adopted in all the simulations and experiments presented in this Section. The MPC controller runs with a frequency of 20 Hz, while the NN-based controller is evaluated at 50 Hz. These values were determined during preliminary tests while running the two algorithms on the UAV companion computer. Section 6.8.2 provides more details regarding the computational cost of the two control algorithms, justifying the choice of the control frequencies adopted in this Section. Finally, the quaternion attitude controller and the control allocation run at 250 Hz to mimic the implementation on the real system.

6.3. Performance Indicator Definition

First of all, the tracking capabilities of the algorithms are evaluated via a performance indicator, defined as the Mean Absolute Velocity Tracking Error (MAVTE):

MAVTE = \frac{1}{T} \int_{0}^{T} {\| v_{r e f} - v \|}_{1} \, d t,

(49)

where the constant $T$ is the simulation length in seconds, $v$ is the measured velocity of the system, $v_{r e f}$ is the reference velocity, and ${\| \cdot \|}_{1}$ identifies the $L^{1}$ norm of a vector.
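On sampled simulation data, the integral in Equation (49) has to be approximated. The sketch below assumes uniformly spaced samples and a rectangle rule, in which case the MAVTE reduces to the mean of the per-sample $L^{1}$ errors; the function name and the sample values are hypothetical.

```python
def mavte(v_ref_samples, v_samples):
    """Discrete approximation of Eq. (49) for uniformly sampled velocities.

    Each argument is a list of velocity vectors (lists of equal length).
    With a constant time step dt and T = N * dt, the factor dt / T equals 1 / N.
    """
    if len(v_ref_samples) != len(v_samples) or not v_samples:
        raise ValueError("inputs must be non-empty and of equal length")
    l1_errors = [sum(abs(r - m) for r, m in zip(v_ref, v))
                 for v_ref, v in zip(v_ref_samples, v_samples)]
    return sum(l1_errors) / len(l1_errors)


v_ref_log = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]  # hypothetical reference, m/s
v_log = [[0.0, 0.0, 0.0], [2.0, 0.0, 1.0]]      # hypothetical measurement, m/s
score = mavte(v_ref_log, v_log)
```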

6.4. Test Trajectory Definition

The imitative learning algorithm is tested using a MATLAB® simulation on a velocity ramp trajectory. This type of trajectory is not included in the dataset created for NN training in Section 5.3. Indeed, the goal is to verify the generalization property of the trained neural networks, i.e., to check whether the trained NN-based controller can accommodate trajectories for which it has not been specifically trained. This velocity ramp trajectory commands a transition maneuver from the RW mode to the FW mode, and vice versa.

6.5. Performance Indicator of the MPC Controller on the Test Trajectory

Figure 9 shows the velocity tracking performance of the MPC controller. Similar to the data generation procedure, Gaussian noise is introduced in the feedback loop, and at each time step, a random noise is also applied to the actuator commands with a noise level equal to 30% of the actuator command [62]. The MPC controller delivers a value of the performance indicator equal to MAVTE $= 0.69$ m/s.

6.6. Performance Indicator of the NN-Based Controller on the Test Trajectory

Table 7 presents the simulated velocity tracking error values when the trained NN-based controller is tasked to track the velocity ramp trajectory.

It is noticeable that a few NN-based controller architectures deliver a velocity tracking error that is smaller than the one obtained with the MPC controller. This is mainly due to three factors:

The MPC controller adopts a simplified dynamics model in the optimization loop. Consequently, there is a mismatch between the predicted trajectories and the actual evolution of the system. Thus, a velocity tracking error follows.
The NN-based controller is evaluated with a frequency of 50 Hz, which is higher than the MPC controller frequency, i.e., 20 Hz. Thus, the higher frequency provides a more effective disturbance rejection, although the NN-based controller only approximates the teacher MPC controller.
The velocity controller of Section 5.2.3 that is introduced inside the NN-based controller (Figure 7) provides a corrective term for the commanded thrust $T_{c o r r}$ , which helps to reduce the velocity tracking error to some extent and helps to compensate for learning imperfections.

6.6.1. Discussion About the NN Architectures

Single-Layer NN Architectures

The NN-based controllers featuring a single layer with 512 neurons, labeled the [512] architecture, for the main NN (see left column in Table 7) provide the highest tracking error on the velocity ramp trajectory for all the tilt NN architectures. If all the [512] architectures are excluded for both the main NN and tilt NN, all the other NN-based controller schemes in Table 7 present values of the tracking error for the velocity ramp trajectory contained in the interval NN-MAVTE $\in [0.67, 0.92]$ m/s, compared to the value MPC-MAVTE $= 0.69$ m/s obtained with the MPC controller. As a result, the single-layer architecture with 512 neurons will be discarded.

Multiple-Layer NN Architectures

The NN-based controller with the smallest tracking error on the velocity ramp trajectory is the one with the [128]-[128]-[128] architecture for the main NN and [128]-[64]-[32]-[16] for the tilt NN with an NN-MAVTE = 0.67 m/s, as highlighted in red in Table 7. This value is even smaller than the one obtained with the MPC controller by 2.9%. On the other hand, the NN-based controller with the highest tracking error on the velocity ramp trajectory is the one with the [256]-[128] architecture for the main NN and [256]-[128] for the tilt NN. Its corresponding NN-MAVTE value is 0.92 m/s, which is 19% more than the MPC-MAVTE value.

Conclusions

The neural network architectures with only one inner layer do not provide satisfactory results, and they perform significantly worse than all the other architectures. They are thus not considered in the rest of this research.
In contrast, the NN-based controllers with two inner layers have a performance comparable to the MPC one, and in some cases, they display a velocity tracking error (MAVTE) smaller than the MPC one.
The neural network architectures with the smallest RMSE values on the test set do not necessarily deliver the best tracking performance. For instance, the combination of the [128]-[64]-[32]-[16] tilt NN and the [128]-[128]-[128] main NN, having an $R M S E = 0.03643$ , provides a smaller MAVTE on the velocity ramp trajectory compared to the [128]-[128]-[128]-[128] main NN (with a smaller $R M S E$ of $0.03272$ ).
Figure 10 and Figure 11 show the main NN and the tilt NN architectures delivering the smallest tracking error (MAVTE) on the velocity ramp trajectory. This is the architecture that will be used for the rest of this work.

6.6.2. Discussion of the Impact of Changes in the UAV’s Physical Dimensions on the NN-Based Controllers

The physical characteristics and dimensions of the aircraft are provided in Table 3. The architecture and learning rate of the neural network controllers (NNC) are not dependent on the size, mass, inertia, aerodynamics, etc., of the UAV, but the weights of the NNCs are. Changing the physical properties of the UAV modifies the aircraft’s model, which is used in the MPC prediction over the optimization horizon. Thus, this modifies the MPC controller’s outputs for the same desired trajectory. Since the NNC learns from the input–output pair of the MPC controller, the weights of the NNC will also be modified. Therefore, if the physical properties of the aircraft change, the model used in the MPC controller must be adapted accordingly to generate new MPC input/output pair signals, which will serve to retrain the NNC. In this way, the NNC will produce control signals allowing the UAV to track the desired state trajectories despite the change in the physical properties of the aircraft.

6.7. Simulation Results

6.7.1. Context and Expected Results

The objective is to first verify the ability of the trained NN-based flight controller to steer the hybrid VTOL UAV shown in Figure 1 through simulations, and to perform transitions from the RW mode to the FW mode and back. For this, a trajectory is designed with a constant acceleration up to 15 m/s (identified in cyan in Figure 12 and Figure 13), followed by a deceleration region (highlighted in yellow). During the acceleration phase, two specific behaviors are expected:

As the airspeed increases, the lift forces generated by the wings also increase. Thus, the controller commands a smooth forward rotation of the tilt mechanisms to track the reference velocity by generating a component of the thrust in the longitudinal direction. In the meantime, the pitch angle stabilizes around 0 deg, guaranteeing an appropriate angle of attack (AoA) for the wings.
The thrust produced by the propellers decreases over time, as the wings are already supplying the required lift force, and the thrust is (almost) only required to accelerate in the longitudinal direction.

On the other hand, during the deceleration phase, the following flight characteristics are expected:

The vehicle pitches up to augment the aircraft surface exposed to the air flow, thus increasing the drag forces acting on the aircraft.
The tilt angle decreases, reaching a negative value, and the thrust goes to zero to interrupt the forward motion.

6.7.2. Comparison Between MPC and NN-Based Controller

Figure 12 shows the velocity tracking properties of the NN-based and MPC controllers over the same velocity ramp trajectory in the MATLAB® simulations. In the north direction, the NN-based controller delivers a slightly smaller error than the MPC. In the east direction, the MPC response oscillates, while the NN-based controller delivers a smaller error due to the higher controller frequency. Finally, the MPC controller provides a smaller tracking error in the vertical direction.

Figure 13 shows the measured pitch angle $θ$, the commanded tilt angle $χ_{c m d}$, and the commanded thrust $T_{c m d}$ for both the MPC and the NN-based controllers. The UAV pitches up in the deceleration phase in both simulations. The propeller tilt angle $χ$ reaches a negative value, and the thrust magnitude goes to zero.

The overall shape of the NN-based controller signals is comparable to the ones of the MPC. There are minor differences that are most likely due to two main factors:

The velocity ramp trajectory type was not included in the data generation procedure for NN training.
Contrary to recurrent neural networks, the feedforward neural networks employed in this work do not capture the transitory time response of the MPC during the training process. The trained neural networks only approximate the MPC’s optimal response, especially during transient phases.

6.7.3. Constraint Handling

One major advantage of the MPC framework is the possibility of including constraints on the states and control inputs of the system. The goal here is to verify whether this property is preserved by the NN-based controller trained on the MPC’s input–output signals. Figure 13 shows that the constraint on the tilt angle, $\chi_{nns} \in [-\frac{\pi}{18}, \frac{\pi}{2}]$ rad, is respected, as is the constraint on the commanded thrust, $T_{nns} \in [0, 48]$ N. The thrust vector attitude controller presented in Section 5.2.2 has also been tuned such that the commanded torque, namely $\| M_{nns}^{B} \|$, remains within the range $[-2, +2]$ N m, a constraint that is also well respected. Compliance with all these constraints by the NN-based controller has been extensively verified using a variety of trajectories and flight conditions.

The constraint on the tilt angle rate, $\dot{\chi}_{nns} \in [-\frac{\pi}{2}, +\frac{\pi}{2}]$ rad s$^{-1}$, is also respected in the velocity ramp MATLAB® simulations presented in Figure 12 and Figure 13. However, depending on the disturbance acting on the system, this constraint might be violated. This is because the neural networks of the NN-based flight controller are not explicitly trained to handle the tilt angle rate; i.e., $\dot{\chi}_{nns}$ does not appear among the outputs of either the main NN or the tilt NN. In contrast, the MPC controller explicitly considers the constraint on the tilt angle rate in its optimization loop. For this reason, in the actual implementation of the NN-based controller, a saturation function on the tilt angle rate is inserted.
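The saturation on the tilt angle rate can be illustrated with a minimal sketch. It is not the authors' C++ implementation: the function and constant names are hypothetical, and the order of operations (rate limit first, then magnitude limit) is an assumption. Only the numerical bounds and the 50 Hz controller period are taken from the text.

```python
import math

# Bounds quoted in the text; all names below are illustrative assumptions.
TILT_MIN = -math.pi / 18      # rad, lower tilt angle bound
TILT_MAX = math.pi / 2        # rad, upper tilt angle bound
TILT_RATE_MAX = math.pi / 2   # rad/s, bound on the tilt angle rate
DT = 0.02                     # s, period of the 50 Hz NN-based controller


def saturate_tilt(chi_nn, chi_prev, dt=DT):
    """Limit the NN tilt command in rate first, then in magnitude.

    chi_nn   -- raw tilt angle requested by the tilt NN (rad)
    chi_prev -- tilt command actually sent at the previous step (rad)
    """
    max_step = TILT_RATE_MAX * dt
    # Clamp the requested change so that |chi_cmd - chi_prev| / dt <= TILT_RATE_MAX.
    step = min(max(chi_nn - chi_prev, -max_step), max_step)
    chi_cmd = chi_prev + step
    # Keep the command inside the admissible tilt angle interval.
    return min(max(chi_cmd, TILT_MIN), TILT_MAX)
```

With these values, a sudden request to go from 0 rad to $\frac{\pi}{2}$ rad is spread over successive steps of at most $\frac{\pi}{2} \cdot 0.02 \approx 0.031$ rad each.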

6.8. Experimental Results

6.8.1. Real Flight Test Setup and Trajectories

The NN-based controller was tested in real flights. Its C++ implementation ran on an Intel UpBoard, which received the input signal $X^{in}$ from a Pixhawk autopilot through a serial link. The computed NN-based controller output signal $X_{cmd}^{out}$ was then sent from the Intel UpBoard back to the Pixhawk autopilot, where the quaternion-attitude controller and the control allocation were running. As described in Section 6.7, the NN-based controller ran at 50 Hz.

The real flight trajectory consists of two consecutive velocity ramp trajectories. The flight logs collected during the real flight mission are shown in Figure 14, Figure 15 and Figure 16. As indicated above, the NN-based controller was trained to handle the reference signals for the longitudinal axis, and in particular to optimally track $v_{n,ref}$ and $v_{d,ref}$. A video showing the offboard recording of the flight described in this section can be viewed at https://youtube.com/shorts/CqcCl8aeHc8 (accessed on 17 November 2024).

Acceleration Phase

For safety reasons, a maximum acceleration of $\pm 1$ m/s$^{2}$ and a maximum speed of 10 m/s are allowed. During the forward acceleration phase, the real flight plots in Figure 15 show that:

the vehicle first pitches downward using differential propeller thrust in order to accelerate forward and to gain positive north airspeed in the time range $t \in [0, 7]$ s,
then, the propellers tilt forward, while at the same time, the fuselage pitches up and levels using the tail elevator control surface,
the propeller tilt angle increases (i.e., becomes more horizontal) and the thrust decreases as the lift forces are increasingly generated by the two wings. A positive angle of attack allows some wing lift force to be created in order to compensate for the vehicle’s weight, while the total thrust compensates for the drag force in a level cruise flight.

As shown in the top plot of Figure 14, the achieved forward velocity $v_{n}$ follows the reference signal $v_{n,ref}$.

Altitude Velocity Tracking

As shown in the bottom plot of Figure 14, the NN-based controller is able to track the vertical velocity reference $v_{d,ref}^{I}$ with an average velocity error of less than 1 m/s. The NN controller increases the UAV altitude when a negative velocity step is commanded by the user in the time range t = 15–20 s (in the north-east-down frame, a negative down velocity corresponds to a climb).

Deceleration

In the deceleration phases, the pitch angle increases to augment the drag forces. At the same time, the tilt angle goes back to an RW configuration while the propellers spin up again to provide sufficient thrust to guarantee velocity tracking.

Lateral Path Tracking

A deviation in the west direction is visible in the middle plot of Figure 14, where the achieved velocity (solid line) in the east axis is mostly negative. This is probably caused by a sideways wind disturbance, which the NN controller is unable to counteract effectively. This is not surprising, as lateral velocity tracking was not included in the data generation procedure used for training, as anticipated in Section 5.3 and shown in Figure 8.

6.8.2. Computational Cost of MPC vs. NN-Based Controller

The main contribution of this paper is the replacement of the “teacher” MPC controller, which is computationally intensive, with a “student” NN-based controller trained to mimic it. This has the potential to significantly reduce the computational load, which can be exploited to:

increase the frequency of the control algorithm and thus improve the disturbance-rejection property of the controller,
use a less powerful, less energy-consuming and less expensive companion computer while keeping the controller’s frequency unchanged.

Recorded Performance in Real Tests

Table 8 compares the average computation time of the MPC loop vs. the NN-based control loop during two analogous real flight tests. On average, the inference time of the NN-based controller, which includes the main NN, tilt NN, thrust-correction block, conversion block, and thrust vector attitude controller, is about 25% of the time required to run the corresponding MPC loop. Thus, a reduction of 75% of the computational time is achieved. Therefore, in the MATLAB® simulations and the real tests presented in this paper, a higher frequency for the NN-based controller is adopted compared to the one for the MPC algorithm. Indeed, for the MPC, the sampling time is set to 50 ms (i.e., a frequency of 20 Hz), while for the NN-based controller, a sampling time of 20 ms (i.e., a frequency of 50 Hz) is considered. This sample time includes a buffer period to account for variations in the computational time and delays in the communication protocol.

Optimization Aspects

The MATLAB® Coder toolbox is used to automatically generate a static library containing the evaluation of the trained neural networks. However, the code generated with this conversion procedure is not optimized, and the inference time of the neural networks could be reduced further. The MATLAB® documentation suggests the deployment of specialized libraries for the inference of the trained neural networks in a C++ environment, e.g., the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN). The use of these optimized libraries would strongly decrease the inference time of deep networks. However, the CPU installed on the Intel UpBoard considered in this paper does not feature an architecture that meets the requirements of the MKL-DNN. Other optimization libraries for MATLAB-trained NNs are available for other common embedded platforms such as the NVIDIA Jetson or the Raspberry Pi 4. These will be investigated in our future research work.

In this paper, the goal is not to optimize the neural network inference but to demonstrate the feasibility of training neural networks to generate high-level commands to steer a hybrid-VTOL aircraft for a complete longitudinal flight mission, including take off, hover, transition from the RW to FW mode, transition back from the FW to RW mode, hover, and landing. This proof of concept is highlighted in the results of Table 8 and Figure 14, Figure 15 and Figure 16, which show the effectiveness of the imitative learning approach for tilt-rotor VTOL UAVs, in particular in the longitudinal axis.

7. Conclusions

In this paper, using an imitative learning approach, a student NN-based controller is trained with the input–output pairs generated by a teacher MPC controller. This paper reports new, successful real-life experiments of a tilt-rotor VTOL UAV controlled by an NN-based algorithm, which completely replaces the MPC controller that was used to train it. The NN-based controller is able to control both the vertical and longitudinal axes during hover and transitions between RW and FW modes with satisfactory performance, despite the presence of wind disturbances. It features two separate NNs to minimize the number of MPC inputs/outputs that each neural network must learn and to avoid couplings between the input–output pairs. A thrust correction block has been added to mitigate the learning imperfections of the NN controller. The NN architectures considered included one, two, three, or four inner layers, where, for each layer, $\{16, 32, 64, 128, 256, 512\}$ neurons were considered.

The architecture of each neural network has been selected to provide the best compromise between NN complexity and achieved performance. Two metrics were used to assess such performance, namely

the root mean square error (RMSE) between the NN-based controller outputs and the MPC outputs, which measures how well the NN controller mimics the MPC behavior,
and the mean absolute velocity tracking error (MAVTE), which assesses the ability of either the NN-based or the MPC-based controller to track the reference velocity during an acceleration–deceleration sequence.
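The two metrics can be sketched as follows. The function names and data layout are hypothetical, and averaging the MAVTE over all three velocity axes is an assumption made for this sketch; the exact definition used for the reported values may differ.

```python
import math


def rmse(nn_outputs, mpc_outputs):
    """Root mean square error between paired NN and MPC output samples."""
    if len(nn_outputs) != len(mpc_outputs) or not nn_outputs:
        raise ValueError("expected two non-empty sequences of equal length")
    squared = [(a - b) ** 2 for a, b in zip(nn_outputs, mpc_outputs)]
    return math.sqrt(sum(squared) / len(squared))


def mavte(v_achieved, v_reference):
    """Mean absolute velocity tracking error in m/s.

    Each argument is a sequence of (north, east, down) velocity tuples.
    The error is averaged over all samples and all three axes (assumption).
    """
    if len(v_achieved) != len(v_reference) or not v_achieved:
        raise ValueError("expected two non-empty sequences of equal length")
    errors = [
        abs(a - r)
        for sample, ref in zip(v_achieved, v_reference)
        for a, r in zip(sample, ref)
    ]
    return sum(errors) / len(errors)
```

The RMSE compares the student with the teacher, so it can be evaluated offline on recorded input–output pairs, whereas the MAVTE requires closing the loop, in simulation or in flight, with the controller under test.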

For this hybrid tilt-rotor VTOL UAV, the best combination in terms of architecture for the main NN and tilt NN among all the architectures tried in simulations is the one that gives the best MAVTE value (0.67 m/s in that case) for velocity ramp tracking. This value is 2.9% smaller than the one obtained with MPC for the same desired velocity reference trajectory. This optimal combination corresponds to the following two neural networks:

Main NN for UAV’s attitude control: feedforward fully connected network, 14 inputs, 3 hidden layers with 128 neurons per layer and ReLU activation function, 4 outputs with a hyperbolic tangent (tanh) activation function. Learning rate = 0.0001. Achieved RMSE = 0.03643 in a range of [0.03272–0.05434]. Learning time needed: 90 min.
Tilt NN for propeller tilt angle control: feedforward fully connected network, 7 inputs, 4 hidden layers with [128,64,32,16] neurons per layer and ReLU activation function, 1 output with a hyperbolic tangent (tanh) activation function. Learning rate = 0.001. Achieved RMSE = 0.05565 in a range of [0.05565–0.10455]. Learning time needed: 82 min.
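To make the two architectures concrete, the sketch below builds networks with the layer sizes listed above and evaluates a forward pass in plain Python. The weights are random placeholders rather than the trained values, the helper names are hypothetical, and one bias per neuron is assumed. The original networks were trained in MATLAB®, so this only illustrates the structure.

```python
import math
import random

# Layer sizes taken from the text: inputs, hidden layers, outputs.
MAIN_NN_SIZES = [14, 128, 128, 128, 4]
TILT_NN_SIZES = [7, 128, 64, 32, 16, 1]


def init_layers(sizes, seed=0):
    """Random (untrained) weights and zero biases for a fully connected net."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Forward pass: ReLU on hidden layers, tanh on the output layer."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = [math.tanh(v) for v in z] if is_output else [max(0.0, v) for v in z]
    return x


def count_parameters(sizes):
    """Number of weights plus biases, assuming one bias per neuron."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(sizes[:-1], sizes[1:]))
```

Under the one-bias-per-neuron assumption, `count_parameters` gives 35,460 parameters for the main NN and 11,905 for the tilt NN. Because a tanh output lies in $[-1, 1]$, the network outputs presumably need to be rescaled to physical units before use; that step is not shown here.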

The above combination runs at 50 Hz, i.e., 2.5 times faster than the MPC controller it replaces. In addition, this NN controller has the following properties:

It approximates the MPC’s solution while reducing the computational cost by 75%.
Its velocity tracking performance is similar to that of the MPC in terms of MAVTE. Due to a higher execution frequency, it has the potential for more effective disturbance rejection.
Simulations and real flight tests show that it respects the state and control input constraints of the MPC formulation.

8. Outlook

Our future work will include the following:

The usage of recurrent neural networks (e.g., LSTM networks) in an attempt to teach the NN-based controller to generate a response closer to that of the MPC teacher.
Improving the lateral-control performance; this will be done by expanding the training dataset with simulated lateral motion trajectories under MPC control.
Employing a different embedded computer that supports libraries able to speed up the NN inference.
Combining the current NN-based control architecture with our online learning approaches that are specifically designed for disturbance rejection [68,69].

Author Contributions

Project definition, G.D.; conceptualization, G.D. and G.C.; methodology, G.C. and G.D.; software, G.C. and G.D.; validation, G.C. and G.D.; formal analysis, G.C. and G.D.; investigation, G.C. and G.D.; resources, G.C. and G.D.; data curation, G.C. and G.D.; writing—original draft preparation, G.C. and G.D.; writing—review and editing, G.D.; visualization, G.C. and G.D.; supervision, G.D.; project administration, G.D.; funding acquisition, G.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

A video of the practical experiments can be found at https://youtu.be/lbskT6JhBVo (accessed on 17 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CoM	Center of Mass
FCU	Flight Control Unit
FF	Feedforward
FFNN	Feedforward Neural Network
FW	Fixed Wing
LSTM	Long Short-Term Memory
MAVTE	Mean Absolute Velocity Tracking Error
MKL-DNN	Math Kernel Library for Deep Neural Networks
MPC	Model Predictive Control
NED	North, East, Down
NN	Neural Network
RC	Remote Controller
RMSE	Root Mean Square Error
RNN	Recurrent Neural Network
RW	Rotary Wing
UAV	Unmanned Aerial Vehicle
VTOL	Vertical Take Off and Landing

References

Allenspach, M.; Ducard, G.J.J. Nonlinear model predictive control and guidance for a propeller-tilting hybrid unmanned air vehicle. Automatica 2021, 132, 109790.
Ducard, G.; Allenspach, M. Review of designs and flight control techniques of hybrid and convertible VTOL UAVs. Aerosp. Sci. Technol. 2021, 118, 107035.
Yu, L.; Zhang, D.; Zhang, J. Transition flight modeling and control of a novel tilt tri-rotor UAV. In Proceedings of the 2017 IEEE International Conference on Information and Automation (ICIA), Macao, China, 18–20 July 2017; pp. 983–988.
Flores, G.; Escareno, J.; Lozano, R.; Salazar, S. Quad-Tilting Rotor Convertible MAV: Modeling and Real-time Hover Flight Control. J. Intell. Robot. Syst. 2012, 65, 457–471.
Cetinsoy, E. Design and control of a gas-electric hybrid quad tilt-rotor UAV with morphing wing. In Proceedings of the 2015 International Conference on Unmanned Aircraft Systems (ICUAS), Denver, CO, USA, 9–12 June 2015; pp. 82–91.
Shen, D.; Lu, Q.; Hu, M.; Kong, Z. Mathematical Modeling and Control of the Quad Tilt-Rotor UAV. In Proceedings of the 2018 IEEE 8th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Tianjin, China, 19–23 July 2018; pp. 1220–1225.
Nakamura, Y.; Arakawa, A.; Watanabe, K.; Nagai, I. Transitional Flight Simulations for a Tilted Quadrotor with a Fixed-wing. In Proceedings of the 2018 IEEE International Conference on Mechatronics and Automation (ICMA), Changchun, China, 5–8 August 2018; pp. 1829–1836.
Yanguo, S.; Huanjin, W. Design of Flight Control System for a Small Unmanned Tilt Rotor Aircraft. Chin. J. Aeronaut. 2009, 22, 250–256.
Peng, C.-C.; Hwang, T.-S.; Chen, S.-W.; Chang, C.-Y.; Lin, Y.-C.; Wu, Y.-T.; Lin, Y.-H.J.; Lai, W.R. ZPETC Path-Tracking gain-scheduling design and real-time multi-task flight simulation for the automatic transition of tilt-rotor aircraft. In Proceedings of the 2010 IEEE Conference on Robotics, Automation and Mechatronics, Singapore, 28–30 June 2010; pp. 118–123.
Park, S.; Bae, J.; Kim, Y.; Kim, S. Fault tolerant flight control system for the tilt-rotor UAV. J. Frankl. Inst. 2013, 350, 2535–2559.
Hernández-García, R.G.; Rodríguez-Cortés, H. Transition flight control of a cyclic tiltrotor UAV based on the Gain-Scheduling strategy. In Proceedings of the 2015 International Conference on Unmanned Aircraft Systems (ICUAS), Denver, CO, USA, 9–12 June 2015; pp. 951–956.
Sun, Z.; Wang, R.; Zhou, W. Finite-time stabilization control for the flight mode transition of tiltrotors based on switching method. In Proceedings of the 2017 29th Chinese Control And Decision Conference (CCDC), Chongqing, China, 28–30 May 2017; pp. 2049–2053.
Dai, C.; Bai, H.; Zeng, J. Nonlinear stabilization control of tilt rotor UAV during transition flight based on HOSVD. In Proceedings of the 2016 IEEE Chinese Guidance, Navigation and Control Conference (CGNCC), Nanjing, China, 12–14 August 2016; pp. 154–159.
Cakici, F. Modeling, Stability Analysis and Control System Design of a Small-Sized Tiltrotor UAV. Master’s Thesis, Middle East Technical University, Ankara, Türkiye, 2009.
Cakici, F.; Leblebicioglu, K. Modeling and simulation of a small-sized Tiltrotor UAV. J. Def. Model. Simul. Appl. Methodol. Technol. 2012, 9, 335–345.
Lin, H.; Fu, R.; Zeng, J. Extended state observer based sliding mode control for a tilt rotor UAV. In Proceedings of the 2017 36th Chinese Control Conference (CCC), Dalian, China, 26–28 July 2017; pp. 3771–3775.
Zhu, X.P.; Fan, Y.H.; Yang, J. Design of Tiltrotor Flight Control System Using Fuzzy Sliding Mode Control. In Proceedings of the 2010 International Conference on Measuring Technology and Mechatronics Automation, Dalian, China, 26–28 July 2017; Volume 1, pp. 1060–1063.
Verling, S.; Zilly, J. Modeling and Control of a VTOL Glider. Master’s Thesis, Swiss Federal Institute of Technology Zurich, Zürich, Switzerland, 2013.
Bauersfeld, L.; Ducard, G. Fused-PID Control for Tilt-Rotor VTOL Aircraft. In Proceedings of the 2020 28th Mediterranean Conference on Control and Automation (MED), Saint Raphaël, France, 15–18 September 2020; pp. 703–708.
Liu, Z.; Theilliol, D.; Yang, L.; He, Y.; Han, J. Transition control of tilt rotor unmanned aerial vehicle based on multi-model adaptive method. In Proceedings of the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), Miami, FL, USA, 13–16 June 2017; pp. 560–566.
Yangping, D.; Honggang, G. Transition Flight Control and Test of a New Kind Tilt Prop Box-Wing VTOL UAV. In Proceedings of the 2018 9th International Conference on Mechanical and Aerospace Engineering (ICMAE), Budapest, Hungary, 10–13 July 2018; p. 90.
Sun, J.; Yang, J.; Zhu, X. Robust Flight Control Law Development for Tiltrotor Conversion. In Proceedings of the 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, Hangzhou, China, 26–27 August 2009; Volume 2, pp. 481–484.
Apkarian, J. Pitch-decoupled VTOL/FW aircraft: First flights. In Proceedings of the 2017 Workshop on Research, Education and Development of Unmanned Aerial Systems (RED-UAS), Linköping, Sweden, 3–5 October 2017; pp. 258–263.
Hegde, N.T.; George, V.I.; Nayak, C.G. Modelling and Transition flight control of Vertical Take-Off and Landing unmanned Tri-Tilting Rotor Aerial Vehicle. In Proceedings of the 2019 3rd International conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 12–14 June 2019; pp. 590–594.
Chiappinelli, R.; Cohen, M.; Doff-Sotta, M.; Nahon, M.; Forbes, J.R.; Apkarian, J. Modeling and Control of a Passively-Coupled Tilt-Rotor Vertical Takeoff and Landing Aircraft. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4141–4147.
Kang, Y.; Park, B.; Yoo, C.; Kim, Y.; Koo, S. Flight test results of automatic tilt control for small scaled tilt rotor aircraft. In Proceedings of the 2008 International Conference on Control, Automation and Systems, Seoul, Republic of Korea, 14–17 October 2008; pp. 47–51.
Lee, J.-H.; Min, B.-M.; Kim, E.-T. Autopilot design of tilt-rotor UAV using particle swarm optimization method. In Proceedings of the 2007 International Conference on Control, Automation and Systems, Seoul, Republic of Korea, 17–20 October 2007; pp. 1629–1633.
Cardoso, D.N.; Esteban, S.; Raffo, G.V. A Nonlinear W_∞ Controller of a Tilt-rotor UAV for trajectory tracking. In Proceedings of the 2019 18th European Control Conference (ECC), Naples, Italy, 25–28 June 2019; pp. 928–934.
Liu, Z.; Theilliol, D.; Yang, L.; He, Y.; Han, J. Observer-based linear parameter varying control design with unmeasurable varying parameters under sensor faults for quad-tilt rotor unmanned aerial vehicle. Aerosp. Sci. Technol. 2019, 92, 696–713.
Zhao, W.; Underwood, C. Robust transition control of a Martian coaxial tiltrotor aerobot. Acta Astronaut. 2014, 99, 111–129.
Yin, Y.; Niu, H.; Liu, X. Adaptive Neural Network Sliding Mode Control for Quad Tilt Rotor Aircraft. Complexity 2017, 2017, 7104708.
Ta, D.A.; Fantoni, I.; Lozano, R. Modeling and control of a tilt tri-rotor airplane. In Proceedings of the 2012 American Control Conference (ACC), Montreal, QC, Canada, 27–29 June 2012; pp. 131–136.
Flores, G.; Lozano, R. Transition flight control of the quad-tilting rotor convertible MAV. In Proceedings of the 2013 International Conference on Unmanned Aircraft Systems (ICUAS), Atlanta, GA, USA, 28–31 May 2013; pp. 789–794.
Kim, B.M.; Kim, B.; Kim, N. Trajectory tracking controller design using neural networks for a tiltrotor unmanned aerial vehicle. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2010, 224, 881–896.
Lin, Q.; Cai, Z.; Yang, J.; Sang, Y.; Wang, Y. Trajectory tracking control for hovering and acceleration maneuver of Quad Tilt Rotor UAV. In Proceedings of the 33rd Chinese Control Conference, Nanjing, China, 28–30 July 2014; pp. 2052–2057.
Yu, C.; Zhu, J.; Sun, Z. Nonlinear adaptive internal model control using neural networks for tilt rotor aircraft platform. In Proceedings of the 2005 IEEE Midnight-Summer Workshop on Soft Computing in Industrial Applications, 2005. SMCia/05, Espoo, Finland, 28–30 June 2005; pp. 12–16.
Yatsun, A.; Lushnikov, B.; Emelyanova, O. Motion Control Automation in the Quadcopter Convertiplane in a Transient Mode. In Proceedings of the 2018 International Russian Automation Conference (RusAutoCon), Sochi, Russia, 9–16 September 2018; pp. 1–6.
Francesco, G.; D’Amato, E.; Mattei, M. INDI Control with Direct Lift for a Tilt Rotor UAV. IFAC-PapersOnLine 2015, 48, 156–161.
Fang, X.; Lin, Q.; Wang, Y.; Zheng, L. Control strategy design for the transitional mode of tiltrotor UAV. In Proceedings of the IEEE 10th International Conference on Industrial Informatics, Beijing, China, 25–27 July 2012; pp. 248–253.
Rysdyk, R.T.; Calise, A.J. Adaptive Model Inversion Flight Control for Tilt-Rotor Aircraft. J. Guid. Control. Dyn. 1999, 22, 402–407.
Rysdyk, R.T.; Calise, A.J. Adaptive nonlinear control for tiltrotor aircraft. In Proceedings of the 1998 IEEE International Conference on Control Applications (Cat. No.98CH36104), Trieste, Italy, 4 September 1998; Volume 2, pp. 980–984.
D’Amato, E.; Francesco, G.; Notaro, I.; Tartaglione, G.; Mattei, M. Nonlinear Dynamic Inversion and Neural Networks for a Tilt Tri-Rotor UAV. IFAC-PapersOnLine 2015, 48, 162–167.
Kang, Y.; Kim, N.; Kim, B.; Tahk, M.J. Autopilot design for tilt-rotor unmanned aerial vehicle with nacelle mounted wing extension using single hidden layer perceptron neural network. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2016, 231.
Schlatter, M.; Ducard, G.; Rohr, D.; Onder, C. Longitudinal Control of a Tilt-rotor VTOL UAV using Incremental Nonlinear Dynamic Inversion. In Proceedings of the 2024 International Conference on Control, Automation and Diagnosis (ICCAD), Paris, France, 15–17 May 2024; pp. 1–6.
Pan, Z.; Chi, C.; Zhang, J. Nonlinear Attitude Control of Tiltrotor Aircraft in Helicopter Mode based on ADRSM Theory. In Proceedings of the 2018 37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 9962–9967.
Yu, L.; He, G.; Zhao, S.; Wang, X. Dynamic Inversion-Based Sliding Mode Control of a Tilt Tri-Rotor UAV. In Proceedings of the 2019 12th Asian Control Conference (ASCC), Kitakyushu, Japan, 9–12 June 2019; pp. 1637–1642.
Flores, G.; Lugo, I.; Lozano, R. 6-DOF hovering controller design of the Quad Tiltrotor aircraft: Simulations and experiments. In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, USA, 15–17 December 2014; pp. 6123–6128.
Flores-Colunga, G.R.; Lozano-Leal, R. A Nonlinear Control Law for Hover to Level Flight for the Quad Tilt-rotor UAV. IFAC Proc. Vol. 2014, 47, 11055–11059.
Bauersfeld, L.; Spannagl, L.; Ducard, G.; Onder, C. MPC Flight Control for a Tilt-rotor VTOL Aircraft. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 2395–2409.
Zhou, J.; Xu, H.; Li, Z.; Shen, S.; Zhang, F. Control of a Tail-Sitter VTOL UAV Based on Recurrent Neural Networks. arXiv 2021, arXiv:2104.02108.
Dubach, M.; Ducard, G.J. A Comparison of Verification Methods for Neural-Network Controllers Using Mixed-Integer Programs. In Proceedings of the 2022 7th International Conference on Robotics and Automation Engineering (ICRAE), Singapore, 18–20 November 2022; pp. 43–48.
Ducard, G.; Hua, M.D. Modeling of an unmanned hybrid aerial vehicle. In Proceedings of the 2014 IEEE Conference on Control Applications (CCA), Antibes, France, 22–25 April 2024; pp. 1011–1016.
Vukov, M.; Domahidi, A.; Ferreau, H.J.; Morari, M.; Diehl, M. Auto-generated algorithms for nonlinear model predictive control on long and on short horizons. In Proceedings of the 52nd IEEE Conference on Decision and Control, Firenze, Italy, 10–13 December 2013; pp. 5113–5118.
Houska, B.; Ferreau, H.J.; Diehl, M. An auto-generated real-time iteration algorithm for nonlinear MPC in the microsecond range. Automatica 2011, 47, 2279–2285.
Brescianini, D.; D’Andrea, R. Tilt-prioritized quadrocopter attitude control. IEEE Trans. Control. Syst. Technol. 2018, 28, 376–387.
Spannagl, L.; Ducard, G. Control Allocation for an Unmanned Hybrid Aerial Vehicle. In Proceedings of the 2020 28th Mediterranean Conference on Control and Automation (MED), Saint-Raphael, France, 15–18 September 2020; pp. 709–714.
Ducard, G.; Hua, M.D. WCA: A New Efficient Nonlinear Adaptive Control Allocation for Planar Hexacopters. IEEE Access 2023, 11, 37714–37748.
Bagnell, J.A. An Invitation to Imitation; Technical Report; Carnegie-Mellon University, Robotics Institute: Pittsburgh, PA, USA, 2015.
Omari, S.; Hua, M.D.; Ducard, G.; Hamel, T. Hardware and software architecture for nonlinear control of multirotor helicopters. IEEE/ASME Trans. Mechatronics 2013, 18, 1724–1736.
Rudin, K.; Hua, M.D.; Ducard, G.; Bouabdallah, S. A robust attitude controller and its application to quadrotor helicopters. IFAC Proc. Vol. 2011, 44, 10379–10384.
Tayebi, A.; McGilvray, S. Attitude stabilization of a VTOL quadrotor aircraft. IEEE Trans. Control. Syst. Technol. 2006, 14, 562–571.
Kaufmann, E.; Loquercio, A.; Ranftl, R.; Müller, M.; Koltun, V.; Scaramuzza, D. Deep drone acrobatics. arXiv 2020, arXiv:2006.05768.
Luo, J.; Ying, K.; Bai, J. Savitzky–Golay smoothing and differentiation filter for even number data. Signal Process. 2005, 85, 1429–1434.
Li, S.; Öztürk, E.; De Wagter, C.; De Croon, G.C.; Izzo, D. Aggressive online control of a quadrotor via deep network representations of optimality principles. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 6282–6287.
Sarabakha, A.; Kayacan, E. Online deep learning for improved trajectory tracking of unmanned aerial vehicles using expert knowledge. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 7727–7733.
Tailor, D.; Izzo, D. Learning the optimal state-feedback via supervised imitation learning. Astrodynamics 2019, 3, 361–374.
Zhang, T.; Kahn, G.; Levine, S.; Abbeel, P. Learning deep control policies for autonomous aerial vehicles with mpc-guided policy search. In Proceedings of the 2016 IEEE international conference on robotics and automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 528–535.
Carughi, G.; Ducard, G.; Onder, C. Online Neural-Network Learning and Model Predictive Control Applied to a Tilt-Rotor Unmanned Aerial Vehicle. In Proceedings of the 2022 IEEE 17th International Conference on Control & Automation (ICCA), Naples, Italy, 27–30 June 2022; pp. 31–37.
Dezons, L.; Ducard, G. Online Learning for Improved Attitude Control of a Tilt-rotor Hybrid VTOL UAV. In Proceedings of the 2023 International Conference on Control, Automation and Diagnosis (ICCAD), Rome, Italy, 10–12 May 2023; pp. 1–6.

Figure 1. The hybrid tilt-rotor VTOL UAV considered in this paper [1]. Left: the vehicle in helicopter or “rotary-wing” (RW) mode. Right: the vehicle in “fixed-wing” (FW) mode.

Figure 2. From left to right: successive phases of a tilt-rotor VTOL UAV transitioning from hover mode to cruise mode, while ideally keeping its altitude constant [2].

Figure 3. Scheduling policies where the scheduling variable is the velocity $v(t)$ as an example. (a) Divide and conquer; (b) control authority weighting.

Figure 4. The tilt-rotor VTOL UAV considered in this work [52]. The aircraft is in the RW configuration with propeller tilt angles of $\chi_{l} = \chi_{r} = 0$ rad.

Figure 5. A left-side section view of the aircraft in the FW mode [1]: the four propellers are tilted with $\chi_{l} = \chi_{r} = \frac{\pi}{2}$ rad.

Figure 6. Control architecture of the tilt-rotor VTOL UAV with MPC.

Figure 7. Control architecture of the tilt-rotor VTOL UAV with the NN controller (orange dashed line rectangle, running at 50 Hz) replacing the MPC (compare with Figure 6). The quaternion attitude controller and the control allocation (blue dashed line rectangle) run with a frequency of 250 Hz.

Figure 8. An example of a simulated trajectory employed during the data generation process. The dashed lines correspond to the reference signals provided at the input of the MPC controller. The continuous lines correspond to the achieved velocities of the VTOL UAV in the north, east, and down axes, respectively.

Figure 9. MATLAB® simulation results: velocity tracking with the MPC controller for a velocity ramp trajectory. Background color: acceleration phase in light blue, deceleration phase in yellow.

Figure 10. Main NN, 15 epochs: architecture and training delivering the smallest tracking error on the velocity ramp trajectory. The neural network has fourteen inputs, $X^{in_{main}} = {[x^{⊤}, {(v_{ref}^{I})}^{⊤}]}^{⊤} \in R^{14}$, and four outputs, $X_{nns}^{out_{main}} = {[T_{nns}, {(Υ_{IB, nns})}^{⊤}]}^{⊤} \in R^{4}$.

Figure 11. Tilt NN, 15 epochs: architecture and training delivering the smallest tracking error on the velocity ramp trajectory. The neural network has seven inputs, $X^{in_{tilt}} = {[{(v^{I})}^{⊤}, θ, {(v_{ref}^{I})}^{⊤}]}^{⊤} \in R^{7}$, and one output, $X_{nns}^{out_{tilt}} = [χ_{nns}] \in R^{1}$.
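
To give a feel for the network sizes implied by Figures 10 and 11, the following sketch counts the trainable parameters of two fully connected feedforward networks with the stated input and output dimensions. The hidden-layer widths are those of the pair with the smallest MAVTE in Table 7. The helper `count_parameters` and the assumption of plain dense layers with one bias per neuron are illustrative and are not taken from the authors' implementation.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs):
    """Count weights and biases of a fully connected feedforward network.

    Assumes plain dense layers with one bias per neuron (an assumption,
    not a detail confirmed by the captions).
    """
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


# Main NN: 14 inputs, 4 outputs (Figure 10); hidden widths from Table 7.
main_params = count_parameters(14, [128, 128, 128], 4)

# Tilt NN: 7 inputs, 1 output (Figure 11); hidden widths from Table 7.
tilt_params = count_parameters(7, [128, 64, 32, 16], 1)

print(f"Main NN parameters: {main_params}")
print(f"Tilt NN parameters: {tilt_params}")
```

Under these assumptions, both networks stay in the range of tens of thousands of parameters, which is consistent with the short inference times reported in Table 8.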

Figure 12. MATLAB® simulations. Comparison between the MPC controller and the NN-based controller regarding velocity tracking performance. Background color: acceleration phase in light blue, deceleration phase in yellow.

Figure 13. MATLAB® simulations. Comparison between the MPC controller and the NN-based controller regarding the achieved pitch angle $θ$, commanded tilt angle $χ_{cmd}$, and commanded thrust $T_{cmd}$. Background color: acceleration phase in light blue, deceleration phase in yellow.

Figure 14. Real flight test of the hybrid VTOL UAV in Figure 1 controlled by the NN-based controller. North, east, and down velocity tracking performance. Background color: acceleration phase in light blue, deceleration phase in yellow.

Figure 15. Real flight test of the hybrid VTOL UAV in Figure 1 controlled by the NN-based controller. Measured pitch angle $θ$, commanded tilt angle $χ_{nns}$, and commanded thrust $T_{nns, tot}$. Background color: acceleration phase in light blue, deceleration phase in yellow.

Figure 16. Real flight test of the NN-based controller. Comparison between commanded and measured Euler angles $Υ_{IB}$. Background color: acceleration phase in light blue, deceleration phase in yellow.

Table 1. Control methods for tilt-rotor VTOL UAVs.

Control Architecture	Main Approach	References
Combined flight mode-dependent controllers	Divide and conquer with P/PD/PID	[3,4,5,6,7]
	Divide and conquer with LQR	[8,9,10,11,12,13,14,15]
	Divide and conquer with SMC	[16,17]
	Control authority weighting with P/PD/PID	[4,18,19]
	Control authority weighting with LQR	[20]
Unified control approach through the full flight envelope	Robust control	[21,22,23,24,25,26,27,28]
	Linear parameter varying with $H_{\infty}$	[29,30]
	Direct gain scheduling with SMC	[31]
	Direct gain scheduling with P/PD/PID	[32]
	Dynamic inversion with P/PD/PID	[33,34,35,36,37,38,39,40,41,42,43,44]
	Dynamic inversion with SMC	[45,46]
	Dynamic inversion with backstepping	[47,48]
	Nonlinear model predictive control	[1,49]

Table 2. Variables and conventions.

Symbol	Description
$I : \{O_{I}, e_{n}, e_{e}, e_{d}\}$	North–East–Down (NED) inertial frame
$B : \{O_{B}, e_{x}, e_{y}, e_{z}\}$	Body frame centered at the aircraft’s CoM $O_{B}$
$R_{i} : \{O_{R, i}, e_{1}, e_{2}, e_{3}\}$	Rotor arm frame, $i = 1, \dots, 4$, with $e_{1} = {[1 \; 0 \; 0]}^{⊤}$, $e_{2} = {[0 \; 1 \; 0]}^{⊤}$, $e_{3} = {[0 \; 0 \; 1]}^{⊤}$
$p^{I} \in R^{3}$	Aircraft’s CoM position in $I$
$v^{I} \in R^{3}$	Aircraft’s CoM velocity in $I$
$q_{B}^{I, n} \in Q$	Attitude quaternion of $B$ with respect to $I$, an element of the quaternion group $Q$, expressed in the navigation frame $n$
$R_{B}^{I} \in S O (3)$	Rotation matrix of $B$ with respect to $I$
$R_{R_{i}}^{B} \in S O (3)$	Rotation matrix of $R_{i}$ with respect to $B$
$ω_{B / I}^{B} \in R^{3}$	Angular velocity of $B$ with respect to $I$ expressed in $B$
${[u]}_{\times}$	Skew-symmetric matrix of vector $u$
m	Mass of the aircraft
$I^{B}$	Inertia matrix of the aircraft expressed in the body frame $B$
g	Gravity constant [m/$s^{2}$]

Table 3. System parameters. The values of the parameters $I^{B}$, $C_{T}$, $C_{Q}$, $C_{A}$, $C_{E}$, and $C_{R}$ were determined in [49].

Parameter Name	Parameter Symbol	Value	[Unit]
Aircraft total mass	m	$3.0$	[kg]
Aircraft moment of inertia matrix	$I^{B}$	$diag (0.1, 0.1, 0.1)$	$[kg m^{2}]$
Aircraft wing span	b	$2.0$	[m]
Fuselage length	-	$1.2$	[m]
Fuselage height	-	$0.24$	[m]
Location of propeller 1 center, see Figure 5	$l_{1}$, $h_{1}$	$0.16$, $0.05$	[m]
Location of propeller-tilt joints 2 and 4 in the ($e_{x}, e_{z}$) plane, see Figure 5	$l_{0}$, $h_{0}$	$0.105$, $0.015$	[m]
Distance between the fuselage longitudinal $e_{x}$ axis and the propeller centers in the ($e_{x}, e_{y}$) plane, see Figure 4	$L_{0}$	$0.24$	[m]
Vector between the CoM and application point of aerodynamic force on the left wing or right wing, respectively, see (6) and (7)	$r_{W_{l}}^{B}$, $r_{W_{r}}^{B}$	${[0, \pm 0.5, - 0.015]}^{⊤}$	[m]
Vector between the CoM and application point of aerodynamic force on the fuselage, see (6) and (7)	$r_{F}^{B}$	${[0.036, 0, - 0.015]}^{⊤}$	[m]
Vector between the CoM and application point of aerodynamic force on the tail vertical element, see (6) and (7)	$r_{T_{v}}^{B}$	${[- 0.71, 0, - 0.04]}^{⊤}$	[m]
Vector between the CoM and application point of aerodynamic force on the tail horizontal element, see (6) and (7)	$r_{T_{h}}^{B}$	${[- 0.71, 0, - 0.015]}^{⊤}$	[m]
Surface area of the left wing, right wing, and fuselage, respectively	$S_{W_{l}}$, $S_{W_{r}}$, $S_{F}$	$0.21$, $0.21$, $0.055$	$[m^{2}]$
Surface area of the vertical and horizontal parts of the tail, respectively	$S_{T_{v}}$, $S_{T_{h}}$	$0.074$, $0.047$	$[m^{2}]$
Propeller thrust coefficient	$C_{T}$	$1.11 \times 10^{- 5}$	$[N s^{2} {rad}^{- 2}]$
Propeller drag coefficient	$C_{Q}$	$1.99 \times 10^{- 7}$	$[N m s^{2} {rad}^{- 2}]$
Aileron, elevator, and rudder aerodynamic torque coefficients	$C_{A}$, $C_{E}$, $C_{R}$	$0.05$, $0.047$, $0.047$	$[m^{3}]$
Maximum controller-requestable thrust	$T_{max}$	48	[N]
Maximum controller-requestable torque value	$M_{max}$	2	[N m]
Maximum propeller-tilt rate	${\dot{χ}}_{max}$	$\frac{π}{2}$	$[rad s^{- 1}]$
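
A few quantities follow directly from the values in Table 3 and help to interpret them. The sketch below computes the thrust needed to hover, the resulting thrust margin, and the shortest possible duration of a full propeller tilt. The value of $g$ is not listed in the table, so standard gravity (9.81 m/s²) is assumed here. The hover estimate further assumes pure RW flight with vertical thrust and no wing lift. All variable names are illustrative.

```python
import math

G = 9.81  # m/s^2, assumed standard gravity (not listed in Table 3)

# Values copied from Table 3.
mass = 3.0                    # kg, aircraft total mass
t_max = 48.0                  # N, maximum controller-requestable thrust
chi_rate_max = math.pi / 2.0  # rad/s, maximum propeller-tilt rate

# Tilt angle in FW mode, from the caption of Figure 5.
chi_fw = math.pi / 2.0        # rad

# Thrust needed to balance the weight in RW mode (vertical thrust, no wing lift).
hover_thrust = mass * G

# Ratio of the requestable thrust to the weight.
thrust_to_weight = t_max / hover_thrust

# Lower bound on the duration of a full RW-to-FW tilt at the maximum tilt rate.
min_tilt_time = chi_fw / chi_rate_max

print(f"Hover thrust:     {hover_thrust:.2f} N")
print(f"Thrust-to-weight: {thrust_to_weight:.2f}")
print(f"Minimum tilt time: {min_tilt_time:.2f} s")
```

Under these assumptions, hovering uses roughly 61% of the requestable thrust, and a complete tilt of the propellers cannot be performed in less than one second.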

Table 4. Training for the optimization of the architecture. The rows highlighted in green identify the architecture with the smallest RMSE on the test set among the neural networks with the same number of inner layers.

Main NN
Architecture	Epochs	Training Time	Learning Rate	RMSE
[128]	5	67 min	0.001	0.05503
[256]	5	66 min	0.001	0.05492
[512]	5	69 min	0.001	0.05487
[64]-[64]	5	81 min	0.001	0.04732
[128]-[128]	5	87 min	0.001	0.04314
[256]-[128]	5	88 min	0.001	0.04087
[256]-[256]	5	90 min	0.001	0.04108
[32]-[32]-[32]	5	81 min	0.001	0.04071
[64]-[64]-[64]	5	82 min	0.001	0.04083
[128]-[64]-[32]	5	84 min	0.001	0.03934
[128]-[128]-[128]	5	90 min	0.001	0.03922
[16]-[16]-[16]-[16]	5	85 min	0.001	0.03485
[32]-[32]-[32]-[32]	5	89 min	0.001	0.03496
[64]-[64]-[64]-[64]	5	91 min	0.001	0.03364
[128]-[128]-[64]-[32]	5	94 min	0.001	0.03345
[128]-[64]-[32]-[16]	5	95 min	0.001	0.03299
[128]-[128]-[128]-[128]	5	97 min	0.001	0.03293
Tilt NN
Architecture	Epochs	Training Time	Learning Rate	RMSE
[128]	5	37 min	0.001	0.10994
[256]	5	36 min	0.001	0.11011
[512]	5	39 min	0.001	0.10700
[64]-[64]	5	70 min	0.001	0.09733
[128]-[128]	5	75 min	0.001	0.09899
[256]-[128]	5	80 min	0.001	0.09101
[256]-[256]	5	87 min	0.001	0.09366
[32]-[32]-[32]	5	84 min	0.001	0.06477
[64]-[64]-[64]	5	86 min	0.001	0.06566
[128]-[64]-[32]	5	87 min	0.001	0.06302
[128]-[128]-[128]	5	92 min	0.001	0.06571
[16]-[16]-[16]-[16]	5	73 min	0.001	0.05996
[32]-[32]-[32]-[32]	5	75 min	0.001	0.05792
[64]-[64]-[64]-[64]	5	86 min	0.001	0.05871
[128]-[128]-[64]-[32]	5	88 min	0.001	0.05634
[128]-[64]-[32]-[16]	5	82 min	0.001	0.05612
[128]-[128]-[128]-[128]	5	90 min	0.001	0.05698
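
The color highlighting mentioned in the caption of Table 4 is not visible in plain text. The sketch below recovers the highlighted rows by applying the rule stated in the caption: for each number of inner layers, keep the architecture with the smallest RMSE. The dictionaries are copied from Table 4; the names `MAIN_NN`, `TILT_NN`, and `best_per_depth` are illustrative.

```python
# RMSE on the test set after 5 epochs, copied from Table 4.
MAIN_NN = {
    (128,): 0.05503, (256,): 0.05492, (512,): 0.05487,
    (64, 64): 0.04732, (128, 128): 0.04314,
    (256, 128): 0.04087, (256, 256): 0.04108,
    (32, 32, 32): 0.04071, (64, 64, 64): 0.04083,
    (128, 64, 32): 0.03934, (128, 128, 128): 0.03922,
    (16, 16, 16, 16): 0.03485, (32, 32, 32, 32): 0.03496,
    (64, 64, 64, 64): 0.03364, (128, 128, 64, 32): 0.03345,
    (128, 64, 32, 16): 0.03299, (128, 128, 128, 128): 0.03293,
}
TILT_NN = {
    (128,): 0.10994, (256,): 0.11011, (512,): 0.10700,
    (64, 64): 0.09733, (128, 128): 0.09899,
    (256, 128): 0.09101, (256, 256): 0.09366,
    (32, 32, 32): 0.06477, (64, 64, 64): 0.06566,
    (128, 64, 32): 0.06302, (128, 128, 128): 0.06571,
    (16, 16, 16, 16): 0.05996, (32, 32, 32, 32): 0.05792,
    (64, 64, 64, 64): 0.05871, (128, 128, 64, 32): 0.05634,
    (128, 64, 32, 16): 0.05612, (128, 128, 128, 128): 0.05698,
}


def best_per_depth(results):
    """Return {number of inner layers: (architecture, rmse)} with minimal RMSE."""
    best = {}
    for arch, rmse in results.items():
        depth = len(arch)
        if depth not in best or rmse < best[depth][1]:
            best[depth] = (arch, rmse)
    return best


best_main = best_per_depth(MAIN_NN)
best_tilt = best_per_depth(TILT_NN)

for name, best in (("Main NN", best_main), ("Tilt NN", best_tilt)):
    for depth in sorted(best):
        arch, rmse = best[depth]
        label = "-".join(f"[{n}]" for n in arch)
        print(f"{name}, {depth} layer(s): {label} (RMSE {rmse:.5f})")
```

The architectures selected this way are the same ones that appear in Table 5, where the learning rate is varied.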

Table 5. Training for the optimization of the learning rate. The rows highlighted in blue identify the learning rate delivering the smallest RMSE on the test set among the neural networks with the same number of inner layers.

Main NN
Architecture	Epochs	Training Time	Learning Rate	RMSE
[512]	15	183 min	0.001	0.05479
[512]	15	189 min	0.0005	0.05498
[512]	15	188 min	0.0001	0.05434
[256]-[128]	15	244 min	0.001	0.04034
[256]-[128]	15	248 min	0.0005	0.04129
[256]-[128]	15	254 min	0.0001	0.04083
[128]-[128]-[128]	15	260 min	0.001	0.03804
[128]-[128]-[128]	15	265 min	0.0005	0.03867
[128]-[128]-[128]	15	273 min	0.0001	0.03643
[128]-[128]-[128]-[128]	15	285 min	0.001	0.03287
[128]-[128]-[128]-[128]	15	281 min	0.0005	0.03313
[128]-[128]-[128]-[128]	15	288 min	0.0001	0.03272
Tilt NN
Architecture	Epochs	Training Time	Learning Rate	RMSE
[512]	15	121 min	0.001	0.10643
[512]	15	119 min	0.0005	0.10455
[512]	15	124 min	0.0001	0.10622
[256]-[128]	15	221 min	0.001	0.09017
[256]-[128]	15	235 min	0.0005	0.08516
[256]-[128]	15	246 min	0.0001	0.08451
[128]-[64]-[32]	15	218 min	0.001	0.06237
[128]-[64]-[32]	15	235 min	0.0005	0.06134
[128]-[64]-[32]	15	255 min	0.0001	0.06079
[128]-[64]-[32]-[16]	15	237 min	0.001	0.05565
[128]-[64]-[32]-[16]	15	262 min	0.0005	0.05967
[128]-[64]-[32]-[16]	15	258 min	0.0001	0.05782

Table 6. Imitative learning NN-based controller: parameters of velocity controller, thrust vector attitude controller, and quaternion attitude controller.

Algorithm	Parameter	Value
Velocity controller	$k_{v}$	10 $s^{- 1}$
Thrust vector attitude controller	$k_{η}$	20 $s^{- 1}$
	$k_{ψ}$	10 $s^{- 1}$
	$K_{ω}$	$diag [10, 10, 10]$ ${rad}^{- 1}$
Quaternion attitude controller	$(k_{ϕ}, k_{θ}, k_{ψ})$	$(8.0, 6.5, 2.8)$ $s^{- 1}$
	$(k_{p, ω_{x}}, k_{p, ω_{y}}, k_{p, ω_{z}})$	$(0.35, 2, 0.6)$ N m s
	$(k_{d, ω_{x}}, k_{d, ω_{y}}, k_{d, ω_{z}})$	$(0.001, 0.005, 0)$ N m $s^{2}$
	$(k_{i, ω_{x}}, k_{i, ω_{y}}, k_{i, ω_{z}})$	$(0.005, 0.05, 0.05)$ N m

Table 7. Mean absolute velocity tracking error (MAVTE) [m/s] for MATLAB® simulations on the velocity ramp trajectory with the NN-based controller. The row highlighted in red indicates the NN-based controller with the smallest MAVTE. The bold values indicate the cases where the NN-MAVTE is smaller than the corresponding MPC-MAVTE values.

Architecture Main NN	Architecture Tilt NN	RMSE Main NN	RMSE Tilt NN	MAVTE Ramp
[512]	[512]	0.05434	0.10455	4.73
[512]	[256]-[128]	0.05434	0.08451	4.28
[512]	[128]-[64]-[32]	0.05434	0.06079	12.18
[512]	[128]-[64]-[32]-[16]	0.05434	0.05565	2.39
[256]-[128]	[512]	0.04034	0.10455	1.23
[256]-[128]	[256]-[128]	0.04034	0.08451	0.92
[256]-[128]	[128]-[64]-[32]	0.04034	0.06079	0.88
[256]-[128]	[128]-[64]-[32]-[16]	0.04034	0.05565	0.68
[128]-[128]-[128]	[512]	0.03643	0.10455	1.18
[128]-[128]-[128]	[256]-[128]	0.03643	0.08451	0.89
[128]-[128]-[128]	[128]-[64]-[32]	0.03643	0.06079	0.73
[128]-[128]-[128]	[128]-[64]-[32]-[16]	0.03643	0.05565	0.67
[128]-[128]-[128]-[128]	[512]	0.03272	0.10455	1.20
[128]-[128]-[128]-[128]	[256]-[128]	0.03272	0.08451	0.73
[128]-[128]-[128]-[128]	[128]-[64]-[32]	0.03272	0.06079	0.81
[128]-[128]-[128]-[128]	[128]-[64]-[32]-[16]	0.03272	0.05565	0.69
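
Since the red highlighting of Table 7 is lost in plain text, the short sketch below identifies the pair of networks with the smallest MAVTE, and also the pair with the largest one, directly from the tabulated values. The dictionary `MAVTE` is copied from Table 7; its name and layout are illustrative.

```python
# MAVTE [m/s] on the velocity ramp trajectory, copied from Table 7.
# Keys are (main NN architecture, tilt NN architecture).
M1, M2, M3, M4 = "[512]", "[256]-[128]", "[128]-[128]-[128]", "[128]-[128]-[128]-[128]"
T1, T2, T3, T4 = "[512]", "[256]-[128]", "[128]-[64]-[32]", "[128]-[64]-[32]-[16]"

MAVTE = {
    (M1, T1): 4.73, (M1, T2): 4.28, (M1, T3): 12.18, (M1, T4): 2.39,
    (M2, T1): 1.23, (M2, T2): 0.92, (M2, T3): 0.88, (M2, T4): 0.68,
    (M3, T1): 1.18, (M3, T2): 0.89, (M3, T3): 0.73, (M3, T4): 0.67,
    (M4, T1): 1.20, (M4, T2): 0.73, (M4, T3): 0.81, (M4, T4): 0.69,
}

best_pair = min(MAVTE, key=MAVTE.get)
worst_pair = max(MAVTE, key=MAVTE.get)

print(f"Smallest MAVTE: {MAVTE[best_pair]:.2f} m/s with main {best_pair[0]}, tilt {best_pair[1]}")
print(f"Largest MAVTE:  {MAVTE[worst_pair]:.2f} m/s with main {worst_pair[0]}, tilt {worst_pair[1]}")
```

The values also show that a smaller RMSE does not guarantee a smaller tracking error: with the single-layer main NN, the tilt NN [128]-[64]-[32] has a lower RMSE than the tilt NN [512] but yields a much larger MAVTE (12.18 m/s versus 4.73 m/s).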

Table 8. Computational time during real flight tests of each control loop for MPC controller vs. NN-based controller, running on an Intel UpBoard featuring an Atom x5-Z8350 (4 × 1.44 GHz) processor with Ubuntu 18.04 LTS.

Metric	MPC Controller	NN-Based Controller
Average time (ms)	24	6
Maximum time (ms)	38	11
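
The timings in Table 8 can be related to the 50 Hz rate of the NN controller given in the caption of Figure 7. The sketch below computes the speed-up and checks each controller against the 20 ms period of a 50 Hz loop. Whether the MPC was also scheduled at 50 Hz during the flight tests is not stated in this part of the paper, so the period check for the MPC is an assumption made for comparison only. Variable names are illustrative.

```python
# Loop rate of the NN controller, from the caption of Figure 7.
loop_rate_hz = 50.0
period_ms = 1000.0 / loop_rate_hz  # time available per control cycle

# Computation times in milliseconds, copied from Table 8.
timings_ms = {
    "MPC": {"avg": 24.0, "max": 38.0},
    "NN": {"avg": 6.0, "max": 11.0},
}

# Speed-up of the NN-based controller relative to the MPC.
speedup_avg = timings_ms["MPC"]["avg"] / timings_ms["NN"]["avg"]
speedup_max = timings_ms["MPC"]["max"] / timings_ms["NN"]["max"]

# Does the worst-case computation time fit within one 50 Hz period?
# (For the MPC this assumes the same 50 Hz rate, which is an assumption.)
fits_in_period = {name: t["max"] <= period_ms for name, t in timings_ms.items()}

print(f"Period at {loop_rate_hz:.0f} Hz: {period_ms:.0f} ms")
print(f"Average speed-up: {speedup_avg:.1f}x, worst-case speed-up: {speedup_max:.1f}x")
for name, fits in fits_in_period.items():
    print(f"{name}: worst case fits in one period: {fits}")
```

Under this assumption, the NN-based controller leaves a margin of 9 ms in the worst case, whereas the MPC exceeds a 20 ms period even on average.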

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
