Nothing Special   »   [go: up one dir, main page]

Next Article in Journal
A Flexible Printed Circuit Board Based Microelectromechanical Field Mill with a Vertical Movement Shutter Driven by an Electrostatic Actuator
Previous Article in Journal
End-to-End Encrypted Message Distribution System for the Internet of Things Based on Conditional Proxy Re-Encryption
You seem to have javascript disabled. Please note that many of the page functionalities won't work as expected without javascript enabled.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Research on Multiple-AUVs Collaborative Detection and Surrounding Attack Simulation

1
Xi’an Precision Machinery Research Institute, Xi’an 710077, China
2
School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
3
School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(2), 437; https://doi.org/10.3390/s24020437
Submission received: 1 December 2023 / Revised: 5 January 2024 / Accepted: 7 January 2024 / Published: 10 January 2024
Figure 1
<p>A complete simulation process of multi-AUVs formation execution mission. The simulation process delineates the complete execution of a multi-AUV formation mission. The diagram bifurcates into two segments. The left side portrays the AUV fleet engaging in a long-range transit, moving from the starting point towards the target point, preparing for the collaborative search phase. On the right side, subsequent to reaching the target point, the AUV fleet initiates a collaborative search employing the optimized formation. Upon target detection, the fleet predicts the target’s trajectory, facilitating interception.</p> ">
Figure 2
<p>The proposed method consists of long-range ferry, collaborative detection, target trajectory prediction, and collaborative surrounding four stages.</p> ">
Figure 3
<p>RRT flowchart. In the initial phase of the algorithm, RRT designates the starting point as the root node and then randomly samples a point <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>n</mi> <mi>d</mi> </mrow> </msub> </mrow> </semantics> </math>. Subsequently, RRT identifies the nearest node <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>n</mi> <mi>e</mi> <mi>a</mi> <mi>r</mi> </mrow> </msub> </mrow> </semantics> </math>, in the tree structure and establishes a connection between <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>n</mi> <mi>e</mi> <mi>a</mi> <mi>r</mi> </mrow> </msub> </mrow> </semantics> </math> and <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>n</mi> <mi>d</mi> </mrow> </msub> </mrow> </semantics> </math>. If the path between <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>n</mi> <mi>e</mi> <mi>w</mi> </mrow> </msub> </mrow> </semantics> </math> and <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>n</mi> <mi>d</mi> </mrow> </msub> </mrow> </semantics> </math> does not intersect with obstacles, RRT adds <math display="inline"> <semantics> <mrow> <msub> <mi>q</mi> <mrow> <mi>n</mi> <mi>e</mi> <mi>w</mi> </mrow> </msub> </mrow> </semantics> </math> to the tree. During each iteration, RRT checks if the new node has reached the goal. If so, RRT generates the final path and concludes the algorithm. This iterative process involves continuously expanding new nodes in the tree to rapidly explore feasible paths until reaching the specified number of iterations or satisfying the termination conditions.</p> ">
Figure 4
<p>The RRT (Fast Random Tree Method) planning path. The red point signifies the target point, the blue point represents the AUV, and the blue curve out-lines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9.</p> ">
Figure 5
<p>Artificial potential field obstacle avoidance principle. The three black dots symbolize the obstacles, with the fan-shaped area representing the AUV’s detection range. The three black arrows illustrate the repulsive forces exerted by the obstacles on the AUV, while the red arrow represents the combined force resulting from these three repulsive forces.</p> ">
Figure 6
<p>Formation structure. D represents the communication distance between AUVs.</p> ">
Figure 7
<p>Environment probabilities. After the AUV formation traverses a specific area, the information entropy map is updated based on the detection information gathered by the AUV fleet. A probability of 1 or 0 indicates that the information about the target in that region is certain, signifying the presence or absence of a target, respectively. A probability of 0.5 signifies the highest level of uncertainty in the information for that area, representing a completely unknown state. The dashed areas signify more distant environmental regions.</p> ">
Figure 8
<p>Target localization for cooperative detection tasks based on Shannon entropy. The red arrow indicates the direction in which the target is more likely to move, and the blue arrow indicates the direction in which the target is less likely to move.</p> ">
Figure 9
<p>Single-layer LSTM structure diagram.</p> ">
Figure 10
<p>The input and the output of the LSTM network.</p> ">
Figure 11
<p>Schematic diagram of collaborative rounding.</p> ">
Figure 12
<p>Simultaneous approach strategy in the limited communication environment.</p> ">
Figure 13
<p>AUV formation surrounding attack reward map.</p> ">
Figure 14
<p>The RRT algorithm plots 4 paths in different situations. To assess the efficacy of the Rapidly Exploring Random Tree (RRT) algorithm, various task scenarios were devised for reliability verification. The red point signifies the target point, the blue point represents the AUV, and the blue curve outlines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9. In (<b>a</b>), the target occupies the upper left, with the Unmanned Underwater Vehicle (UUV) situated in the lower left. In (<b>b</b>), the target is positioned in the upper right, while the UUV remains in the lower left. (<b>c</b>) depicts the target in the upper left, and the UUV in the lower left. Finally, in (<b>d</b>), the target is located in the upper right, and the UUV is in the lower right. The consistent success across diverse scenarios underscores the robust applicability of the RRT algorithm.</p> ">
Figure 15
<p>Two possible topological structures of the formation. (<b>a</b>) Each AUV establishes a communication relationship with its two connected AUVs, and the entire communication network is in a stable state; (<b>b</b>) AUV III only establishes a communication relationship with AUV I, so while AUV I, AUV II, and Main AUV can communicate stably, AUV III is unstable in the communication network.</p> ">
Figure 16
<p>Parametric model of the formation structure.</p> ">
Figure 17
<p>Schematic diagram of formation detection.</p> ">
Figure 18
<p>Iterative diagram of the target.</p> ">
Figure 19
<p>Optimized formation diagram. The four red circles represent the optimal locations for each of the four underwater vehicles.</p> ">
Figure 20
<p>Navigation detection of AUV formations. This figure illustrates the AUV formation aligned with the optimized configuration for collaborative detection tasks. The dotted green lines symbolize the communication chain. The AUVs sporting blue heads acting as followers, while the blue lines depict the paths of the followers. The leader, distinguished by a red head, is denoted by solid red lines outlining its trajectory.</p> ">
Figure 21
<p>The prediction effectiveness obtained by the LSTM network on the conditions of different target maneuver angle. (<b>a</b>) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 15 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (<b>b</b>) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 30 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (<b>c</b>) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 60 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (<b>d</b>) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 90 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively.</p> ">
Figure 22
<p>Multi-AUVs formation surrounding success rate.</p> ">
Figure 23
<p>The process of AUV surrounding targets. This figure depicts the three orange AUVs completing a process for surrounding black target. The blue tracks outline their trajectories, while the red line shows the target’s motion.</p> ">
Figure 24
<p>The AUV trajectory without using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.</p> ">
Figure 25
<p>The AUV trajectory using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.</p> ">
Versions Notes

Abstract

:
Due to limitations in operational scope and efficiency, a single Autonomous Underwater Vehicle (AUV) falls short of meeting the demands of the contemporary marine working environment. Consequently, there is a growing interest in the coordination of multiple AUVs. To address the requirements of coordinated missions, this paper proposes a comprehensive solution for the coordinated development of multi-AUV formations, encompassing long-range ferrying, coordinated detection, and surrounding attack. In the initial phase, detection devices are deactivated, employing a path planning method based on the Rapidly Exploring Random Tree (RRT) algorithm to ensure collision-free AUV movement. During the coordinated detection phase, an artificial potential field method is applied to maintain AUV formation integrity and avoid obstacles, dynamically updating environmental probability based on formation movement. In the coordinated surroundings attack stage, predictive capabilities are enhanced using Long Short-Term Memory (LSTM) networks and reinforcement learning. Specifically, LSTM forecasts the target’s position, while the Deep Deterministic Policy Gradient (DDPG) method controls AUV formation. The effectiveness of this coordinated solution is validated through an integrated simulation trajectory.

1. Introduction

The unmanned, intelligent, multi-functional, and adaptable traits of AUVs have garnered significant global attention, establishing them as crucial carrier platforms for executing underwater missions [1]. Given the intricacies of the underwater environment and the escalating mission requirements, relying on a single AUV becomes challenging. Consequently, collaborative task execution by multiple AUVs emerges as an inevitable trend [2]. Compared to individual AUVs, multiple AUVs demonstrate superior adaptability and responsiveness when facing complex underwater environments and advanced tasks [3]. The advantages of multiple AUV collaborative systems are evident, not only expanding perceptual range and enhancing efficiency but also finding applications in diverse areas such as ocean data collection, seabed exploration, and underwater target search [4]. Therefore, in underwater environments, integrating multiple AUV systems is indispensable for future AUV research [5].
Recent years have witnessed comprehensive developments in various technical aspects of multi-AUV formations, including research on path planning, formation control, and cooperative capture. For instance: Zhang et al. [6] proposed an AMPSO algorithm for the three-dimensional path planning of multiple AUVs, aiming to enhance the speed and exploration capability of autonomous path planners for multiple AUVs. Hu et al. [7] proposed a Formation Comprehensive Cost (FCC) model to achieve collision avoidance within a formation by balancing convergence time, transformation distance, and sensor network power consumption. Qin et al. [8] achieved collision-free and coordinated motion for each AUV by integrating master-slave formation control methods with a path planning strategy based on artificial potential fields. Wen et al. [9] proposed a novel controller, composed of an improved artificial potential field, graph rigidity theory, and affine transformation (IAPF-GR-AT), to achieve real-time formation reconstruction and obstacle avoidance for AUV formations in a three-dimensional marine environment. Huang et al. [10] proposed a distance-based negotiation approach to control AUV formations to perform target encirclement. Liang et al. [11] successfully addressed the challenge of coordinated encircling of targets by AUV formations with varying motion capabilities through the application of a heuristic neural network approach. Cao et al. [12] proposed a multi-AUV search algorithm based on dynamic prediction of the target’s motion trajectory. This allows AUV formation members to swiftly reach the desired hunting points, facilitating efficient pursuit and capture of the target. Petritoli et al. [13] demonstrated the behavioral simulation of a dispersed fleet of underwater unmanned submersibles in a confined maritime area. In summary, AUV collaborative formation research has a wide range of fields, such as path planning, formation control, and formation encirclement.
However, existing research on AUV formation tasks tends to be fragmented, lacking a comprehensive process for the entire execution of AUV formation missions. This paper addresses this gap by presenting a complete process for simulating AUV formation tasks, divided into four stages: The first simulation stage is the long-range ferry phase, the AUV formation sails to the designated area according to the path planned by the RRT. The second simulation stage is the collaborative search phase. Once the formation reaches the suspicious target area, the AUV formation collaboratively searches for the target. The third simulation stage is the cooperative surrounding attack phase. In this stage, the AUV formation predicts the target trajectory using the LSTM method. The fourth simulation stage is the capture phase, the AUV formation employs the DDPG method to control individual AUVs for target capture. This comprehensive research framework has the potential to provide valuable guidance and insights into the integrated execution of AUV formation missions.
The innovations of this paper are as follows:
  • The entire process of AUV formation execution and integrating methods is developed for four different simulation stages of AUV formation;
  • The artificial potential field method is effectively employed to calculate obstacle avoidance waypoints for the AUV formation;
  • The utilization of LSTM neural networks efficiently predicts the motion trajectories of targets, and the DDPG method is introduced for AUV formation control and a high success rate of the surrounding attack can be achieved.
The remaining sections of this paper are as follows: Section 2 introduces the detailed methods that address the requirements of the entail task, including the AUV motion equations, path planning algorithm, dynamic obstacle avoidance method, target state trajectory prediction method, and collaborative rounding method. Then in Section 3, the entire collection of AUV forms is simulated. The conclusion is presented in Section 4.

2. Methodology

Figure 1 illustrates the configuration of the multi-AUV formation, initially arranged linearly from the starting position. The formation utilizes the RRT algorithm to navigate around static obstacles, ultimately reaching the designated area. Subsequently, the formation transitions into a diamond configuration, employing the artificial potential field method to detect suspicious areas while accounting for communication limitations. Upon successful target detection, the LSTM algorithm facilitates dynamic trajectory prediction. The AUV formation initiates a search to identify optimal interception positions. Upon locating a defined position, the formation converges and engages the AUV to fulfill the assigned mission. The algorithm framework for the entire task is depicted in Algorithm 1.
To meet the task requirements outlined in Figure 1, this section, illustrated in Figure 2, breaks down the comprehensive mission requirements into four categories: Remote Ferries, Collaborative Detection, Target Trajectory Prediction, and Collaborative Environments. More precisely, we apply the RRT approach to fulfill the mission requirements for long-distance ferries. Artificial potential field methods are employed for collaborative detection. LSTM is utilized for predicting target trajectories, and DDPG methods are implemented to address collaboration-related requirements.
Algorithm 1: Multiple-AUVs Collaborative Detection and Surrounding attack Simulation
/*Initialization*/
(01)
E n v     Construct virtual environment;

/*First Stage*/
(02)
  S t a r g e t       Sample   randomly   ( E n v , 1);
(03)
  S c o n s       Obstacle information;
(04)
 For k = 1: K/*K is the number of the AUVs*/
(05)
   S k       Sample   randomly   ( E n v , k);
(06)
   F e r r y   p a t h k         RRT   ( S t a r g e t ,   S k ,   S c o n s );
(07)
 End For

/* Second Stage */
(08)
 /* Determine the topology of AUVs formation */
(09)
  θ m o d e l     Build parametric formation model;
(10)
  O b j         Construct   a   multi - objective   optimization   problem   ( E n v , θ m o d e l );
(11)
  θ m o d e l *       Optimize   based   on   PSO   and   GA   ( θ m o d e l ,   O b j );
(12)
  S e a r c h   p a t h       Generation   path   ( θ m o d e l * );

/* Third Stage */
(13)
  E n v s u r r o u n d         Build   training   environment   ( E n v ,   S t a r g e t );
(14)
 D       Construct   training   data   ( E n v s u r r o u n d , Batch size);
(15)
  L S T M   m o d e l     Training model (D);
(16)
 For k = 1: K
(17)
     N e t a c t o r k       DDPG   ( S t a r g e t ,   S k ,   L S T M   m o d e l );
(18)
 End For
(19)
  S u r r o u n d   p a t h       Co - Simulate   ( N e t a c t o r );

2.1. AUV Motion Equations

Before establishing the entire task model, the AUV motion equation system is built according to the momentum theorem and the momentum moment theorem, as stated in the definition of reference [14].
x ˙ y ˙ z ˙ = C 0 b v x v y v z
v ˙ x v ˙ y v ˙ z ω ˙ x ω ˙ y ω ˙ z = A m   1 f s + f g + f t + f t l m s + m g + m t
θ ˙ φ ˙ ψ ˙ = 0 cos φ / cos θ sin φ / cos θ 0 sin φ cos φ 1 cos φ tan θ sin φ tan θ w x w y w z
The Equations (1)–(3) together form a set of AUV spatial kinematic equations comprising 12 equations. x   y   z represent the displacement of the AUV in three directions within the inertial coordinate system. v x   v y   v z signifies the AUV’s velocity in three directions within the body coordinate system. w x   w y   w z denotes the rotational angular velocities of the AUV around the x, y, and z axes in the body coordinate system. C 0 b is the matrix for transforming the body coordinate system to the inertial coordinate system. A m denotes the added mass matrix. f s and f g signify the forces exerted by the fluid and the buoyant forces on the AUV, respectively. f t is the force transformed by mass-based coordinates. f t l indicates the propeller thrust. m s is the moment of hydrodynamic force acting on the AUV. m g represents the gravitational torque acting on the AUV. m t illustrates the moment (torque) transformed by mass-based coordinates. θ , ψ , and φ denote the tilt angle, the yaw angle, and the roll angle of the AUV. For specific information regarding coordinate system definitions and parameters, please refer to [14].

2.2. Path Planning

The underwater realm is characterized by intricate, and unevenly distributed complex environments. In addressing this, a practical and efficient motion planning approach is introduced based on RRT. This technique involves generating collision-free trees through random processes, enabling thorough exploration of the vicinity surrounding an AUV.
As depicted in Figure 3, the path-planning method that relies on sampling is commonly known as the RRT algorithm [15]. A tree originates from its root node and extends branches, eventually reaching endpoints. At this stage, a distinctive path can be traced back to the root node. The process begins by initializing the tree and defining the start and end points. Subsequently, a sampling function is executed, where the endpoints are selected as sampling points with a predetermined probability. This sample point explores the neighboring region to identify the nearest node, at which the tree is further expanded. A line is drawn to connect the sample point and its closest node. The existing branch is discarded if it intersects with an obstacle, and a fresh sampling is conducted to identify a new nearest node, thus ensuring obstacle avoidance. As the sampling process progresses, a new node is created, and all existing nodes including the branch nodes and the root node are considered together. This cumulative set of nodes guides the ongoing sampling. This process continues to advance and gradually approaches the endpoint. The distance from the node to the endpoint is calculated to determine whether the node can effectively reach the target point. Additionally, it evaluates whether the step length is reasonable and devoid of obstacles. Subsequently, a path is drawn between the node and the target point to ensure a connection.
As depicted in Figure 4, obstacles exhibit distinct starting and ending positions. The blue dot indicates the position of the vehicle, while the gray regions indicate obstacle zones with values between 0.5 and 0.9. The red area signifies obstacle locations with a weight exceeding 0.9. The blue curve delineates the path devised by the RRT algorithm.

2.3. Dynamic Obstacle Avoidance

During this mission phase, the task area is occupied by continuously moving noise source obstacles denoted as O j . These obstacles are strategically placed by the opponent to intentionally disrupt operations within the designated area. In alignment with the mission’s directives, the artificial potential field approach is used to address dynamic obstacle avoidance. We assign each obstacle a separate force field. Specifically, when an obstacle falls within the detection range of A i (the ith AUV), the associated potential field exerts a repulsive influence. The formula for calculating the force is as follows:
F i j = α 1 D i j 1 D 0 1 D i j 2
where D i j means the distance between A i and O j , D 0 means the safety distance, and α is the repulsion coefficient.
The obstacle avoidance repulsion can be obtained by summing the repulsive forces of each obstacle O j , then the expected acceleration of A i can be calculated by:
F i = j F i j = m v ˙
As depicted in Figure 5, three dynamic obstacles within the sector detection range are assigned individual potential fields. Varying forces are applied to the vehicle based on the distance from each obstacle. The cumulative force represents the vehicle’s obstacle avoidance repulsion force. By introducing the distance into Equation (4), the resultant force of the force field can be calculated on the AUV. Furthermore, to convert the resultant force into the control quantity, the expected acceleration of the AUV can be obtained by using Newton’s second law (Formula (5)).

2.4. Cooperative Detection

The formation of AUVs can greatly enhance the efficiency and accuracy of search operations [16]. Therefore, to fulfill the mission objectives, a strategy is implemented for involving the formation of four AUVs. A transverse diamond formation with an acute angle of 60° is configured and the maximum communication distance is 2000 m. In this arrangement, the Leader AUV is positioned at the rear, as illustrated in Figure 6.
The environment is divided into multiple grids C k , each grid is assigned a probability value representing the probability that the formation judges the presence of the target [17]. This probability value is derived from the environment probability obtained by each formation member through exploration P i k . Followers transmit the data to the leader, and then the leader synthetically calculates the environmental probabilities Q k .
The Shannon entropy is employed as a metric to quantify the level of uncertainty within an area during the exploration process:
χ k = Q k l o g Q k + 1 Q k l o g 1 Q k l o g 2
As depicted in Figure 7, based on the given expression, probabilities tend to be categorized as “certain” when they are near 0 or 1. In such cases, the resulting Shannon entropy, which is calculated by Formula (6), is low. Conversely, probabilities close to 0.5 indicate the highest level of uncertainty, resulting in the highest Shannon entropy value.
As depicted in Figure 8, after a mission cycle, when the formation transitions into cruise mode, it employs a random selection from eight potential directions. The formation then moves toward the region that is furthest away and exhibits the highest Shannon entropy in the chosen direction. This criterion requires continuous updating of cruise routes, in order to obtain comprehensive global knowledge.
The path originating from the lower right corner signifies the concluding work cycle. During this phase, a new direction is selected toward the bottom left once the formation has successfully reached the target area and completed the mission, indicated by the red arrow. Along this chosen path, the maximum entropy value is 1.0, and the location with the highest entropy value of 1 is selected as the target position for the subsequent task cycle, which is represented by the red grid.

2.5. Target State Prediction Method Based on LSTM

The Long Short Memory Network (LSTM) [18] primarily addresses the challenge of long-term dependencies encountered in conventional Recurrent Neural Networks (RNNs). This type of network is commonly employed for tasks such as classification and prediction [19].
The mission requirements outlined in this paper pose a significant challenge for the communication and detection of AUV formations. Recognizing LSTM’s generalization and processing capabilities in handling irregular data as an effective time sequences processing tool, we incorporate it as the target trajectory prediction method in the third stage within the task. What is more, the AUV formation cannot continuously detect the target trajectory under some working conditions, so LSTM is needed to predict the target position in order to detect the target trajectory.
In Figure 9, the entire network comprises a forgetting gate f t , an input gate i t , and an output gate o t . By using Formula (7), the network takes the target’s current motion state as input, and then processes it to predict and determine the future motion state of the target. The calculation formula is as follows:
i t = σ ( W i · [ h t 1 , x t ] + b i ) f t = σ ( W f · [ h t 1 , x t ] + b f ) C t = tanh ( W C   · [ h t 1 , x t ] + b C ) o t = σ ( W o · [ h t 1 , x t ] + b o ) C t = f t C t 1 + i t C t h t = o t tanh ( C t )
In Figure 10, a target exhibiting distinct motion characteristics possesses a connection between its current state and its future state.

2.6. Collaborative Rounding Method Based on DDPG

2.6.1. DDPG

DDPG (Deep Deterministic Policy Gradient) is introduced to tackle challenges in continuous action control problems. Traditional algorithms like Q-learning, Sarsa, and DQN are designed for discrete action spaces. DDPG is an extension of the DQN algorithm, enabling it to address the complexities of continuous action control [20].
The target motion data and the AUV state variables are used as inputs to the action network. The network then generates control variables for the AUVs to interact with the virtual environment. Considering the task objectives, a reward function is designed to guide the training process and motivate the AUV to complete the specified tasks in its surrounding environment. The network training pseudo-code is shown in Algorithm 2:
Algorithm 2: DDPG algorithm based on the rounding network training
/*Initialization*/
(01)
Initialize network parameters θ μ of the Critic network Q ( s , a | θ Q ) sand the Actor-network μ ( s | θ μ ) ;
(02)
Copy the parameters of Critic and Actor to the corresponding target network:
(03)
  θ Q θ Q  ⋯  θ μ θ μ ;
(04)
Initialize the Reply Buffer R.

/*Main Loop*/
(05)
For episode = 1:M  /* M is the number of training times*/
(06)
  Initialize a random process N to add noise to the behavior;
(07)
  Initializes the state s 1 ;
(08)
  For t = 1:T  /* T is the max time step in each training process*/
(09)
    Get action based on the current strategy and explore the noise:
(10)
     a t = μ ( s t | θ μ ) + N t ;
(11)
    The output of the action network a t is transformed into the control variable of the AUV and integrated into the dynamic equation to obtain the reward r t and the next state s t + 1 ;
(12)
    Convert states to sequences ( s t , a t , r t , s t + 1 ) and store them in Reply Buffer R;
(13)
    Randomly sampled from the Reply Buffer as training data for the Actor-network and Critic network;
(14)
     y i = r i + Q ( s i + 1 , μ ( s i + 1 | θ μ ) | θ Q )
(15)
    Update Critic network parameters by minimizing the loss function L :
(16)
     L = 1 N i ( y i Q ( s i + 1 | θ μ ) | θ Q ) ) 2
(17)
    Calculate the sample policy gradient θ μ μ | s i , and update the parameters of the Actor-network θ μ :
(18)
     θ μ μ | s i = 1 N i a Q ( s , a | θ Q ) | s = s i , a = μ ( s i ) θ μ ( s | θ μ ) | s = s i
(19)
    Update the target network parameters θ Q and θ μ
(20)
     θ Q τ θ Q + ( 1 τ ) θ Q
(21)
     θ μ τ θ μ + ( 1 τ ) θ μ
(22)
  End For
(23)
End For

2.6.2. Collaborative Rounding Environment

Based on the DDPG approach, a dynamic equation-controlled virtual environment is constructed to train the agent network. This virtual environment facilitates the learning process. The schematic diagram of this setup is presented in Figure 11.
Upon initializing the environment, a task target is assigned to the AUV according to its initial position. To optimize the likelihood of successful circumnavigation, the AUV should approach the target from various directions. Consequently, during the virtual environment’s initialization, each AUV is given a distinct orientation based on previous target position information. Figure 12 illustrates this process, serving as the foundation for establishing the reward function, this will be detailed in the subsequent sections.
During the entire training process, the position information and dynamic properties of AUV are used as state variables at each time step. The output layer of the Actor-network uses a sigmoid activation function. The resulting two output values are subtracted in order to obtain the normalized rudder angle output. This mechanism enables direct control of the AUV through the action network.

2.6.3. Artificial Potential Field Reward

To enhance the success rate of circumnavigation during the training, a potential function is employed as a reward function. This function increases in value as the relative distance between entities decreases.
For the ith AUV, the reward it faces can be expressed as:
R i = + 10 λ ,   i f Δ S i < S min + 10 λ e α Δ S i ,   i f o i T i λ ,   i f w i < lim w
where w i indicates the angular velocity of rotation of AUV at the current time, S i indicates the distance between the AUV and the target, α is an adjustable distance factor, λ is the reward factor. o i indicates the location of the AUV in the environment, T i is an area controlled by two Angle functions: T i [ θ 1 e β 1 Δ S i , θ 2 e β 2 Δ S i ] with two adjustable coefficients θ and β .
By using Formula (8), the reward R is obtained as the training data of the network. As depicted in Figure 13, by implementing the above three-tiered reward settings, the AUVs can be systematically guided to approach the target in the designated direction throughout the training process. The proposed approach also helps minimize sudden rudder angle fluctuations caused by the network output.
Compared to traditional simplistic approaches, our proposed method takes into account underwater communication and AUV dynamic constraints within the specified task scenario. Under the consideration of AUV communication and dynamic constraints, our method demonstrates superior performance in executing the overall task.

3. Simulation Results and Analysis

3.1. The Construction of Multi-Cooperative Task Virtual Environment

In this paper, the assembly of a complete AUV formation to perform a mission is termed a “mission”. Upon reaching the designated target area, multiple AUVs initiate long-range flights towards the predetermined detection area. This detection area is determined based on the platform analysis, and it is also the expected location. Considering the communication limitations and dynamic obstacles of the formation, the AUV formation executes this long-range flight while adhering to a predefined cooperative detection pattern. Upon detecting a target, its trajectory is predicted, facilitating the localization of the AUV formation. Following the establishment of optimal positions, the formation initiates tracking and pursuit once a target is determined. The pursuit commences from the position deemed most favorable. A 50 km × 50 km two-dimensional simulation environment and an AUV motion model are built in MATLAB, with the AUV performing duties at a set depth. The simulation running time is set to T = 10,000. The overall simulation procedure is divided into four phases: path planning, collaborative detection, target trajectory prediction, and surrounding and pursuit. Through these stages, the AUV’s behavior and interactions are systematically replicated in the defined environment.

3.2. Simulation Verification of the Proposed RRT Path Planning Algorithm

Figure 14 shows the designed path of the AUV towards the virtual point, the proposed RRT algorithm can generate reasonable paths in basic obstacle environments, and its effectiveness can be verified. The positions of the target point are randomly assigned. The target point is marked by a red dot, and the AUVs are represented by blue dots. Obstacle areas are indicated by gray areas (with a weight of 0.5) and red areas (with a weight of 0.9 or higher).
The target point and AUVs are positioned diagonally, in order to evaluate the efficiency of the proposed RRT algorithm, and test the performance of the proposed algorithm to navigate these different starting points.
It can be seen from the blue trajectory in Figure 14, the proposed RRT algorithm has a powerful ability to accurately plan a seamless path for the AUV to reach the target point. Moreover, Figure 14 also shows that the path planned by the RRT expertly maneuvers around obstacles of different weights, resulting in a very smooth trajectory.

3.3. Formation Optimization

To find the best formation for four AUVs, it is necessary to find the topological structure. Despite the additional communication relationships caused by the parameters, the preset topological relationships must be determined to achieve formation communication. There must be at least three edges in the formation communication network to connect four AUV nodes. However, considering the dynamic stability of the AUV and to expand the scope of design, one more edge should be added to improve the stability of the communication system. It is easy to see that every AUV can communicate with two other agents when the communication structure is a ring like in Figure 15a. But when it comes to Figure 15b, the AUV with only one communication relationship would be instability.
When solving the formation optimization problem, it is necessary to parameterize the formation structure, so the polar coordinate method is considered. As shown in Figure 16, the polar coordinate system is established with the main AUV as the pole, and the coordinates of the two AUVs are determined by the pole angle and pole diameter. At the same time, the polar coordinate system is established with AUV I as the pole, and the coordinate of the AUV III is determined in the same way. At this time, the constraint of the optimization problem can be given according to the communication angle and distance of AUV II and AUV III.
In addition to the communication constraints determined by the topology, there are still some constraints to be considered in the communication links of the formation. First, to ensure the synchronization of communication, the communication of each node should be as consistent as possible. Secondly, when the communication signals received by a node are too close, it is easy to cause a signal interference. Therefore, it is also necessary to ensure that there is a certain angle between different communication signals.
There is always an inevitable error rate in AUV detection. When the target is covered in multiple AUV detection areas, the success rate can be effectively improved, expressed in the formula as follows.
Ψ = 1 μ N
In Formula (9), Ψ represents the success rate of formation detection, μ represents the error rate, and N represents the number of overlapping detection areas. Therefore, what needs to be considered in formation optimization is not only a larger detection area but also a more accurate detection effect. We define the feature detection area as follows and take it as the optimization goal:
S * = i = 1 N S i Ψ i = i = 1 N S i 1 μ i = S i = 1 N S i μ i
In Formula (10), S i means the area that can be detected by i AUVs, which can be seen in Figure 17. Combined with the communication link consistency agreement mentioned above, we can propose the target function of the optimization problem:
f = ρ 1 S * + ρ 2 R d i = ρ 1 S * + ρ 2 max d i min d i
In Formula (11), ρ 1 means the coefficient of the characteristic area and ρ 2 means the coefficient of the extreme difference of the communication link.
According to the above analysis, we put forward the format of the formation optimization problem and apply the particle swarm algorithm to solve the problem:
min .   f = ρ 1 S * + ρ 2 max d i min d i = f x s . t .   Ρ 0 = 0 0     Ρ 1 = x 2 cos x 1 x 2 sin x 1     Ρ 2 = x 4 cos x 3 x 4 sin x 3     Ρ 3 = Ρ 1 + x 6 cos x 5 x 6 sin x 5     30 ° Ρ i Ρ a , Ρ i Ρ b 150 ° , i a b 0 , 1 , 2 , 3     40 ° Ρ 3 Ρ 2 , o y 140 ° d i = Ρ 1 Ρ 0 , Ρ 2 Ρ 1 , Ρ 3 Ρ 2 , Ρ 0 Ρ 3     50 d i 2000     S * = S i = 1 N S i μ i
In this problem, it is hard to calculate the overlapped area, so we used the Monte Carlo method for approximate calculation and selected the number of samples as 1000. To better balance the optimized goals of detection and communication, we defined the cooperation as follows:
  μ = 0.8 ρ 1 = 10 6 ρ 2 = 10 3
To get a better optimization result, we tested the effect of different algorithms including GA and PSO with different inertia coefficients on this problem in Table 1. After 10,000 times of function evaluation of different algorithms, we got the following optimization results (Figure 18 and Figure 19):
In Figure 19, the three red circles represent the optimized formation of the AUV fleet. Upon reaching the target area, the AUV fleet will reconfigure into the optimized formation, preparing for the next stage of collaborative search.

3.4. Collaborative Detection and Dynamic Obstacle Avoidance

Considering the communication limitations of vehicles, a diamond formation with an internal angle of 60° is adopted in this paper. This formation mode will be applied when the AUV formation is established within the target water area with an initial span of 50 km. If there are hostile interference sources within the task area, the proposed model abstracts them into dynamic point avoidance. The dynamic boundaries are represented by black dots, and the fan-shaped regions depict the sonar detection areas.
When the sonar identifies a dynamic obstacle, the AUV formation immediately begins dynamic obstacle avoidance maneuvers. AUVs need to be able to avoid areas with dynamic obstacle distribution, in order to ensure that the AUV formation is always focused on target detection.
In Figure 20, four AUVs are configured in a diamond shape for target area detection. The red line represents the trajectory of the lead AUV, and the blue lines represent the trajectories of the three sub-AUVs.

3.5. Trajectory Prediction with LSTM Network

In this section, the training method and the prediction effect of the LSTM network are illustrated in detail. In supervised learning methods, the selection of training data directly affects the performance of the trained network. Firstly, the target is independently run several times in the virtual environment to obtain enough target state data to construct the dataset. After that, the data in 11 consecutive time steps are taken as a set of inputs and output, where the first ten are set as inputs and the last one is set as output. The network uses the backpropagation method to update the parameters, and 20% of the dataset is taken as the test set to evaluate the loss function. In addition, during the construction of the training set, the maneuver angle of the target is set to 90 degrees, and 2% Gaussian noise is added.
To enhance the real-time performance of LSTM predictions, the rollout muti-step-ahead (RMS) [21] is applied to predict the target locations in the proposed method, which is expressed as Equation (14):
[ S t a r g e t t - W + 1 , , S t a r g e t t ] M S ^ t a r g e t t + 1 M S ^ t a r g e t t + 2 M , , M S ^ t a r g e t t + U
where S t a r g e t t demonstrates the target location information at the t step, W is set to 10 according to the training data, and M represents the generation method of the state information at the next time. When the target is in a detectable location, the state information at the next time step is filled with the true position of the target; otherwise, the state information is filled with the LSTM prediction position.
Figure 21 illustrates the effectiveness of the LSTM network, which shows that the accuracy of the prediction position obtained by the LSTM network gradually increases with the target moves.
As shown in Figure 21, in the target maneuvering angles of 15, 30, and 60 degrees, the trajectory prediction obtained by the LSTM network is still effective. Different from the condition of 90 degrees which is used in the training set, these results demonstrate the generalization ability of the proposed method. In addition, the mean error of the prediction data is 19.4 m and the standard deviation is 7.2 compared with the real position under the 90-degrees maneuvering condition of the target.

3.6. Simulation Verification of the Target Surrounding Method

Figure 22 shows a graph representing the success rate of target surroundings in various AUV formations. The abscissa represents the target orientation with the initial yaw angle of the AUV formation, and the ordinate represents the success rate of the AUV formation encircling the target. The four curves correspond to standard values for successful surrounding by AUV formations, which are set to 450 m, 600 m, 750 m, and 900 m, respectively. The key observation from Figure 22 is that, no matter where the target is in the AUV formation position, a high success rate of the surrounding can be achieved when the standard value is set to 900 m.
As shown in Figure 23, the orange AUV represents the AUV formation, the blue trajectory represents the trajectory of the AUV formation, the black AUV represents the targets, and the red trajectory represents the target trajectory. Completion of the surrounding process involves the following steps. First, the AUV formation verifies the presence of the target. The AUV formation then adjusts to ensure it is in the best position before the target arrives. When AUV No. 1 identifies the target, it initiates communication with AUV No. 2 and AUV No. 3 on its flanks. In response, AUVs No. 2 and No. 3 are located in advantageous locations. At the same time, AUV No. 1 follows and pushes the target to the final stage of surrounding. The above surrounding process achieves successful completion of the target surrounding maneuver by the AUV formation.

3.7. The Effectiveness Analysis of Surrounding Attack

In this section, a test environment based on sparse probing is designed to verify the dependence of the proposed algorithm on LSTM in the sparse detectable environment. Basically, during each test, the AUVs have a certain 30% probability of losing the position state of the target at each time step. Based on this, we make a comparison between using the LSTM model to predict the target trajectory and not using the LSTM model. when the LSTM prediction is not used, the position state of the target under the detectable failure condition is set to zero as input to the actor network, and when the LSTM prediction is used, the input to the actor network is the prediction result of the LSTM network. In addition, the number of AUVs in our formation is set to four, and each set of data is set to run 100 times independently. The successful condition of the surrounding attack is set as when there are more than two AUVs within 200 m of the target. The success rate in different conditions is illustrated in Table 2, where nine instances of different conditions with three target velocities and three target maneuver angles are demonstrated. And the AUV’s trajectory on the instance of 8 knot 60 maneuver angle with not using the LSTM model and using the LSTM model is illustrated in Figure 24 and Figure 25, respectively.
According to the experimental results, the application of the prediction method based on the LSTM network improves the proposed surrounding attack method by a 10 to 20 percent success rate, which demonstrates the necessity of LSTM prediction in the process of surrounding. Furthermore, the AUV formation without LSTM prediction under the condition of sparse detection showed inferior performance, because as it gets closer to the target, the wrong position information makes some AUVs deviate from the target direction, which demonstrates the importance of LSTM prediction for formation synergy.

3.8. Discussion of AUVs Collaborative Environment

In the actual process of marine resources development, a safe and stable underwater environment is crucial. However, due to the underwater communication environment and the dynamic constraints of AUVs, the collaborative operation of AUVs is challenged greatly. In addition, a collaborative task often contains multiple mission objectives and requires planning different formation modes. Based on this, we propose a method of formation ferry, formation detection, and formation surrounding attack, which is suitable for the known partial target prior information, in order to cope with the complex task requirements of the formation control method, and based on this, a multi-cooperative task virtual environment can be simulated for multiple tasks. Nevertheless, the present study does not encompass the influence of ocean currents on AUVs and energy consumption within AUVs. In future endeavors, there is an opportunity to fortify our research in these particular domains.

4. Conclusions

This paper offers a comprehensive simulation of the entire mission execution process for a multi-AUV formation, including path planning, formation design methodology, collaborative search strategies, target trajectory prediction, cooperative surrounding techniques, and coordinated pursuit within multi-AUV formations. Firstly, this paper introduces a solution to the AUV path planning problem based on ocean information from the underwater platform. It develops a feasible path while considering the operational limitations of the AUV. Subsequently, considering communication limitations, this paper addresses the reconstruction and formation establishment of AUV formations. It utilizes an artificial potential field approach for formation navigation, ensuring robust and stable formation control. The article proposes a prediction scheme based on the LSTM neural network to predict the trajectory of the target after detection. This approach learns target motion characteristics, enabling the AUV formation to strategically surround the target during the decision-making process. After identifying the target location, the AUV formation initiates a surrounding mission. Finally, a high success rate in surrounding can be achieved when the standard value is set to 900 m.
In future work, we will consider more complex ocean environments to address the complexity of real-world ocean scenes and provide more practical guidance for AUV formations.

Author Contributions

Conceptualization, Z.W. (Zhiwen Wen), Z.W. (Zhong Wang) and D.Z.; methodology, Z.W. (Zhiwen Wen), Z.W. (Zhong Wang) and H.D.; software, Z.W. (Zhiwen Wen), D.Q., Y.J. and J.L.; writing—original draft preparation, Z.W. (Zhiwen Wen) and Z.W. (Zhong Wang); visualization, D.Q., Y.J. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 52175251, 52205268). The Industry Key Technology Research Fund Project of North-western Polytechnical University (HYGJXM202318) is gratefully acknowledged. This re-search work is also supported by the National Basic Scientific Research Program under Grant of JCKY2021206B005.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Xin, B.; Zhang, J.; Chen, J.; Wang, Q.; Qu, Y. Overview of Research on Transformation of Multi-AUV Formations. Complex Syst. Model. Simul. 2021, 1, 1–14. [Google Scholar] [CrossRef]
  2. Yang, Y.; Xiao, Y.; Li, T. A Survey of Autonomous Underwater Vehicle Formation: Performance, Formation Control, and Communication Capability. IEEE Commun. Surv. Tutor. 2021, 23, 815–841. [Google Scholar] [CrossRef]
  3. Wang, Q.; He, B.; Zhang, Y.; Yu, F.; Huang, X.; Yang, R. An autonomous cooperative system of multi-AUV for underwater targets detection and localization. Eng. Appl. Artif. Intell. 2023, 121, 105907. [Google Scholar] [CrossRef]
  4. Ma, X.; Chen, Y.; Bai, G.; Liu, J. Multi-AUV Collaborative Operation Based on Time-Varying Navigation Map and Dynamic Grid Model. IEEE Access 2020, 8, 159424–159439. [Google Scholar] [CrossRef]
  5. Chai, H.; Du, Z.; Xiang, M.; Huang, Z. The research status and development trend of UUVs cooperative localization technology. Bull. Surv. Mapp. 2022, 62, 62–67+92. [Google Scholar]
  6. Zhang, J.Y.; Ning, X.; Ma, S.C. An improved particle swarm optimization based on age factor for multi-AUV cooperative planning. Ocean Eng. 2023, 287, 115753. [Google Scholar] [CrossRef]
  7. Hu, X.Y.; Shi, Y.; Bai, G.Q.; Chen, Y.L. Collaborative Search and Target Capture of AUV Formations in Obstacle Environments. Appl. Sci. 2023, 13, 9016. [Google Scholar] [CrossRef]
  8. Qin, H.D.; Si, J.S.; Wang, N.; Gao, L.Y.; Shao, K.J. Disturbance Estimator-Based Nonsingular Fast Fuzzy Terminal Sliding-Mode Formation Control of Autonomous Underwater Vehicles. Int. J. Fuzzy Syst. 2023, 25, 395–406. [Google Scholar] [CrossRef]
  9. Pang, W.; Zhu, D.Q.; Yang, S.X. A novel time-varying formation obstacle avoidance algorithm for multiple AUVs. Int. J. Robot. Autom. 2023, 38, 194–207. [Google Scholar] [CrossRef]
  10. Huang, Z.; Zhu, D.; Sun, B. A multi-AUV cooperative hunting method in 3-D underwater environment with obstacle. Eng. Appl. Artif. Intell. 2016, 50, 192–200. [Google Scholar] [CrossRef]
  11. Liang, H.; Fu, Y.; Kang, F.; Gao, J.; Qiang, N. A Behavior-Driven Coordination Control Framework for Target Hunting by UUV Intelligent Swarm. IEEE Access 2020, 8, 4838–4859. [Google Scholar] [CrossRef]
  12. Cao, X.; Xu, X. Hunting Algorithm for Multi-AUV Based on Dynamic Prediction of Target Trajectory in 3D Underwater Environment. IEEE Access 2020, 8, 138529–138538. [Google Scholar] [CrossRef]
  13. Petritoli, E.; Cagnetti, M.; Leccese, F. Simulation of autonomous underwater vehicles (auvs) swarm diffusion. Sensors 2020, 20, 4950. [Google Scholar] [CrossRef] [PubMed]
  14. Fossen, T.I. Guidance and Control of Ocean Vehicles. Ph.D. Thesis, University of Trondheim, Trondheim, Norway, 1999. [Google Scholar]
  15. Zhong, J.; Xiang, G.; Dian, S. Efficient RRT* path planning algorithm for complex environments with narrow passages. J. Appl. Res. Comput. 2021, 38, 23082314. [Google Scholar]
  16. Zhang, M.; Cai, W. Underwater targets tracking path planning based on task cooperation of multiple AUVs. Chin. J. Sens. Actuators 2018, 31, 1101–1107. [Google Scholar]
  17. Duan, H.; Zhao, J.; Deng, Y.; Shi, Y.; Ding, X. Dynamic discrete pigeon-inspired optimization for multi-UAV cooperative search-attack mission planning. IEEE Trans. Aerosp. Electron. Syst. 2020, 57, 706–720. [Google Scholar] [CrossRef]
  18. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef] [PubMed]
  19. Yin, W.; Kann, K.; Yu, M.; Schütze, H. Comparative Study of CNN and RNN for Natural Language Processing. arXiv 2017, arXiv:1702.01923. [Google Scholar]
  20. Lowe, R.; Wu, Y.I.; Tamar, A.; Harb, J.; Pieter Abbeel, O.; Mordatch, I. Multi-agent actor-critic for mixed cooperative-competitive environments. In Proceedings of the NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  21. Taieb, S.B.; Sorjamaa, A.; Bontempi, G. Multiple-output modeling for multi-step-ahead time series forecasting. Neurocomputing 2010, 73, 1950–1957. [Google Scholar] [CrossRef]
Figure 1. A complete simulation process of multi-AUVs formation execution mission. The simulation process delineates the complete execution of a multi-AUV formation mission. The diagram bifurcates into two segments. The left side portrays the AUV fleet engaging in a long-range transit, moving from the starting point towards the target point, preparing for the collaborative search phase. On the right side, subsequent to reaching the target point, the AUV fleet initiates a collaborative search employing the optimized formation. Upon target detection, the fleet predicts the target’s trajectory, facilitating interception.
Figure 1. A complete simulation process of multi-AUVs formation execution mission. The simulation process delineates the complete execution of a multi-AUV formation mission. The diagram bifurcates into two segments. The left side portrays the AUV fleet engaging in a long-range transit, moving from the starting point towards the target point, preparing for the collaborative search phase. On the right side, subsequent to reaching the target point, the AUV fleet initiates a collaborative search employing the optimized formation. Upon target detection, the fleet predicts the target’s trajectory, facilitating interception.
Sensors 24 00437 g001
Figure 2. The proposed method consists of long-range ferry, collaborative detection, target trajectory prediction, and collaborative surrounding four stages.
Figure 2. The proposed method consists of long-range ferry, collaborative detection, target trajectory prediction, and collaborative surrounding four stages.
Sensors 24 00437 g002
Figure 3. RRT flowchart. In the initial phase of the algorithm, RRT designates the starting point as the root node and then randomly samples a point q r a n d . Subsequently, RRT identifies the nearest node q n e a r , in the tree structure and establishes a connection between q n e a r and q r a n d . If the path between q n e w and q r a n d does not intersect with obstacles, RRT adds q n e w to the tree. During each iteration, RRT checks if the new node has reached the goal. If so, RRT generates the final path and concludes the algorithm. This iterative process involves continuously expanding new nodes in the tree to rapidly explore feasible paths until reaching the specified number of iterations or satisfying the termination conditions.
Figure 3. RRT flowchart. In the initial phase of the algorithm, RRT designates the starting point as the root node and then randomly samples a point q r a n d . Subsequently, RRT identifies the nearest node q n e a r , in the tree structure and establishes a connection between q n e a r and q r a n d . If the path between q n e w and q r a n d does not intersect with obstacles, RRT adds q n e w to the tree. During each iteration, RRT checks if the new node has reached the goal. If so, RRT generates the final path and concludes the algorithm. This iterative process involves continuously expanding new nodes in the tree to rapidly explore feasible paths until reaching the specified number of iterations or satisfying the termination conditions.
Sensors 24 00437 g003
Figure 4. The RRT (Fast Random Tree Method) planning path. The red point signifies the target point, the blue point represents the AUV, and the blue curve out-lines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9.
Figure 4. The RRT (Fast Random Tree Method) planning path. The red point signifies the target point, the blue point represents the AUV, and the blue curve out-lines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9.
Sensors 24 00437 g004
Figure 5. Artificial potential field obstacle avoidance principle. The three black dots symbolize the obstacles, with the fan-shaped area representing the AUV’s detection range. The three black arrows illustrate the repulsive forces exerted by the obstacles on the AUV, while the red arrow represents the combined force resulting from these three repulsive forces.
Figure 5. Artificial potential field obstacle avoidance principle. The three black dots symbolize the obstacles, with the fan-shaped area representing the AUV’s detection range. The three black arrows illustrate the repulsive forces exerted by the obstacles on the AUV, while the red arrow represents the combined force resulting from these three repulsive forces.
Sensors 24 00437 g005
Figure 6. Formation structure. D represents the communication distance between AUVs.
Figure 6. Formation structure. D represents the communication distance between AUVs.
Sensors 24 00437 g006
Figure 7. Environment probabilities. After the AUV formation traverses a specific area, the information entropy map is updated based on the detection information gathered by the AUV fleet. A probability of 1 or 0 indicates that the information about the target in that region is certain, signifying the presence or absence of a target, respectively. A probability of 0.5 signifies the highest level of uncertainty in the information for that area, representing a completely unknown state. The dashed areas signify more distant environmental regions.
Figure 7. Environment probabilities. After the AUV formation traverses a specific area, the information entropy map is updated based on the detection information gathered by the AUV fleet. A probability of 1 or 0 indicates that the information about the target in that region is certain, signifying the presence or absence of a target, respectively. A probability of 0.5 signifies the highest level of uncertainty in the information for that area, representing a completely unknown state. The dashed areas signify more distant environmental regions.
Sensors 24 00437 g007
Figure 8. Target localization for cooperative detection tasks based on Shannon entropy. The red arrow indicates the direction in which the target is more likely to move, and the blue arrow indicates the direction in which the target is less likely to move.
Figure 8. Target localization for cooperative detection tasks based on Shannon entropy. The red arrow indicates the direction in which the target is more likely to move, and the blue arrow indicates the direction in which the target is less likely to move.
Sensors 24 00437 g008
Figure 9. Single-layer LSTM structure diagram.
Figure 9. Single-layer LSTM structure diagram.
Sensors 24 00437 g009
Figure 10. The input and the output of the LSTM network.
Figure 10. The input and the output of the LSTM network.
Sensors 24 00437 g010
Figure 11. Schematic diagram of collaborative rounding.
Figure 11. Schematic diagram of collaborative rounding.
Sensors 24 00437 g011
Figure 12. Simultaneous approach strategy in the limited communication environment.
Figure 12. Simultaneous approach strategy in the limited communication environment.
Sensors 24 00437 g012
Figure 13. AUV formation surrounding attack reward map.
Figure 13. AUV formation surrounding attack reward map.
Sensors 24 00437 g013
Figure 14. The RRT algorithm plots 4 paths in different situations. To assess the efficacy of the Rapidly Exploring Random Tree (RRT) algorithm, various task scenarios were devised for reliability verification. The red point signifies the target point, the blue point represents the AUV, and the blue curve outlines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9. In (a), the target occupies the upper left, with the Unmanned Underwater Vehicle (UUV) situated in the lower left. In (b), the target is positioned in the upper right, while the UUV remains in the lower left. (c) depicts the target in the upper left, and the UUV in the lower left. Finally, in (d), the target is located in the upper right, and the UUV is in the lower right. The consistent success across diverse scenarios underscores the robust applicability of the RRT algorithm.
Figure 14. The RRT algorithm plots 4 paths in different situations. To assess the efficacy of the Rapidly Exploring Random Tree (RRT) algorithm, various task scenarios were devised for reliability verification. The red point signifies the target point, the blue point represents the AUV, and the blue curve outlines the planned trajectory. The gray area denotes the obstacle zone with a weight of 0.5, while the red area signifies the obstacle area with a weight of 0.9. In (a), the target occupies the upper left, with the Unmanned Underwater Vehicle (UUV) situated in the lower left. In (b), the target is positioned in the upper right, while the UUV remains in the lower left. (c) depicts the target in the upper left, and the UUV in the lower left. Finally, in (d), the target is located in the upper right, and the UUV is in the lower right. The consistent success across diverse scenarios underscores the robust applicability of the RRT algorithm.
Sensors 24 00437 g014
Figure 15. Two possible topological structures of the formation. (a) Each AUV establishes a communication relationship with its two connected AUVs, and the entire communication network is in a stable state; (b) AUV III only establishes a communication relationship with AUV I, so while AUV I, AUV II, and Main AUV can communicate stably, AUV III is unstable in the communication network.
Figure 15. Two possible topological structures of the formation. (a) Each AUV establishes a communication relationship with its two connected AUVs, and the entire communication network is in a stable state; (b) AUV III only establishes a communication relationship with AUV I, so while AUV I, AUV II, and Main AUV can communicate stably, AUV III is unstable in the communication network.
Sensors 24 00437 g015
Figure 16. Parametric model of the formation structure.
Figure 16. Parametric model of the formation structure.
Sensors 24 00437 g016
Figure 17. Schematic diagram of formation detection.
Figure 17. Schematic diagram of formation detection.
Sensors 24 00437 g017
Figure 18. Iterative diagram of the target.
Figure 18. Iterative diagram of the target.
Sensors 24 00437 g018
Figure 19. Optimized formation diagram. The four red circles represent the optimal locations for each of the four underwater vehicles.
Figure 19. Optimized formation diagram. The four red circles represent the optimal locations for each of the four underwater vehicles.
Sensors 24 00437 g019
Figure 20. Navigation detection of AUV formations. This figure illustrates the AUV formation aligned with the optimized configuration for collaborative detection tasks. The dotted green lines symbolize the communication chain. The AUVs sporting blue heads acting as followers, while the blue lines depict the paths of the followers. The leader, distinguished by a red head, is denoted by solid red lines outlining its trajectory.
Figure 20. Navigation detection of AUV formations. This figure illustrates the AUV formation aligned with the optimized configuration for collaborative detection tasks. The dotted green lines symbolize the communication chain. The AUVs sporting blue heads acting as followers, while the blue lines depict the paths of the followers. The leader, distinguished by a red head, is denoted by solid red lines outlining its trajectory.
Sensors 24 00437 g020
Figure 21. The prediction effectiveness obtained by the LSTM network on the conditions of different target maneuver angle. (a) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 15 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (b) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 30 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (c) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 60 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (d) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 90 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively.
Figure 21. The prediction effectiveness obtained by the LSTM network on the conditions of different target maneuver angle. (a) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 15 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (b) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 30 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (c) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 60 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively; (d) The prediction effect of the LSTM network under the condition that the maneuver Angle of the target is 90 degrees, the orange dots and the blue dots correspond to the predicted and actual positions of the target respectively.
Sensors 24 00437 g021
Figure 22. Multi-AUVs formation surrounding success rate.
Figure 22. Multi-AUVs formation surrounding success rate.
Sensors 24 00437 g022
Figure 23. The process of AUV surrounding targets. This figure depicts the three orange AUVs completing a process for surrounding black target. The blue tracks outline their trajectories, while the red line shows the target’s motion.
Figure 23. The process of AUV surrounding targets. This figure depicts the three orange AUVs completing a process for surrounding black target. The blue tracks outline their trajectories, while the red line shows the target’s motion.
Sensors 24 00437 g023
Figure 24. The AUV trajectory without using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.
Figure 24. The AUV trajectory without using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.
Sensors 24 00437 g024
Figure 25. The AUV trajectory using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.
Figure 25. The AUV trajectory using LSTM prediction on the condition of 8 knot and 60 degrees where the target is set as purple. The blue, orange, red, and green tracks depict the motion trajectories of the four AUVs.
Sensors 24 00437 g025
Table 1. Optimization results.
Table 1. Optimization results.
AlgorithmDesign VariablesObjective Variable
PSO (w = 0.4)0.691358.972.371294.732.271529.89−0.80
PSO (w = 1)−0.872000.004.012000.004.012000.00−1.46
PSO (w = 2)−0.872000.004.012000.004.012000.00−1.47
GA−0.861865.294.011870.174.011860.75−1.37
Table 2. Surrounding attack success rates using different trajectory prediction methods on 9 target instances. The black upward arrow signifies a rise in success rate, while the downward arrow denotes a decrease in success rate.
Table 2. Surrounding attack success rates using different trajectory prediction methods on 9 target instances. The black upward arrow signifies a rise in success rate, while the downward arrow denotes a decrease in success rate.
Target
Velocity (knot)
Maneuver AngleNot Using the LSTM Model
Success Rate
Using the LSTM Model
Success Rate
69084% ↓100% ↑
66077% ↓100% ↑
64574% ↓84% ↑
89080% ↓98% ↑
86072% ↓100% ↑
84523% ↓79% ↑
109079% ↓98% ↑
106079% ↓100% ↑
104523% ↓72% ↑
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wen, Z.; Wang, Z.; Zhou, D.; Qin, D.; Jiang, Y.; Liu, J.; Dong, H. Research on Multiple-AUVs Collaborative Detection and Surrounding Attack Simulation. Sensors 2024, 24, 437. https://doi.org/10.3390/s24020437

AMA Style

Wen Z, Wang Z, Zhou D, Qin D, Jiang Y, Liu J, Dong H. Research on Multiple-AUVs Collaborative Detection and Surrounding Attack Simulation. Sensors. 2024; 24(2):437. https://doi.org/10.3390/s24020437

Chicago/Turabian Style

Wen, Zhiwen, Zhong Wang, Daming Zhou, Dezhou Qin, Yichen Jiang, Junchang Liu, and Huachao Dong. 2024. "Research on Multiple-AUVs Collaborative Detection and Surrounding Attack Simulation" Sensors 24, no. 2: 437. https://doi.org/10.3390/s24020437

APA Style

Wen, Z., Wang, Z., Zhou, D., Qin, D., Jiang, Y., Liu, J., & Dong, H. (2024). Research on Multiple-AUVs Collaborative Detection and Surrounding Attack Simulation. Sensors, 24(2), 437. https://doi.org/10.3390/s24020437

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop