Improving Control Performance of Unmanned Aerial Vehicles through Shared Experience

A Correction to this article was published on 22 January 2022

This article has been updated

Abstract

This work proposes a novel approach for improving the control performance of Unmanned Aerial Vehicles (UAVs) through cooperative reinforcement learning. We show that, by sharing their experience, multiple UAVs can converge on a set of optimal Model Predictive Control (MPC) parameters faster than they can when learning on their own. To benefit from this shared experience, the UAVs must coordinate their learning strategies. We propose a Leader-Follower approach, whereby the Leader ensures that all trials are drawn from the same distribution and contribute to a common payoff game of Learning Automata. Experimental results show that this approach yields faster learning without any loss of performance.
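The idea of several vehicles feeding trials into one common payoff game of Learning Automata can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the update rule, so a standard linear reward-inaction update is assumed, and `toy_payoff`, `run`, the learning rate, and the discretised candidate values are all hypothetical stand-ins for real flight trials and MPC tuning parameters.

```python
import random


def lri_update(probs, chosen, reward, rate):
    """Linear reward-inaction update (assumed rule, not taken from the paper).

    Probability mass moves toward `chosen` in proportion to `reward`,
    which must lie in [0, 1]. A zero reward leaves `probs` unchanged.
    """
    return [
        p + rate * reward * (1.0 - p) if i == chosen else p - rate * reward * p
        for i, p in enumerate(probs)
    ]


def sample(probs, rng):
    """Draw one action index according to the automaton's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def toy_payoff(actions, rng):
    """Hypothetical stand-in for a flight trial.

    Returns a noisy reward in [0, 1] that is highest when every
    automaton selects candidate index 2.
    """
    score = sum(1.0 for a in actions if a == 2) / len(actions)
    return min(1.0, max(0.0, score + rng.uniform(-0.05, 0.05)))


def run(num_uavs, rounds, num_params=2, num_values=4, rate=0.05, seed=0):
    """One automaton per tuning parameter, shared by all UAVs."""
    rng = random.Random(seed)
    probs = [[1.0 / num_values] * num_values for _ in range(num_params)]
    for _ in range(rounds):
        # The leader draws every UAV's trial from the same distribution,
        # i.e. the probabilities as they stand at the start of the round.
        trials = [[sample(p, rng) for p in probs] for _ in range(num_uavs)]
        for actions in trials:
            # Common payoff: every automaton receives the same reward.
            reward = toy_payoff(actions, rng)
            probs = [
                lri_update(p, a, reward, rate) for p, a in zip(probs, actions)
            ]
    return probs


if __name__ == "__main__":
    for label, n in (("single UAV", 1), ("four UAVs", 4)):
        final = run(num_uavs=n, rounds=200)
        print(label, [[round(p, 3) for p in row] for row in final])
```

With more UAVs, each round contributes more updates to the shared probability vectors, which is the intuition behind faster learning per round. Whether that carries over to real MPC tuning depends on the payoff function and learning rate, neither of which is given in the abstract.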

Change history

22 January 2022
A Correction to this paper has been published: https://doi.org/10.1007/s10846-021-01559-z

References

Abdolhosseini, M., Zhang, Y., Rabbath, C.: Trajectory tracking with model predictive control for an unmanned quad-rotor helicopter: Theory and flight test results. In: Proceedings of the 5th International Conference on Intelligent Robotics and Applications - Volume Part I, pp 411–420. Springer, Berlin (2012)
Atia, M., Donnelly, C., Noureldin, A., Korenberg, M.: A novel systems integration approach for multi-sensor integrated navigation systems. In: 2014 IEEE International Systems Conference Proceedings, pp. 554–558. https://doi.org/10.1109/SysCon.2014.6819310 (2014)
Ayyad, A., Chehadeh, M., Awad, M.I., Zweiri, Y.: Real-time system identification using deep learning for linear processes with application to unmanned aerial vehicles. IEEE Access 8, 122539–122553 (2020)
Bemporad, A.: A quadratic programming algorithm based on nonnegative least squares with applications to embedded model predictive control. IEEE Trans. Autom. Control 61(4), 1111–1116 (2016). https://doi.org/10.1109/TAC.2015.2459211
Burgard, W., Moors, M., Stachniss, C., Schneider, F.E.: Coordinated multi-robot exploration. IEEE Trans. Robot. 21(3), 376–386 (2005)
Cao, G., Lai, E.M.K., Alam, F.: Gaussian process model predictive control of an unmanned quadrotor. J. Intell. Robot. Syst. 88(1), 147–162 (2017). https://doi.org/10.1007/s10846-017-0549-y
D’Amato, E., Mattei, M., Notaro, I.: Distributed reactive model predictive control for collision avoidance of unmanned aerial vehicles in civil airspace. Journal of Intelligent & Robotic Systems. https://doi.org/10.1007/s10846-019-01047-5 (2019)
Dentler, J., Rosalie, M., Danoy, G., Bouvry, P., Kannan, S., Olivares-Mendez, M.A., Voos, H.: Collision avoidance effects on the mobility of a uav swarm using chaotic ant colony with model predictive control. J. Intell. Robot. Syst. 93(1), 227–243 (2019). https://doi.org/10.1007/s10846-018-0822-8
Devia, C.A., Rojas, J.P., Petro, E., Martinez, C., Mondragon, I.F., Patino, D., Rebolledo, M.C., Colorado, J.: High-throughput biomass estimation in rice crops using uav multispectral imagery. J. Intell. Robot. Syst. 96(3), 573–589 (2019). https://doi.org/10.1007/s10846-019-01001-5
Dudek, G., Jenkin, M., Milios, E., Wilkes, D.: A taxonomy for multi-agent robotics. Auton. Robot. 3, 375–397 (1996)
Emami, S.A., Banazadeh, A.: Online identification of aircraft dynamics in the presence of actuator faults. J. Intell. Robot. Syst. 96(3), 541–553 (2019). https://doi.org/10.1007/s10846-019-00998-z
Hafez, A., Iskandarani, M., Givigi, S., Yousefi, S., Rabbath, C.A., Beaulieu, A.: Using linear model predictive control via feedback linearization for dynamic encirclement. In: Proc. of the American Control Conf., pp. 3868–3873. https://doi.org/10.1109/ACC.2014.6858619 (2014)
Jardine, P.T., Kogan, M., Givigi, S.N., Yousefi, S.: Adaptive predictive control of a differential drive robot tuned with reinforcement learning. Int. J. Adapt. Control Signal Process. 33(2), 410–423 (2019). https://doi.org/10.1002/acs.2882
Kamesh, R., Rani, K.Y.: Novel formulation of adaptive MPC as EKF using ANN model: Multiproduct semibatch polymerization reactor case study. IEEE Trans. Neural Netw. Learn. Syst. 28(12), 3061–3073 (2017). https://doi.org/10.1109/TNNLS.2016.2614878
Lecointe, M., Chanel, C.P.C., Defay, F.: Backstepping control law application to path tracking with an indoor quadrotor. In: Proceedings of European Aerospace Guidance Navigation and Control Conference (EuroGNC), Toulouse, FR, pp. 1–19. http://oatao.univ-toulouse.fr/14669/ (2015)
Lian, C., Xu, X., Chen, H., He, H.: Near-optimal tracking control of mobile robots via receding-horizon dual heuristic programming. IEEE Trans. Cybern. 46(11), 2484–2496 (2016). https://doi.org/10.1109/TCYB.2015.2478857
Ljung, L., Hjalmarsson, H., Ohlsson, H.: Four encounters with system identification. European Journal of Control 17(5), 449–471 (2011). https://doi.org/10.3166/ejc.17.449-471, http://www.sciencedirect.com/science/article/pii/S0947358011709712
Mayne, D., Rawlings, J., Rao, C., Scokaert, P.: Constrained model predictive control: Stability and optimality. Automatica 36(6), 789–814 (2000)
Meier, L., Tanskanen, P., Heng, L., Lee, G.H., Fraundorfer, F., Pollefeys, M.: Pixhawk: a micro aerial vehicle design for autonomous flight using onboard computer vision. Auton. Robot. 33(1-2), 21–39 (2012). https://doi.org/10.1007/s10514-012-9281-4
Benosman, M.: Model-based vs data-driven adaptive control: an overview. Int. J. Adapt. Control Signal Process. 32(5), 753–776 (2017). https://doi.org/10.1002/acs.2862
Narendra, K.S., Thathachar, M.A.L.: Learning Automata: An Introduction. Prentice-Hall, Inc., Upper Saddle River (1989)
Nowé, A., Verbeeck, K., Peeters, M.: Learning automata as a basis for multi agent reinforcement learning. In: Tuyls, K., Hoen, P.J., Verbeeck, K., Sen, S. (eds.) Learning and Adaption in Multi-Agent Systems, pp. 71–85. Springer, Berlin (2006)
Petrović, V.M.: Artificial intelligence and virtual worlds – toward human-level ai agents. IEEE Access 6, 39976–39988 (2018)
Raslan, H., Schwartz, H., Givigi, S.: A learning invader for the “guarding a territory” game. J. Intell. Robot. Syst. 83(1), 55–70 (2016). https://doi.org/10.1007/s10846-015-0317-9
Rossiter, J.A.: Model-Based Predictive Control: A Practical Approach, 1st edn. CRC Press LLC, Boca Raton (2004)
Samal, M.K., Garratt, M., Pota, H., Sangani, H.T.: Model predictive flight controller for longitudinal and lateral cyclic control of an unmanned helicopter. In: 2012 2nd Australian Control Conference, pp. 386–391 (2012)
dos Santos, S.R.B., Givigi, S.N., Nascimento, C.L.: Autonomous construction of multiple structures using learning automata: Description and experimental validation. IEEE Syst. J. 9(4), 1376–1387 (2015). https://doi.org/10.1109/JSYST.2014.2374334
Givigi, S.N., Jr., Schwartz, H.M.: Decentralized strategy selection with learning automata for multiple pursuer–evader games. Adapt. Behav. 22(4), 221–234 (2014). https://doi.org/10.1177/1059712314526261
Thathachar, M.A.L., Arvind, M.T.: Parallel algorithms for modules of learning automata. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 28(1), 24–33 (1998). https://doi.org/10.1109/3477.658575
Thathachar, M.A.L., Sastry, P.S.: Networks of Learning Automata: Techniques for Online Stochastic Optimization. Springer, New York (2003)

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Royal Military College of Canada, Kingston, Canada
Peter Travis Jardine
School of Computing, Queen’s University, Kingston, Canada
Sidney Givigi

Corresponding author

Correspondence to Sidney Givigi.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: ref. 23 was incorrect and should have read “Petrović VM: Artificial intelligence and virtual worlds – toward human-level ai agents. IEEE Access 6, 39976–39988 (2018)”.

About this article

Cite this article

Jardine, P.T., Givigi, S. Improving Control Performance of Unmanned Aerial Vehicles through Shared Experience. J Intell Robot Syst 102, 68 (2021). https://doi.org/10.1007/s10846-021-01387-1

Received: 22 March 2020
Accepted: 31 March 2021
Published: 25 June 2021
DOI: https://doi.org/10.1007/s10846-021-01387-1
