Abstract
This work proposes a novel approach for improving the control performance of Unmanned Aerial Vehicles (UAVs) through cooperative reinforcement learning. We show that, by sharing their experience, multiple UAVs can converge on a set of optimal Model Predictive Control (MPC) parameters faster than when learning on their own. To benefit from this shared experience, the UAVs must coordinate their learning strategies. We propose a Leader-Follower approach, whereby the Leader ensures that all trials are drawn from the same distribution and contribute to a common payoff game of Learning Automata. Experimental results show that this approach yields faster learning without any loss of performance.
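The shared-experience idea can be illustrated with a small sketch. The abstract does not state which update rule the automata use, so the code below assumes the standard linear reward-inaction scheme from the learning automata literature, a single shared action-probability vector held by the Leader, and a Bernoulli reward per trial. The names `lri_update`, `run_shared_learning`, `payoffs`, and `rate` are hypothetical and chosen only for illustration; each action stands in for one candidate set of MPC parameters.

```python
import random


def lri_update(probs, action, reward, rate=0.05):
    """Linear reward-inaction update (assumed scheme, not taken from the paper).

    The chosen action gains probability in proportion to the reward in [0, 1];
    a reward of 0 leaves the distribution unchanged.
    """
    step = rate * reward
    return [
        p + step * (1.0 - p) if i == action else p * (1.0 - step)
        for i, p in enumerate(probs)
    ]


def run_shared_learning(payoffs, n_agents, n_rounds, rate=0.05, seed=0):
    """Several agents contribute trials to one common probability vector.

    payoffs[i] is a hypothetical success probability for candidate i.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(payoffs)] * len(payoffs)
    for _ in range(n_rounds):
        # Leader role: all agents in a round sample from the same distribution.
        actions = rng.choices(range(len(payoffs)), weights=probs, k=n_agents)
        for action in actions:
            # Each agent runs one trial and reports a binary reward.
            reward = 1.0 if rng.random() < payoffs[action] else 0.0
            probs = lri_update(probs, action, reward, rate)
    return probs


if __name__ == "__main__":
    # Compare one agent against three agents over the same number of rounds.
    print(run_shared_learning([0.2, 0.5, 0.9], n_agents=1, n_rounds=100))
    print(run_shared_learning([0.2, 0.5, 0.9], n_agents=3, n_rounds=100))
```

Under these assumptions, more agents supply more trials per round to the same distribution, which is one way to picture why shared experience could speed up convergence. The paper's actual coordination scheme and reward signal may differ.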
Change history
22 January 2022
A Correction to this paper has been published: https://doi.org/10.1007/s10846-021-01559-z
References
Abdolhosseini, M., Zhang, Y., Rabbath, C.: Trajectory tracking with model predictive control for an unmanned quad-rotor helicopter: Theory and flight test results. In: Proceedings of the 5th International Conference on Intelligent Robotics and Applications - Volume Part I, pp 411–420. Springer, Berlin (2012)
Atia, M., Donnelly, C., Noureldin, A., Korenberg, M.: A novel systems integration approach for multi-sensor integrated navigation systems. In: 2014 IEEE International Systems Conference Proceedings, pp. 554–558. https://doi.org/10.1109/SysCon.2014.6819310 (2014)
Ayyad, A., Chehadeh, M., Awad, M.I., Zweiri, Y.: Real-time system identification using deep learning for linear processes with application to unmanned aerial vehicles. IEEE Access 8, 122539–122553 (2020)
Bemporad, A.: A quadratic programming algorithm based on nonnegative least squares with applications to embedded model predictive control. IEEE Trans. Autom. Control 61(4), 1111–1116 (2016). https://doi.org/10.1109/TAC.2015.2459211
Burgard, W., Moors, M., Stachniss, C., Schneider, F.E.: Coordinated multi-robot exploration. IEEE Trans. Robot. 21(3), 376–386 (2005)
Cao, G., Lai, E.M.K., Alam, F.: Gaussian process model predictive control of an unmanned quadrotor. J. Intell. Robot. Syst. 88(1), 147–162 (2017). https://doi.org/10.1007/s10846-017-0549-y
D’Amato, E., Mattei, M., Notaro, I.: Distributed reactive model predictive control for collision avoidance of unmanned aerial vehicles in civil airspace. J. Intell. Robot. Syst. https://doi.org/10.1007/s10846-019-01047-5 (2019)
Dentler, J., Rosalie, M., Danoy, G., Bouvry, P., Kannan, S., Olivares-Mendez, M.A., Voos, H.: Collision avoidance effects on the mobility of a UAV swarm using chaotic ant colony with model predictive control. J. Intell. Robot. Syst. 93(1), 227–243 (2019). https://doi.org/10.1007/s10846-018-0822-8
Devia, C.A., Rojas, J.P., Petro, E., Martinez, C., Mondragon, I.F., Patino, D., Rebolledo, M.C., Colorado, J.: High-throughput biomass estimation in rice crops using UAV multispectral imagery. J. Intell. Robot. Syst. 96(3), 573–589 (2019). https://doi.org/10.1007/s10846-019-01001-5
Dudek, G., Jenkin, M., Milios, E., Wilkes, D.: A taxonomy for multi-agent robotics. Auton. Robot. 3, 375–397 (1996)
Emami, S.A., Banazadeh, A.: Online identification of aircraft dynamics in the presence of actuator faults. J. Intell. Robot. Syst. 96(3), 541–553 (2019). https://doi.org/10.1007/s10846-019-00998-z
Hafez, A., Iskandarani, M., Givigi, S., Yousefi, S., Rabbath, C.A., Beaulieu, A.: Using linear model predictive control via feedback linearization for dynamic encirclement. In: Proc. of the American Control Conf., pp. 3868–3873. https://doi.org/10.1109/ACC.2014.6858619 (2014)
Jardine, P.T., Kogan, M., Givigi, S.N., Yousefi, S.: Adaptive predictive control of a differential drive robot tuned with reinforcement learning. Int. J. Adapt. Control Signal Process. 33(2), 410–423 (2019). https://doi.org/10.1002/acs.2882
Kamesh, R., Rani, K.Y.: Novel formulation of adaptive MPC as EKF using ANN model: Multiproduct semibatch polymerization reactor case study. IEEE Trans. Neural Netw. Learn. Syst. 28(12), 3061–3073 (2017). https://doi.org/10.1109/TNNLS.2016.2614878
Lecointe, M., Chanel, C.P.C., Defay, F.: Backstepping control law application to path tracking with an indoor quadrotor. In: Proceedings of European Aerospace Guidance Navigation and Control Conference (EuroGNC), Toulouse, FR, pp. 1–19. http://oatao.univ-toulouse.fr/14669/ (2015)
Lian, C., Xu, X., Chen, H., He, H.: Near-optimal tracking control of mobile robots via receding-horizon dual heuristic programming. IEEE Trans. Cybern. 46(11), 2484–2496 (2016). https://doi.org/10.1109/TCYB.2015.2478857
Ljung, L., Hjalmarsson, H., Ohlsson, H.: Four encounters with system identification. Eur. J. Control 17(5), 449–471 (2011). https://doi.org/10.3166/ejc.17.449-471, http://www.sciencedirect.com/science/article/pii/S0947358011709712
Mayne, D., Rawlings, J., Rao, C., Scokaert, P.: Constrained model predictive control: Stability and optimality. Automatica 36(6), 789–814 (2000)
Meier, L., Tanskanen, P., Heng, L., Lee, G.H., Fraundorfer, F., Pollefeys, M.: Pixhawk: a micro aerial vehicle design for autonomous flight using onboard computer vision. Auton. Robot. 33(1–2), 21–39 (2012). https://doi.org/10.1007/s10514-012-9281-4
Mouhacine, B.: Model-based vs data-driven adaptive control: an overview. Int. J. Adapt. Control Signal Process. 32(5), 753–776 (2017). https://doi.org/10.1002/acs.2862
Narendra, K.S., Thathachar, M.A.L.: Learning Automata: An Introduction. Prentice-Hall, Inc., Upper Saddle River (1989)
Nowé, A., Verbeeck, K., Peeters, M.: Learning automata as a basis for multi agent reinforcement learning. In: Tuyls, K., Hoen, P.J., Verbeeck, K., Sen, S. (eds.) Learning and Adaption in Multi-Agent Systems, pp. 71–85. Springer, Berlin (2006)
Petrović, V.M.: Artificial intelligence and virtual worlds – toward human-level AI agents. IEEE Access 6, 39976–39988 (2018)
Raslan, H., Schwartz, H., Givigi, S.: A learning invader for the “guarding a territory” game. J. Intell. Robot. Syst. 83(1), 55–70 (2016). https://doi.org/10.1007/s10846-015-0317-9
Rossiter, J.A.: Model-Based Predictive Control: A Practical Approach, 1st edn. CRC Press LLC, Boca Raton (2004)
Samal, M.K., Garratt, M., Pota, H., Sangani, H.T.: Model predictive flight controller for longitudinal and lateral cyclic control of an unmanned helicopter. In: 2012 2nd Australian Control Conference, pp. 386–391 (2012)
dos Santos, S.R.B., Givigi, S.N., Nascimento, C.L.: Autonomous construction of multiple structures using learning automata: Description and experimental validation. IEEE Syst. J. 9(4), 1376–1387 (2015). https://doi.org/10.1109/JSYST.2014.2374334
Givigi, S.N. Jr., Schwartz, H.M.: Decentralized strategy selection with learning automata for multiple pursuer–evader games. Adapt. Behav. 22(4), 221–234 (2014). https://doi.org/10.1177/1059712314526261
Thathachar, M.A.L., Arvind, M.T.: Parallel algorithms for modules of learning automata. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 28(1), 24–33 (1998). https://doi.org/10.1109/3477.658575
Thathachar, M.A.L., Sastry, P.S.: Networks of Learning Automata: Techniques for Online Stochastic Optimization. Springer, New York (2003)
Additional information
The original online version of this article was revised: In this article ref. 23 was incorrect and should have been “Petrović VM: Artificial intelligence and virtual worlds – toward human-level ai agents. IEEE Access 6, 39976–39988 (2018)”
Cite this article
Jardine, P.T., Givigi, S. Improving Control Performance of Unmanned Aerial Vehicles through Shared Experience. J Intell Robot Syst 102, 68 (2021). https://doi.org/10.1007/s10846-021-01387-1
DOI: https://doi.org/10.1007/s10846-021-01387-1