Abstract
In this paper we introduce a novel algorithm for online, distributed, non-myopic task selection in heterogeneous robotic teams. Our algorithm uses a temporal probabilistic representation that allows agents to evaluate their actions in the team’s joint action space while each robot individually searches its own action space. We use Monte-Carlo tree search to asymmetrically search through each robot’s individual action space while accounting for the probable future actions of its team members using the condensed temporal representation. This allows a distributed team of robots to non-myopically coordinate their actions in real time. Our method can be applied across a wide range of tasks, robot team compositions, and reward functions. To evaluate our coordination method, we implemented it in a series of simulated and fielded hardware trials, where it increased the cumulative team reward by up to \(47.2\%\) in the simulated trials compared with a distributed auction-based coordination method. We also performed several outdoor hardware trials with a team of three quadcopters, in which our method increased the maximum cumulative reward by \(24.5\%\) over the distributed auction-based coordination.
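To make the idea concrete, the following is a minimal, hypothetical sketch (in Python) of the kind of UCT-style Monte-Carlo tree search the abstract describes: a single robot searches over its own sequence of candidate tasks while the probable future actions of its teammates are condensed into a per-task probability that the task will be completed by someone else. All names here (plan_next_task, p_claimed, the reward table) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only (not the paper's implementation): UCT-style
# Monte-Carlo tree search over one robot's own task sequence. Teammates are
# summarised by p_claimed[task], an assumed stand-in for the paper's
# condensed temporal probability that another robot completes the task first.

import math
import random

class Node:
    def __init__(self, remaining, parent=None, task=None):
        self.remaining = remaining      # tasks still available along this branch
        self.parent = parent
        self.task = task                # task chosen to reach this node
        self.children = []
        self.visits = 0
        self.value = 0.0                # sum of rollout rewards

    def expanded(self):
        return len(self.children) == len(self.remaining)

def uct_select(node, c=1.4):
    # Upper Confidence Bound applied to Trees: exploitation + exploration.
    return max(node.children,
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def expand(node):
    # Add one untried task as a new child of this node.
    tried = {ch.task for ch in node.children}
    task = random.choice([t for t in node.remaining if t not in tried])
    child = Node([t for t in node.remaining if t != task], node, task)
    node.children.append(child)
    return child

def rollout(node, reward, p_claimed, depth=5):
    # Random playout: a task's expected reward is discounted by the
    # probability that a teammate completes it before this robot.
    tasks = list(node.remaining)
    random.shuffle(tasks)
    path = ([node.task] if node.task else []) + tasks[:depth]
    return sum(reward[t] * (1.0 - p_claimed[t]) for t in path)

def backprop(node, value):
    while node is not None:
        node.visits += 1
        node.value += value
        node = node.parent

def plan_next_task(tasks, reward, p_claimed, iterations=2000):
    root = Node(list(tasks))
    for _ in range(iterations):
        node = root
        while node.remaining and node.expanded():
            node = uct_select(node)
        if node.remaining:
            node = expand(node)
        backprop(node, rollout(node, reward, p_claimed))
    # Commit to the most-visited first task.
    return max(root.children, key=lambda ch: ch.visits).task

if __name__ == "__main__":
    tasks = ["A", "B", "C", "D"]
    reward = {"A": 10.0, "B": 8.0, "C": 6.0, "D": 4.0}
    p_claimed = {"A": 0.9, "B": 0.1, "C": 0.2, "D": 0.0}  # teammates likely take A
    print(plan_next_task(tasks, reward, p_claimed))
```

Note that in the paper the teammate model is a temporal probability distribution rather than a single scalar per task; the scalar p_claimed above is only a simplification to keep the sketch short.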
Notes
The hardware flight trials for this research were conducted under Oregon State University’s (OSU) Federal Aviation Administration Certificate of Authorization and logged in OSU’s compliance software, Drone Complier.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This is one of the several papers published in Autonomous Robots comprising the Special Issue on Foundations of Resilience for Networked Robotic Systems.
This work was supported by NASA Grant NNX14AI10G and Office of Naval Research Grant N00014-17-1-2581.
About this article
Cite this article
Smith, A.J., Best, G., Yu, J. et al. Real-time distributed non-myopic task selection for heterogeneous robotic teams. Auton Robot 43, 789–811 (2019). https://doi.org/10.1007/s10514-018-9811-9