DOI: 10.1145/3424636.3426894

Deep Integration of Physical Humanoid Control and Crowd Navigation

Published: 22 November 2020

Abstract

Many multi-agent navigation approaches make use of simplified agent representations, such as disks. These simplifications allow for fast simulation of thousands of agents but limit simulation accuracy and fidelity. In this paper, we propose a fully integrated physical character control and multi-agent navigation method. In place of sample-complex online planning methods, we extend the use of recent deep reinforcement learning techniques. This extension improves on multi-agent navigation models and simulated humanoids by combining Multi-Agent and Hierarchical Reinforcement Learning. We train a single short-term goal-conditioned low-level policy to provide directed walking behaviour. This task-agnostic controller can be shared by higher-level policies that perform longer-term planning. The proposed approach produces reciprocal collision avoidance, robust navigation, and emergent crowd behaviours. Furthermore, it offers several key affordances not previously possible in multi-agent navigation, including tunable character morphology and physically accurate interactions with agents and the environment. Our results show that the proposed method outperforms prior methods across environments and tasks, and performs well in terms of both zero-shot generalization over different numbers of agents and computation time.
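To make the hierarchical structure concrete, the sketch below shows the general pattern the abstract describes: a low-level policy conditioned on a short-term goal produces joint-level walking actions at every physics step, while a high-level policy re-plans that goal at a coarser timescale. This is a minimal illustration, not the authors' implementation: the observation, goal, and action dimensions, the network sizes, and the re-planning period are all invented for the example, the policies are left untrained (in the paper both levels are learned with deep reinforcement learning), and the physics environment is omitted.

import numpy as np

# All dimensions below are hypothetical, chosen only for illustration.
OBS_DIM = 32    # proprioceptive humanoid state (assumed)
GOAL_DIM = 2    # short-term 2D walking target, relative to the agent (assumed)
ACT_DIM = 17    # joint-level actuation (assumed)
HL_PERIOD = 30  # low-level steps per high-level decision (assumed)

def init_mlp(sizes, rng):
    """Randomly initialize (weight, bias) pairs for a small MLP."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

rng = np.random.default_rng(0)
# Low-level policy: (proprioception, short-term goal) -> joint actions.
# One such task-agnostic controller can be shared across higher-level tasks.
low_level = init_mlp([OBS_DIM + GOAL_DIM, 64, 64, ACT_DIM], rng)
# High-level policy: navigation observation -> short-term goal for the
# shared low-level walking controller.
high_level = init_mlp([OBS_DIM, 64, GOAL_DIM], rng)

obs = np.zeros(OBS_DIM)
goal = np.zeros(GOAL_DIM)
for t in range(300):
    if t % HL_PERIOD == 0:
        # Longer-term planning: pick a new walking target at a coarse timescale.
        goal = mlp_forward(high_level, obs)
    # Directed walking: the low-level controller tracks the current goal.
    action = mlp_forward(low_level, np.concatenate([obs, goal]))
    # obs = env.step(action)  # physics step of the humanoid (omitted here)

Sharing one task-agnostic walking controller in this way lets each higher-level navigation policy plan over short-term goals rather than joint torques, which is what makes the low-level skill reusable across tasks and numbers of agents.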

Supplementary Material

MP4 File (a15-haworth-video1.mp4)
MP4 File (a15-haworth-video2.mp4)
MP4 File (a15-haworth-video3.mp4)



Published In

MIG '20: Proceedings of the 13th ACM SIGGRAPH Conference on Motion, Interaction and Games
October 2020
190 pages
ISBN: 9781450381710
DOI: 10.1145/3424636
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 November 2020


Author Tags

  1. Crowd Simulation
  2. Multi-Agent Learning
  3. Physics-based Simulation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Ontario Research Fund
  • NSERC

Conference

MIG '20: Motion, Interaction and Games
October 16-18, 2020
Virtual Event, SC, USA



Cited By

  • (2024) UAV-Assisted NOMA Network Power Allocation Under Offshore Multi-energy Complementary Power Generation System. Proceedings of the 13th International Conference on Computer Engineering and Networks, 125-136. https://doi.org/10.1007/978-981-99-9239-3_12 (4 Jan 2024)
  • (2024) Crowd evacuation simulation based on hierarchical agent model and physics-based character control. Computer Animation and Virtual Worlds 35, 3. https://doi.org/10.1002/cav.2263 (27 May 2024)
  • (2023) MAAIP. Proceedings of the ACM on Computer Graphics and Interactive Techniques 6, 3, 1-20. https://doi.org/10.1145/3606926 (24 Aug 2023)
  • (2023) Simulation and Retargeting of Complex Multi-Character Interactions. ACM SIGGRAPH 2023 Conference Proceedings, 1-11. https://doi.org/10.1145/3588432.3591491 (23 Jul 2023)
  • (2023) Heterogeneous Crowd Simulation Using Parametric Reinforcement Learning. IEEE Transactions on Visualization and Computer Graphics 29, 4, 2036-2052. https://doi.org/10.1109/TVCG.2021.3139031 (1 Apr 2023)
  • (2023) Multi Agent Navigation in Unconstrained Environments using a Centralized Attention based Graphical Neural Network Controller. 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), 2893-2900. https://doi.org/10.1109/ITSC57777.2023.10422072 (24 Sep 2023)
  • (2023) Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 13756-13766. https://doi.org/10.1109/CVPR52729.2023.01322 (Jun 2023)
  • (2023) Human Motion Synthesis Using Trigonometric Splines. IEEE Access 11, 14293-14308. https://doi.org/10.1109/ACCESS.2023.3244062 (2023)
  • (2022) Physics-based character controllers using conditional VAEs. ACM Transactions on Graphics 41, 4, 1-12. https://doi.org/10.1145/3528223.3530067 (22 Jul 2022)
  • (2022) Authoring Virtual Crowds: A Survey. Computer Graphics Forum 41, 2, 677-701. https://doi.org/10.1111/cgf.14506 (24 May 2022)
