DOI: 10.1145/3240508.3243653
research-article

VIVID: Virtual Environment for Visual Deep Learning

Published: 15 October 2018

Abstract

Due to advances in deep reinforcement learning and the demand for large amounts of training data, virtual-to-real learning has recently gained considerable attention from the computer vision community. Because state-of-the-art 3D engines can generate photo-realistic images suitable for training deep neural networks, researchers have gradually applied 3D virtual environments to learn different tasks, including autonomous driving, collision avoidance, and image segmentation, to name a few. Although many open-source simulation environments are already available, most of them either provide only small scenes or offer limited interaction with objects in the environment. To facilitate visual recognition learning, we present a new Virtual Environment for Visual Deep Learning (VIVID), which offers large-scale, diversified indoor and outdoor scenes. Moreover, VIVID leverages an advanced human skeleton system, which enables us to simulate numerous complex human actions. VIVID has a wide range of applications and can be used for learning indoor navigation, action recognition, event detection, and more. We also release several deep learning examples in Python to demonstrate the capabilities and advantages of our system.

Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous navigation
  2. deep learning
  3. event detection
  4. virtual reality
  5. visual recognition

Qualifiers

  • Research-article

Conference

MM '18: ACM Multimedia Conference
October 22 - 26, 2018
Seoul, Republic of Korea

Acceptance Rates

MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 34
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Artificial Intelligence for Web 3.0: A Comprehensive Survey. ACM Computing Surveys 56:10 (1-39). DOI: 10.1145/3657284. Online publication date: 14-May-2024
  • (2024) BC4LLM: A perspective of trusted artificial intelligence when blockchain meets large language models. Neurocomputing 599 (128089). DOI: 10.1016/j.neucom.2024.128089. Online publication date: Sep-2024
  • (2024) Integrating synthetic datasets with CLIP semantic insights for single image localization advancements. ISPRS Journal of Photogrammetry and Remote Sensing 218 (198-213). DOI: 10.1016/j.isprsjprs.2024.10.027. Online publication date: Dec-2024
  • (2024) Introduction of Web 3.0. Security and Privacy in Web 3.0 (1-14). DOI: 10.1007/978-981-97-5752-7_1. Online publication date: 10-Jul-2024
  • (2023) ThermalSynth: A Novel Approach for Generating Synthetic Thermal Human Scenarios. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (130-139). DOI: 10.1109/WACVW58289.2023.00018. Online publication date: Jan-2023
  • (2023) AI Wings: An AIoT Drone System for Commanding ArduPilot UAVs. IEEE Systems Journal 17:2 (2213-2224). DOI: 10.1109/JSYST.2022.3189011. Online publication date: Jun-2023
  • (2023) Augmented Reality for Real Object Detection. 2023 International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan) (803-804). DOI: 10.1109/ICCE-Taiwan58799.2023.10227067. Online publication date: 17-Jul-2023
  • (2023) The Role of Artificial Intelligence and Robotic Solution Technologies in Metaverse Design. Metaverse (45-63). DOI: 10.1007/978-981-99-4641-9_4. Online publication date: 13-Oct-2023
  • (2023) How to Enrich Metaverse? Blockchains, AI, and Digital Twin. From Blockchain to Web3 & Metaverse (27-61). DOI: 10.1007/978-981-99-3648-9_2. Online publication date: 25-May-2023
  • (2022) Fusing Blockchain and AI With Metaverse: A Survey. IEEE Open Journal of the Computer Society 3 (122-136). DOI: 10.1109/OJCS.2022.3188249. Online publication date: 2022
