DOI: 10.1145/3688862.3689108
Research article · Open access

Enhancing End-to-End Autonomous Driving Systems Through Synchronized Human Behavior Data

Published: 28 October 2024

Abstract

This paper presents a pioneering exploration of integrating fine-grained human supervision into the autonomous driving domain to enhance system performance. Current advances in end-to-end autonomous driving are typically data-driven and rely on given expert demonstrations. However, this reliance limits the systems' generalizability and their ability to earn human trust. To address this gap, our research introduces a novel approach that synchronously collects data from human and machine drivers under identical driving scenarios, focusing on eye-tracking and brainwave data to guide machine perception and decision-making. This paper uses the CARLA simulator to evaluate the impact of human behavior guidance. Experimental results show that using human attention to guide machine attention yields a significant improvement in driving performance, whereas guidance by human intention remains a challenge. This paper points to a promising direction for utilizing human behavior guidance to enhance autonomous systems.
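The attention-guidance idea described in the abstract can be sketched as an auxiliary training signal that pulls the model's attention map toward the recorded human gaze map. The KL-divergence formulation below is an illustrative assumption, not the paper's stated loss; the function name `attention_guidance_loss` and the toy maps are hypothetical.

```python
import numpy as np

def attention_guidance_loss(model_attn, human_attn, eps=1e-8):
    """KL divergence between the normalized human gaze map (target)
    and the model's attention map (prediction).

    Illustrative sketch only: the paper's exact guidance loss is not
    specified on this page.
    """
    p = human_attn / (human_attn.sum() + eps)  # human gaze distribution
    q = model_attn / (model_attn.sum() + eps)  # model attention distribution
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy 2x2 maps: an attention map matching the gaze map incurs (near-)zero
# loss; a map that attends to the wrong region incurs a larger loss.
gaze = np.array([[0.1, 0.7], [0.1, 0.1]])
attn_good = gaze.copy()
attn_bad = np.array([[0.7, 0.1], [0.1, 0.1]])
assert attention_guidance_loss(attn_good, gaze) < attention_guidance_loss(attn_bad, gaze)
```

In a full pipeline such a term would be added, with a weighting factor, to the primary imitation-learning objective so the perception backbone learns to focus where human drivers look.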



Published In

BCIMM '24: Proceedings of the 1st International Workshop on Brain-Computer Interfaces (BCI) for Multimedia Understanding
October 2024
67 pages
ISBN:9798400711893
DOI:10.1145/3688862
  • Program Chairs:
  • Zehong (Jimmy) Cao,
  • Tzyy-Ping Jung,
  • Peng Xu
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. brain-computer interfaces
  2. human-guided autonomous driving

Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

