
DOI: 10.1109/ICRA.2019.8793740

Generating Adversarial Driving Scenarios in High-Fidelity Simulators

Published: 20 May 2019

Abstract

In recent years, self-driving vehicles have become more commonplace on public roads, with the promise of bringing safety and efficiency to modern transportation systems. Increasing the reliability of these vehicles on the road requires an extensive suite of software tests, ideally performed in high-fidelity simulators, where multiple vehicles and pedestrians interact with the self-driving vehicle. It is therefore of critical importance to ensure that self-driving software is assessed against a wide range of challenging simulated driving scenarios. The state of the art in driving scenario generation, as adopted by some of the front-runners of the self-driving car industry, still relies on human input [1]. In this paper we propose to automate the process using Bayesian optimization to generate adversarial self-driving scenarios that expose poorly engineered or poorly trained self-driving policies and increase the risk of collision with simulated pedestrians and vehicles. We show that by incorporating the generated scenarios into the training set of the self-driving policy, and by fine-tuning the policy using vision-based imitation learning, we obtain safer self-driving behavior.
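To make the approach concrete, below is a minimal, illustrative sketch of the scenario-search loop the abstract describes: Bayesian optimization with a Gaussian-process surrogate and an expected-improvement acquisition function over a low-dimensional scenario parameterization. The parameter names, bounds, and the `run_scenario` rollout hook are hypothetical placeholders for a CARLA-style simulator rollout of the driving policy; this is a sketch of the general technique, not the authors' implementation.

```python
# Sketch: Bayesian-optimization search for high-risk (adversarial) driving scenarios.
# The scenario parameterization and run_scenario() are hypothetical stand-ins for a
# simulator rollout of the policy under test; they are not taken from the paper.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Illustrative scenario parameters: pedestrian spawn position and speed.
BOUNDS = np.array([[5.0, 40.0],   # longitudinal spawn distance ahead of the car [m]
                   [-3.0, 3.0],   # lateral offset from the lane centre [m]
                   [0.5, 2.5]])   # pedestrian walking speed [m/s]


def run_scenario(params):
    """Roll out the current driving policy in one scenario and return a scalar
    risk score (e.g. negative minimum distance to the pedestrian, or 1 on
    collision). A synthetic placeholder stands in for the simulator here."""
    d_long, d_lat, speed = params
    return float(np.exp(-((d_long - 12.0) ** 2) / 50.0 - (d_lat ** 2) / 2.0) * speed)


def expected_improvement(candidates, gp, best_y, xi=0.01):
    """Expected-improvement acquisition for maximizing the risk score."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y - xi) / sigma
    return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)


def bayes_opt_scenarios(n_init=5, n_iter=25, seed=0):
    rng = np.random.default_rng(seed)

    def sample(n):
        return rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(n, BOUNDS.shape[0]))

    X = sample(n_init)                        # initial random scenarios
    y = np.array([run_scenario(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

    for _ in range(n_iter):
        gp.fit(X, y)
        cand = sample(2048)                   # cheap random candidate pool
        ei = expected_improvement(cand, gp, y.max())
        x_next = cand[np.argmax(ei)]          # most promising adversarial scenario
        y_next = run_scenario(x_next)
        X, y = np.vstack([X, x_next]), np.append(y, y_next)

    order = np.argsort(-y)
    return X[order], y[order]                 # scenarios ranked by risk


if __name__ == "__main__":
    scenarios, risks = bayes_opt_scenarios()
    print(scenarios[:3], risks[:3])
```

In the pipeline the abstract outlines, the highest-risk scenarios found by such a search would then be added to the training set and the driving policy fine-tuned with vision-based imitation learning.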

References

[1]
A. C. Madrigal, “Inside waymo’s secret world for training self-driving cars,” The Atlantic, Aug 2017. [Online]. Available: https://www.theatlantic.com/technology/archive/2017/08/inside-waymos-secret-testing-and-simulation-facilities/537648/
[2]
K. Nidhi and S. M. Paddock, “Driving to Safety: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability?” http://www.rand.org/pubs/research_reports/RR1478.html, 2016. [Online; accessed 19-July-2018].
[3]
A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning, 2017, pp. 1–16.
[4]
G. Kahn, T. Zhang, S. Levine, and P. Abbeel, “PLATO: policy learning using adaptive trajectory optimization,” CoRR, vol. abs/1603.00622, 2016.
[5]
S. Ross, G. J. Gordon, and D. Bagnell, “A reduction of imitation learning and structured prediction to no-regret online learning,” in International Conference on Artificial Intelligence and Statistics, AISTATS, 2011, pp. 627–635.
[6]
B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas, “Taking the human out of the loop: A review of bayesian optimization,” Proceedings of the IEEE, vol. 104, no. 1, pp. 148–175, 2016.
[7]
J. Snoek, H. Larochelle, and R. P. Adams, “Practical bayesian optimization of machine learning algorithms,” in Advances in neural information processing systems, 2012, pp. 2951–2959.
[8]
J. Mockus, “Bayesian approach to global optimization and application to multiobjective and constrained problems,” Journal of Optimization Theory and Applications, vol. 70, no. 1, pp. 157–172, Jul 1991.
[9]
S. Manjanna and G. Dudek, “Data-driven selective sampling for marine vehicles using multi-scale paths,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, September 2017, pp. 6111–6117.
[10]
S. R. Kuindersma, R. A. Grupen, and A. G. Barto, “Variable risk control via stochastic optimization,” The International Journal of Robotics Research, vol. 32, no. 7, pp. 806–825, 2013.
[11]
E. Brochu, V. M. Cora, and N. de Freitas, “A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning,” CoRR, no. arXiv:1012.2599, December 2010.
[12]
N. Srinivas, A. Krause, S. M. Kakade, and M. W. Seeger, “Information-theoretic regret bounds for gaussian process optimization in the bandit setting,” IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3250–3265, May 2012.
[13]
E. Kaufmann, N. Korda, and R. Munos, “Thompson sampling: An asymptotically optimal finite-time analysis,” in 23rd International Conference on Algorithmic Learning Theory, ser. ALT’12. Springer-Verlag, 2012, pp. 199–213.
[14]
I. Osband, C. Blundell, A. Pritzel, and B. Van Roy, “Deep exploration via bootstrapped dqn,” in Advances in Neural Information Processing Systems 29, 2016, pp. 4026–4034.
[15]
K. Azizzadenesheli, E. Brunskill, and A. Anandkumar, “Efficient exploration through bayesian deep q-networks,” CoRR, vol. abs/1802.04412, 2018.
[16]
Z. C. Lipton, J. Gao, L. Li, X. Li, F. Ahmed, and L. Deng, “Efficient exploration for dialog policy learning with deep BBQ networks,” CoRR, vol. abs/1608.05081, 2016.
[17]
M. Fortunato, M. G. Azar, B. Piot, J. Menick, I. Osband, A. Graves, V. Mnih, R. Munos, D. Hassabis, O. Pietquin, C. Blundell, and S. Legg, “Noisy networks for exploration,” CoRR, vol. abs/1706.10295, 2017.
[18]
A. Kumar, S. Zilberstein, and M. Toussaint, “Probabilistic inference techniques for scalable multiagent decision making,” Journal of Artificial Intelligence Research, vol. 53, no. 1, pp. 223–270, May 2015.
[19]
K. Rawlik, M. Toussaint, and S. Vijayakumar, “On stochastic optimal control and reinforcement learning by approximate inference,” in Proc. of Robotics: Science and Systems (R:SS 2012), 2012.
[20]
T. Haarnoja, H. Tang, P. Abbeel, and S. Levine, “Reinforcement learning with deep energy-based policies,” CoRR, vol. abs/1702.08165, 2017.
[21]
M. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in International Conference on Machine Learning, 1994, pp. 157–163.
[22]
F. A. Oliehoek, Decentralized POMDPs. Springer Berlin Heidelberg, 2012, pp. 471–503.
[23]
J. Foerster, R. Y. Chen, M. Al-Shedivat, S. Whiteson, P. Abbeel, and I. Mordatch, “Learning with opponent-learning awareness,” in Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems. International Foundation for Autonomous Agents and Multiagent Systems, 2018, pp. 122–130.
[24]
R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel, and I. Mordatch, “Multi-agent actor-critic for mixed cooperative-competitive environments,” CoRR, vol. abs/1706.02275, 2017.
[25]
N. Papernot, P. D. McDaniel, I. J. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against deep learning systems using adversarial examples,” CoRR, vol. abs/1602.02697, 2016.
[26]
S. H. Huang, N. Papernot, I. J. Goodfellow, Y. Duan, and P. Abbeel, “Adversarial attacks on neural network policies,” CoRR, vol. abs/1702.02284, 2017.
[27]
Y. Tian, K. Pei, S. Jana, and B. Ray, “Deeptest: Automated testing of deep-neural-network-driven autonomous cars,” in Proceedings of the 40th International Conference on Software Engineering. ACM, 2018, pp. 303–314.
[28]
F. M. P. Behbahani, K. Shiarlis, X. Chen, V. Kurin, S. Kasewa, C. Stirbu, J. Gomes, S. Paul, F. A. Oliehoek, J. V. Messias, and S. Whiteson, “Learning from demonstration in the wild,” CoRR, vol. abs/1811.03516, 2018.
[29]
J. Ho and S. Ermon, “Generative adversarial imitation learning,” CoRR, vol. abs/1606.03476, 2016.
[30]
M. Á. Carreira-Perpiñán and G. E. Hinton, “On contrastive divergence learning,” in AISTATS, 2005.
[31]
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems 27, 2014, pp. 2672–2680.
[32]
T.-M. Li, M. Aittala, F. Durand, and J. Lehtinen, “Differentiable monte carlo ray tracing through edge sampling,” ACM Trans. Graph. (Proc. SIGGRAPH Asia), vol. 37, no. 6, pp. 222:1–222:11, 2018.
[33]
M. M. Loper and M. J. Black, “Opendr: An approximate differentiable renderer,” in European Conference on Computer Vision (ECCV), 2014, pp. 154–169.
[34]
J. Wu, E. Lu, P. Kohli, W. T. Freeman, and J. B. Tenenbaum, “Learning to see physics via visual de-animation,” in Advances in Neural Information Processing Systems, 2017.
[35]
C. E. Rasmussen, “Gaussian processes in machine learning,” in Advanced lectures on machine learning. Springer, 2004, pp. 63–71.
[36]
D. M. Blei, A. Kucukelbir, and J. D. McAuliffe, “Variational inference: A review for statisticians,” Journal of the American Statistical Association, vol. 112, no. 518, pp. 859–877, 2017.
[37]
Z. Wang, M. Zoghi, F. Hutter, D. Matheson, N. De Freitas et al., “Bayesian optimization in high dimensions via random embeddings.” in IJCAI, 2013, pp. 1778–1784.
[38]
C. Li, S. Gupta, S. Rana, V. Nguyen, S. Venkatesh, and A. Shilton, “High dimensional bayesian optimization using dropout,” in International Joint Conference on Artificial Intelligence, 2017, pp. 2096–2102.
[39]
F. Codevilla, M. Müller, A. López, V. Koltun, and A. Dosovitskiy, “End-to-end driving via conditional imitation learning,” in International Conference on Robotics and Automation (ICRA), 2018.
[40]
M. Laskey, J. Lee, R. Fox, A. Dragan, and K. Goldberg, “Dart: Noise injection for robust imitation learning,” in Conference on Robot Learning, vol. 78. PMLR, 13-15 Nov 2017, pp. 143–156.
[41]
M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba, “End to end learning for self-driving cars,” CoRR, vol. abs/1604.07316, 2016.

Cited By

  • (2024) Artificial Intelligence for Safety-Critical Systems in Industrial and Transportation Domains: A Survey. ACM Computing Surveys, 56(7), 1–40. DOI: 10.1145/3626314. Online publication date: 9-Apr-2024.
  • (2024) Semantic-guided fuzzing for virtual testing of autonomous driving systems. Journal of Systems and Software, 212(C). DOI: 10.1016/j.jss.2024.112017. Online publication date: 1-Jun-2024.
  • (2023) Discovering adversarial driving maneuvers against autonomous vehicles. Proceedings of the 32nd USENIX Conference on Security Symposium, 2957–2974. DOI: 10.5555/3620237.3620403. Online publication date: 9-Aug-2023.
  • (2022) Evaluating Human–Robot Interaction Algorithms in Shared Autonomy via Quality Diversity Scenario Generation. ACM Transactions on Human-Robot Interaction, 11(3), 1–30. DOI: 10.1145/3476412. Online publication date: 2-Sep-2022.
  • (2021) An IVIS Typical Scene Generation Algorithm Based on Traffic Big Data. Proceedings of the 5th International Conference on Computer Science and Application Engineering, 1–7. DOI: 10.1145/3487075.3487190. Online publication date: 19-Oct-2021.


Published In

2019 International Conference on Robotics and Automation (ICRA)
May 2019
7095 pages

Publisher

IEEE Press

Publication History

Published: 20 May 2019

Qualifiers

  • Research-article

