SoK: A Comparison of Autonomous Penetration Testing Agents

Published: 30 July 2024
DOI: 10.1145/3664476.3664484

Abstract

In the still-growing field of cyber security, machine learning methods have largely been employed for detection tasks; only a small portion of the work revolves around offensive capabilities. With the rise of Deep Reinforcement Learning, agents have emerged that actively assess the security of systems by means of penetration testing, learning to use different tools the way a human tester would. In this paper we present an overview and comparison of the autonomous penetration testing agents found in the literature. Various agents have been proposed using distinct methods, but several factors, such as the modelling of environments and scenarios, the choice of algorithms, and the differences among the methods themselves, make it difficult to draw conclusions about the current state and performance of these agents. The comparison also lets us identify research challenges that remain major limiting factors: handling large action spaces, partial observability, defining the right reward structure, and learning in real-world scenarios.
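
To make the setting concrete, the following is a minimal, hypothetical sketch (written for this overview, not taken from any of the surveyed agents) of how penetration testing can be framed as a reinforcement learning problem: a tabular Q-learning agent acting in a toy two-host network, where the state records which hosts have been discovered and compromised, and the actions are scans and exploits. All names here (ToyPentestEnv, ACTIONS, train) are invented for illustration.

# Hypothetical illustration, not the method of any surveyed paper:
# tabular Q-learning on a toy two-host penetration-testing environment.
import random

class ToyPentestEnv:
    """Toy two-host chain: exploit the entry host, then a scan from the
    foothold reveals the second host, which can then be exploited."""

    def reset(self):
        self.discovered = [True, False]    # host 0 is the exposed entry point
        self.compromised = [False, False]
        return self._state()

    def _state(self):
        return (tuple(self.discovered), tuple(self.compromised))

    def step(self, action):
        kind, host = action
        reward, done = -1.0, False         # small cost for every action taken
        if kind == "scan" and self.compromised[0]:
            self.discovered[1] = True      # lateral discovery after a foothold
        elif kind == "exploit" and self.discovered[host] and not self.compromised[host]:
            self.compromised[host] = True
            reward = 10.0                  # bonus for each newly owned host
            done = all(self.compromised)   # goal: compromise every host
        return self._state(), reward, done

ACTIONS = [("scan", 0), ("exploit", 0), ("exploit", 1)]

def train(episodes=500, alpha=0.1, gamma=0.95, eps=0.2, max_steps=50):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    q, env = {}, ToyPentestEnv()
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if random.random() < eps:      # explore
                a = random.randrange(len(ACTIONS))
            else:                          # exploit current estimates
                a = max(range(len(ACTIONS)), key=lambda i: q.get((s, i), 0.0))
            s2, r, done = env.step(ACTIONS[a])
            best_next = 0.0 if done else max(q.get((s2, i), 0.0) for i in range(len(ACTIONS)))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best_next - q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return q

Even in this toy setting, the challenges highlighted above are already visible: the action space grows with the number of hosts and tools, the agent only observes what its own scans have revealed, and the reward shaping (here a flat bonus per compromised host) strongly determines the learned behaviour.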


Published In

ARES '24: Proceedings of the 19th International Conference on Availability, Reliability and Security
July 2024
2032 pages
ISBN: 9798400717185
DOI: 10.1145/3664476
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Deep Reinforcement Learning
  2. Penetration Testing
  3. Reinforcement Learning
  4. Security Automation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2024

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%
