SoK: A Comparison of Autonomous Penetration Testing Agents

Published: 30 July 2024
DOI: 10.1145/3664476.3664484

Abstract

In the still-growing field of cyber security, machine learning methods have largely been employed for detection tasks; only a small portion of the work revolves around offensive capabilities. With the rise of Deep Reinforcement Learning, agents have emerged that actively assess the security of systems by means of penetration testing, learning to use different tools the way a human tester would. In this paper we present an overview and comparison of the autonomous penetration testing agents found in the literature. Various agents have been proposed using distinct methods, but several factors, such as the modelling of environments and scenarios, the choice of algorithms, and the differences among the methods themselves, make it difficult to draw conclusions about the current state and performance of these agents. The comparison also lets us identify research challenges that remain major limiting factors: handling large action spaces, partial observability, defining the right reward structure, and learning in real-world scenarios.
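
To make the setting concrete, the following is a minimal, hypothetical sketch (written for this overview, not taken from any of the surveyed agents) of how penetration testing can be framed as a reinforcement learning problem: a tabular Q-learning agent acting in a toy two-host network, where the state records which hosts have been discovered and compromised, and the actions are scans and exploits. All names here (ToyPentestEnv, ACTIONS, train) are invented for illustration.

# Hypothetical illustration, not the method of any surveyed paper:
# tabular Q-learning on a toy two-host penetration-testing environment.
import random

class ToyPentestEnv:
    """Toy two-host chain: exploit the entry host, then a scan from the
    foothold reveals the second host, which can then be exploited."""

    def reset(self):
        self.discovered = [True, False]    # host 0 is the exposed entry point
        self.compromised = [False, False]
        return self._state()

    def _state(self):
        return (tuple(self.discovered), tuple(self.compromised))

    def step(self, action):
        kind, host = action
        reward, done = -1.0, False         # small cost for every action taken
        if kind == "scan" and self.compromised[0]:
            self.discovered[1] = True      # lateral discovery after a foothold
        elif kind == "exploit" and self.discovered[host] and not self.compromised[host]:
            self.compromised[host] = True
            reward = 10.0                  # bonus for each newly owned host
            done = all(self.compromised)   # goal: compromise every host
        return self._state(), reward, done

ACTIONS = [("scan", 0), ("exploit", 0), ("exploit", 1)]

def train(episodes=500, alpha=0.1, gamma=0.95, eps=0.2, max_steps=50):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    q, env = {}, ToyPentestEnv()
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if random.random() < eps:      # explore
                a = random.randrange(len(ACTIONS))
            else:                          # exploit current estimates
                a = max(range(len(ACTIONS)), key=lambda i: q.get((s, i), 0.0))
            s2, r, done = env.step(ACTIONS[a])
            best_next = 0.0 if done else max(q.get((s2, i), 0.0) for i in range(len(ACTIONS)))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best_next - q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return q

Even in this toy setting, the challenges highlighted above are already visible: the action space grows with the number of hosts and tools, the agent only observes what its own scans have revealed, and the reward shaping (here a flat bonus per compromised host) strongly determines the learned behaviour.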


Published In

ARES '24: Proceedings of the 19th International Conference on Availability, Reliability and Security
July 2024
2032 pages
ISBN: 9798400717185
DOI: 10.1145/3664476
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Deep Reinforcement Learning
  2. Penetration Testing
  3. Reinforcement Learning
  4. Security Automation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2024

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%
