ANUBIS: a provenance graph-based framework for advanced persistent threat detection

Authors:

Md. Monowar Anjum,

Shahrear Iqbal,

Benoit Hamelin

SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing

Pages 1684 - 1693

https://doi.org/10.1145/3477314.3507097

Published: 06 May 2022

Abstract

We present ANUBIS, a machine learning-based system for detecting advanced persistent threats (APTs). Two goals shape its design. First, ANUBIS is meant to be used effectively by cyber-response teams, so prediction explainability is a central focus. Second, ANUBIS uses system provenance graphs to capture causality, which lets it achieve high detection performance. At the core of its predictive capability is a Bayesian Neural Network that reports how confident it is in each prediction. We evaluate ANUBIS on a recent APT dataset (DARPA OpTC) and show that it detects malicious activity akin to APT campaigns with high accuracy. Moreover, ANUBIS learns high-level patterns that allow it to explain its predictions to threat analysts. This combination of high predictive performance and explainable attack story reconstruction makes ANUBIS an effective tool for enterprise cyber defense.
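
To make the idea of a model that reports its own confidence concrete, here is a minimal sketch in Python. It is not the ANUBIS architecture, which the abstract does not describe. It assumes a single sigmoid unit whose weights follow independent Gaussian distributions, and it estimates uncertainty by repeated sampling. All names (`sample_prediction`, `predict_with_uncertainty`) and all numeric values are hypothetical.

```python
import math
import random


def sample_prediction(features, weight_means, weight_stds, bias_mean, bias_std, rng):
    """One stochastic forward pass: draw each weight from its Gaussian, then apply a sigmoid."""
    z = rng.gauss(bias_mean, bias_std)
    for x, mu, sigma in zip(features, weight_means, weight_stds):
        z += x * rng.gauss(mu, sigma)
    return 1.0 / (1.0 + math.exp(-z))


def predict_with_uncertainty(features, weight_means, weight_stds,
                             bias_mean=0.0, bias_std=0.1,
                             n_samples=200, seed=0):
    """Return (mean probability, standard deviation) over sampled forward passes.

    A small standard deviation means the sampled models agree (high confidence);
    a large one means they disagree (low confidence).
    """
    rng = random.Random(seed)
    probs = [
        sample_prediction(features, weight_means, weight_stds, bias_mean, bias_std, rng)
        for _ in range(n_samples)
    ]
    mean = sum(probs) / n_samples
    variance = sum((p - mean) ** 2 for p in probs) / n_samples
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical feature vector and weight distributions, for illustration only.
    mean, spread = predict_with_uncertainty([1.0, 2.0], [0.5, -0.25], [1.0, 1.0])
    print(f"malicious probability ~ {mean:.2f}, spread ~ {spread:.2f}")
```

An analyst-facing tool could use the spread to decide which alerts to surface first, although how ANUBIS itself uses its confidence estimates is not specified here.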

Index Terms

1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Intrusion detection systems

Published In

SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing

April 2022

2099 pages

ISBN:9781450387132

DOI:10.1145/3477314

Conference Chairs: Jiman Hong (Soongsil University), Miroslav Bures (Czech Technical University, Czechia)

Program Chairs: Juw Won Park (University of Louisville), Tomas Cerny (Baylor University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SAC '22

Sponsor:

SIGAPP

SAC '22: The 37th ACM/SIGAPP Symposium on Applied Computing

April 25 - 29, 2022

Virtual Event

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Bibliometrics

Article Metrics

Total Citations: 22
Total Downloads: 757
Downloads (last 12 months): 211
Downloads (last 6 weeks): 25

Reflects downloads up to 05 Mar 2025

Cited By

Satpathy U, Borse H, Chakraborty S (2025). Towards Generating a Robust, Scalable and Dynamic Provenance Graph for Attack Investigation over Distributed Microservice Architecture. 2025 17th International Conference on COMmunication Systems and NETworks (COMSNETS), 566-574. Online publication date: 6-Jan-2025.
https://doi.org/10.1109/COMSNETS63942.2025.10885639

Addetla S, Pachamuthu R (2025). Amalgamation of Divergent Logs for Detection of Advanced Persistent Threats in Cyber Threat Analysis. Fifth International Conference on Computing and Network Communications, 473-488. Online publication date: 6-Feb-2025.
https://doi.org/10.1007/978-981-97-4540-1_35

Yin Y, He X, Liao Y (2024). APT Attack Detection Method Based on Traceability Graph. Journal of Intelligence and Knowledge Engineering 2:2, 82-85. Online publication date: Jun-2024.
https://doi.org/10.62517/jike.202404215

Jalalvand F, Baruwal Chhetri M, Nepal S, Paris C (2024). Alert Prioritisation in Security Operations Centres: A Systematic Survey on Criteria and Methods. ACM Computing Surveys 57:2, 1-36. Online publication date: 7-Nov-2024.
https://dl.acm.org/doi/10.1145/3695462

Saha A, Blasco J, Cavallaro L, Lindorfer M (2024). ADAPT it! Automating APT Campaign and Group Attribution by Leveraging and Linking Heterogeneous Files. Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, 114-129. Online publication date: 30-Sep-2024.
https://dl.acm.org/doi/10.1145/3678890.3678909

Bhattarai B, Huang H (2024). Prov2vec: Learning Provenance Graph Representation for Anomaly Detection in Computer Systems. Proceedings of the 19th International Conference on Availability, Reliability and Security, 1-14. Online publication date: 30-Jul-2024.
https://dl.acm.org/doi/10.1145/3664476.3664494

Aly A, Iqbal S, Youssef A, Mansour E (2024). MEGR-APT: A Memory-Efficient APT Hunting System Based on Attack Representation Learning. IEEE Transactions on Information Forensics and Security 19, 5257-5271. Online publication date: 2-May-2024.
https://dl.acm.org/doi/10.1109/TIFS.2024.3396390

Li H, Liu P, Lin B, Liao Y, Huang Y (2024). IPMES: A Tool for Incremental TTP Detection Over the System Audit Event Stream. 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 265-273. Online publication date: 24-Jun-2024.
https://doi.org/10.1109/DSN58291.2024.00036

Nikulshin V, Talhi C (2024). Effective IDS under constraints of modern enterprise networks: revisiting the OpTC dataset. 2024 7th Conference on Cloud and Internet of Things (CIoT), 1-8. Online publication date: 29-Oct-2024.
https://doi.org/10.1109/CIoT63799.2024.10757066

Mahboubi A, Luong K, Aboutorab H, Bui H, Jarrad G, Bahutair M, Camtepe S, Pogrebna G, Ahmed E, Barry B, Gately H (2024). Evolving techniques in cyber threat hunting. Journal of Network and Computer Applications 232:C. Online publication date: 1-Dec-2024.
https://dl.acm.org/doi/10.1016/j.jnca.2024.104004
