research-article

High Fidelity Data Reduction for Big Data Security Dependency Analyses

Authors:

Guofei Jiang

CCS '16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security

Pages 504 - 516

https://doi.org/10.1145/2976749.2978378

Published: 24 October 2016

Abstract

Intrusive multi-step attacks, such as Advanced Persistent Threat (APT) attacks, have plagued enterprises with significant financial losses and are the top reason for enterprises to increase their security budgets. Since these attacks are sophisticated and stealthy, they can remain undetected for years if individual steps are buried in background "noise." Thus, enterprises are seeking solutions to "connect the suspicious dots" across multiple activities. This requires ubiquitous system auditing for long periods of time, which in turn produces an overwhelmingly large number of system audit events. Given a limited system budget, efficiently handling ever-growing system audit logs is a great challenge. This paper proposes a new approach that exploits the dependency among system events to reduce the number of log entries while still supporting high-quality forensic analysis. In particular, we first propose an aggregation algorithm that preserves the dependency of events during data reduction to ensure the high quality of forensic analysis. Then we propose an aggressive reduction algorithm and exploit domain knowledge for further data reduction. To validate the efficacy of our proposed approach, we conduct a comprehensive evaluation on real-world auditing systems using log traces of more than one month. Our evaluation results demonstrate that our approach can significantly reduce the size of system logs and improve the efficiency of forensic analysis without losing accuracy.
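To make the idea of dependency-preserving aggregation concrete, the following is a minimal sketch and not the algorithm from the paper, whose details the abstract does not give. It assumes a simplified rule: repeated events between the same source and destination can be merged into one event with a wider time window, as long as nothing flowed into the source between them. The `Event` fields and the `aggregate` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Event:
    src: str      # entity the information flows from (e.g. a process)
    dst: str      # entity the information flows to (e.g. a file)
    start: float  # start of the event's time window
    end: float    # end of the event's time window


def aggregate(events: List[Event]) -> List[Event]:
    """Merge repeated src->dst events while src received no new input.

    Assumed rule (illustrative only): two events on the same (src, dst)
    pair may be merged if no flow into src ended after the earlier event.
    """
    reduced: List[Event] = []
    last_edge: Dict[Tuple[str, str], Event] = {}  # latest kept event per pair
    last_input: Dict[str, float] = {}             # latest inbound flow per entity

    for ev in sorted(events, key=lambda e: e.start):
        prev = last_edge.get((ev.src, ev.dst))
        no_new_input = (
            prev is not None
            and last_input.get(ev.src, float("-inf")) <= prev.end
        )
        if no_new_input:
            # Same dependency as before: widen the kept event's window.
            prev.end = max(prev.end, ev.end)
        else:
            # New dependency (or first occurrence): keep a copy of the event.
            kept = Event(ev.src, ev.dst, ev.start, ev.end)
            reduced.append(kept)
            last_edge[(ev.src, ev.dst)] = kept
        # Record that dst has now received a flow ending at ev.end.
        last_input[ev.dst] = max(last_input.get(ev.dst, float("-inf")), ev.end)
    return reduced
```

Under this assumed rule, two back-to-back writes from a process to a file collapse into one entry, but a write that follows a network read by the same process is kept separately, because the process may now carry new information. A fuller treatment would presumably also need to check what happens at the destination between the merged events.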

Published In

CCS '16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security

October 2016, 1924 pages

ISBN: 9781450341394

DOI: 10.1145/2976749

General Chairs: Edgar Weippl (SBA Research, Austria), Stefan Katzenbeisser (TU Darmstadt, CYSEC, Germany)

Program Chairs: Christopher Kruegel (University of California, Santa Barbara, USA), Andrew Myers (Cornell University, USA), Shai Halevi (IBM Research, USA)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CCS'16

Sponsor:

SIGSAC

CCS'16: 2016 ACM SIGSAC Conference on Computer and Communications Security

October 24 - 28, 2016

Vienna, Austria

Acceptance Rates

CCS '16 Paper Acceptance Rate 137 of 831 submissions, 16%;

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
