Abstract
As the underground market of malware flourishes, there is an exponential increase in the number and diversity of malware. A crucial question in malware analysis research is how to define malware specifications or signatures that faithfully describe similar malicious intent and also clearly stand out from other programs. Although the traditional malware specifications based on syntactic signatures are efficient, they can be easily defeated by various obfuscation techniques. Since the malicious behavior is often stable across similar malware instances, behavior-based specifications which capture real malicious characteristics during run time, have become more prevalent in anti-malware tasks, such as malware detection and malware clustering. This kind of specification is typically extracted from the system call dependence graph that a malware sample invokes. In this paper, we present replacement attacks to camouflage similar behaviors by poisoning behavior-based specifications. The key method of our attacks is to replace a system call dependence graph to its semantically equivalent variants so that the similar malware samples within one family turn out to be different. As a result, malware analysts have to put more efforts into reexamining the similar samples which may have been investigated before. We distil general attacking strategies by mining more than 5200 malware samples’ behavior specifications and implement a compiler-level prototype to automate replacement attacks. Experiments on 960 real malware samples demonstrate the effectiveness of our approach to impede various behavior-based malware analysis tasks, such as similarity comparison and malware clustering. In the end, we also discuss possible countermeasures in order to strengthen existing malware defense.
Similar content being viewed by others
Notes
References
Cybercriminals sell access to tens of thousands of malware-infected Russian hosts. http://www.webroot.com/blog/2013/09/23/. Last reviewed 10/03/2014
Getting started with the LLVM system using Microsoft Visual Studio. http://llvm.org/docs/GettingStartedVS.html. Last reviewed 10/03/2014
Malicious software and its underground economy. https://www.coursera.org/course/malsoftware. Last reviewed 10/03/2014
Windows registry persistence, part 2: the run keys and search-order. http://blog.cylance.com. Last reviewed 10/03/2014
Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51(1) (2008)
Babić, D., Reynaud, D., Song, D.: Malware analysis with tree automata inference. In: Proceedings of the 23rd International Conference on Computer Aided Verification (CAV’11) (2011)
Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C., Kirda, E.: Scalable, behavior based malware clustering. In: Proceedings of the Network and Distributed System Security Symposium (NDSS’09) (2009)
Bayer, U., Kirda, E., Kruegel, C.: Improving the efficiency of dynamic malware analysis. In: Proceedings of the 2010 ACM Symposium on Applied Computing (SAC’10) (2010)
Biggio, B., Pillai, I., Rota Bulò, S., Ariu, D., Pelillo, M., Roli, F.: Is data clustering in adversarial settings secure? In: Proceedings of the 6th ACM Workshop on Artificial Intelligence and Security (AISec’13) (2013)
Biggio, B., Rieck, K., Ariu, D., Wressnegger, C., Corona, I., Giacinto, G., Rol, F.: Poisoning behavioral malware clustering. In: Proceedings of the 7th ACM Workshop on Artificial Intelligence and Security (AISec’14) (2014)
Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the web. In: Proceedings of the Sixth International Conference on World Wide Web (1997)
Bruschi, D., Martignoni, L., Monga, M.: Detecting self-mutating malware using control-flow graph matching. In: Proceedings of Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA’06) (2006)
Bunke, H., Shearer, K.: A graph distance metric based on the maximal common subgraph. Pattern Recognit. Lett. 19(3–4), 255–259 (1998)
Chen, X., Andersen, J., Mao, Z., Bailey, M., Nazario, J.: Towards an understanding of anti-virtualization and anti-debugging behavior in modern malware. In: Proceedings of the International Conference on Dependable Systems and Networks (DSN’08) (2008)
Christodorescu, M., Jha, S., Kruegel, C.: Mining specifications of malicious behavior. In: ESEC-FSE’ 07: Proceedings of the the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (2007)
Coogan, K., Lu, G., Debray, S.: Deobfuscation of virtualization-obfuscated software. In: Proceedings of the 18th ACM Conference on Computer and Communications Security (CCS’11) (2011)
Dinaburg, A., Royal, P., Sharif, M., Lee, W.: Ether: malware analysis via hardware virtualization extensions. In: Proceedings of the ACM Conference on Computer and Communications Security (CCS’08) (2008)
Ferrie, P.: Attacks on virtual machines. In: Proceedings of the Association of Anti-Virus Asia Researchers Conference (2007)
Forrest, S., Hofmeyr, S., Somayaji, A.: The evolution of system-call monitoring. In: Proceedings of the 2008 Annual Computer Security Applications Conference (ACSAC’08) (2008)
Fredrikson, M., Jha, S., Christodorescu, M., Sailer, R., Yan, X.: Synthesizing near-optimal malware specifications from suspicious behaviors. In: Proceedings of the 2010 IEEE Symposium on Security and Privacy (2010)
Jacob, G., Hund, R., Kruegel, C., Holz, T.: Jackstraws: Picking command and control connections from bot traffic. In: Proceedings of the 20th USENIX Conference on Security (2011)
Kang, M.G., Yin, H., Hanna, S., McCamant, S., Song, D.: Emulating emulation-resistant malware. In: Proceedings of the Workshop on Virtual Machine Security (VMSec’09) (2009)
Kim, H., Khoo, W.M., Lio, P.: Polymorphic attacks against sequence-based software birthmarks. In: Proceedings of the 2nd Software Security and Protection Workshop (SSP’12) (2012)
Kinable, J., Kostakis, O.: Malware classification based on call graph clustering. J. Comput. Virol. 7(4), 233–245 (2011)
Kolbitsch, C., Comparetti, P.M., Kruegel, C., Kirda, E., Zho, X., Wang, X.: Effective and efficient malware detection at the end host. In: Proceedings of the 18th USENIX Security Symposium (2009)
Kruegel, C., Kirda, E., Mutz, D., Robertson, W., Vigna, G.: Polymorphic worm detection using structural information of executables. In: Proceedings of Symposium on Recent Advances in Intrusion Detection (RAID’05) (2005)
Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis and transformation. In: Proceedings of the International Symposium on Code Generation and Optimization (CGO’04) (2004)
Lindorfer, M., Di Federico, A., Maggi, F., Comparetti, P.M., Zanero, S.: Lines of malicious code: insights into the malicious software industry. In: Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC’12) (2012)
Ma, W., Duan, P., Liu, S., Gu, G., Liu, J.-C.: Shadow attacks: automatically evading system-call-behavior based malware detection. Comput. Virol. 8(1–2), 1–13 (2012)
Martignoni, L., Stinson, E., Fredrikson, M., Jha, S., Mitchell, J.C.: A layered architecture for detecting malicious behaviors. In: Proceedings of the 10th International Symposium on Recent Advances in Intrusion Detection (RAID’08) (2008)
Ming, J., Xin, Z., Lan, P., Wu, D., Liu, P., Mao, B.: Replacement attacks: automatically impeding behavior-based malware specifications. In: Proceedings of the 13th International Conference on Applied Cryptography and Network Security (ACNS 2015) (2015)
Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: Proceedings of the 23th Annual Computer Security Applications Conference (ACSAC’07) (2007)
Paleari, R., Martignoni, L., Passerini, E., Davidson, D., Fredrikson, M., Giffin, J., Jha. S.: Automatic generation of remediation procedures for malware infections. In: Proceedings of the 19th USENIX Security Symposium (2010)
Paleari, R., Martignoni, L., Roglia, G.F., Bruschi, D.: A fistful of red-pills: How to automatically generate procedures to detect cpu emulators. In: Proceedings of the USENIX Workshop on Offensive Technologies (WOOT’09) (2009)
Park, Y., Reeves, D.: Deriving common malware behavior through graph clustering. In: Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security (ASIACCS’11) (2011)
Park, Y., Reeves, D., Mulukutla, V., Sundaravel, B.: Fast malware classification by automated behavioral graph matching. In: Proceedings of the 6th Annual Workshop on Cyber Security and Information Intelligence Research (2010)
Raffetseder, T., Kruegel, C., Kirda, E.: Detecting system emulators. In: Proceedings of the 10th International Conference on Information Security (ISC’07) (2007)
Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)
Roundy, K.A., Miller, B.P.: Binary-code obfuscations in prevalent packer tools. ACM Comput. Surv. 46(1), 4 (2013)
Russinovich, M.: Inside the native api. http://netcode.cz/img/83/nativeapi.html. Last reviewed 10/03/2014
Sikorski, M., Honig, A.: Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software. No Starch Press (2012)
Srivastava, A., Lanzi, A., Giffin, J., Balzarotti, D.: Operating system interface obfuscation and the revealing of hidden operations. In: Proceedings of the Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA’11) (2011)
Wagner, D., Soto, P.: Mimicry attacks on host-based intrusion detection systems. In: Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS’02) (2002)
Wang, Z., Ming, J., Jia, C., Gao, D.: Linear obfuscation to combat symbolic execution. In: Proceedings of the 2011 European Symposium on Research in Computer Security (ESORICS’11) (2011)
Xin, Z., Chen, H., Wang, X.C., Liu, P., Zhu, S., Mao, B.: Replacement attacks on behavior based software birthmark. In: Proceedings of the 14th Information Security Conference (ISC’11) (2011)
Acknowledgments
We are very grateful to Paolo Milani Comparetti and Christopher Kruegel for providing access to the BCHKK-data dataset. This research was supported in part by the Grants NSF CNS-1223710, NSF CCF-1320605, ONR N00014-13-1-0175, and ARO W911NF-13-1-0421 (MURI). Peng Liu was supported by ARO W911NF-13-1-0421 (MURI), NSF CCF-1320605, CNS-1422594, and NIETP CAE Cybersecurity Grant.
Author information
Authors and Affiliations
Corresponding author
Additional information
A preliminary version of this paper appeared in the Proceedings of the 13th International Conference on Applied Cryptography and Network Security (ACNS’15) , New York, June 2–5, 2015 [31].
Rights and permissions
About this article
Cite this article
Ming, J., Xin, Z., Lan, P. et al. Impeding behavior-based malware analysis via replacement attacks to malware specifications. J Comput Virol Hack Tech 13, 193–207 (2017). https://doi.org/10.1007/s11416-016-0281-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11416-016-0281-3