FSmell: Recognizing Inline Function in Binary Code

Wei Lin, Qingli Guo, Jiawei Yin, Xiangyu Zuo, Rongqing Wang & Xiaorui Gong

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14345)

Included in the following conference series:

European Symposium on Research in Computer Security

Abstract

Function recognition is one of the most critical tasks in binary analysis and reverse engineering, yet recognizing inline functions remains challenging, mainly for two reasons. First, binaries contain no expert patterns for inline functions, such as prologue/epilogue instructions. Second, instruction reordering introduced by compiler optimization makes the address space of instructions from the same inline function discontinuous, so the address space of an inline function is often mingled with that of regular functions. This paper proposes FSmell, a graph-theory-based function recognition framework that specifically targets inline functions. FSmell introduces the Instruction Topology Graph (ITG) to represent the data-flow dependencies among instructions in a basic block. With the ITG, the problem of distinguishing inline instructions from caller instructions is transformed into a graph connectivity problem, which is solved by computing the minimum vertex separator. We applied FSmell to 78 binaries compiled by GCC and Clang at 3 different optimization levels. Of the 205,890 inline functions in the 78 binaries, FSmell reports 76,777, with a precision of 67.5% and a recall of 39.2%. With the help of FSmell, 50% of the vulnerabilities missed by other methods are detected and located.
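
To make the graph formulation in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an ITG-like graph can be approximated by register def-use edges inside one basic block, and that a minimum vertex separator between a chosen caller instruction and a chosen inline-body instruction can be found with a standard node-splitting max-flow. The names `build_itg`, `min_vertex_separator`, and `example_block`, as well as the toy instruction encoding, are hypothetical.

```python
from collections import deque

def build_itg(block):
    """Approximate an ITG: link each instruction to the last writer of every register it reads.

    block: list of (dest_register, [source_registers]) in program order (assumed encoding).
    Returns an undirected adjacency dict keyed by instruction index.
    """
    adj = {i: set() for i in range(len(block))}
    last_def = {}  # register -> index of the instruction that last wrote it
    for i, (dest, srcs) in enumerate(block):
        for reg in srcs:
            if reg in last_def:
                adj[i].add(last_def[reg])
                adj[last_def[reg]].add(i)
        if dest is not None:
            last_def[dest] = i
    return adj

def min_vertex_separator(adj, s, t):
    """Smallest vertex set (excluding s and t) whose removal disconnects s from t."""
    if t in adj[s]:
        raise ValueError("s and t are adjacent; no vertex separator exists")
    big = len(adj) + 1  # larger than any possible flow value
    cap = {}

    def add_edge(u, v, c):
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {}).setdefault(u, 0)  # residual (reverse) edge

    # Split every vertex v into v_in -> v_out; unit capacity makes v "cuttable".
    for v in adj:
        add_edge((v, "in"), (v, "out"), big if v in (s, t) else 1)
        for w in adj[v]:
            add_edge((v, "out"), (w, "in"), big)
    source, sink = (s, "in"), (t, "out")

    def bfs():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest residual paths until none remain.
    while True:
        parent = bfs()
        if sink not in parent:
            break
        flow, v = big, sink
        while parent[v] is not None:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u
    # A vertex is in the cut if its "in" half is reachable but its "out" half is not.
    return {v for v in adj if (v, "in") in parent and (v, "out") not in parent}

# Toy block: instruction 0 belongs to the caller, 2-4 to a hypothetical inlined body,
# and instruction 1 passes the argument across the boundary.
example_block = [
    ("r1", []),            # 0: caller sets up a value
    ("r2", ["r1"]),        # 1: argument handed to the inlined body
    ("r3", ["r2"]),        # 2
    ("r4", ["r2"]),        # 3
    ("r5", ["r3", "r4"]),  # 4: inlined result
]
separator = min_vertex_separator(build_itg(example_block), 0, 4)
print(separator)  # → {1}
```

In this toy block the only data-flow path from the caller instruction to the inlined result passes through instruction 1, so removing that single vertex disconnects the two sides. How FSmell actually chooses the endpoints and builds its edges is not specified in the abstract.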

Acknowledgement

This research was supported in part by Key Laboratory of Network Assessment Technology (Chinese Academy of Science) and Beijing Key Laboratory of Network Security and Protection Technology.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Wei Lin, Qingli Guo, Jiawei Yin, Xiangyu Zuo, Rongqing Wang & Xiaorui Gong
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Wei Lin, Jiawei Yin, Xiangyu Zuo, Rongqing Wang & Xiaorui Gong

Corresponding author

Correspondence to Qingli Guo.

Editor information

Editors and Affiliations

University of California, Irvine, CA, USA
Gene Tsudik
University of Padua, Padua, Italy
Mauro Conti
Delft University of Technology, Delft, The Netherlands
Kaitai Liang
Delft University of Technology, Delft, The Netherlands
Georgios Smaragdakis

Appendix

Table 4 presents information on the vulnerabilities associated with inline functions recognized by FSmell. “Caller Functions” refers to the functions that invoke the inline functions. “CVEs” denotes the CVE numbers of the vulnerabilities in these inline functions. The column labeled “found?” indicates whether FSmell successfully recognized the boundaries of the inline functions.

Table 4. Vulnerability distribution. Inline functions are the functions found by FSmell.

About this paper

Cite this paper

Lin, W., Guo, Q., Yin, J., Zuo, X., Wang, R., Gong, X. (2024). FSmell: Recognizing Inline Function in Binary Code. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14345. Springer, Cham. https://doi.org/10.1007/978-3-031-51476-0_24

DOI: https://doi.org/10.1007/978-3-031-51476-0_24
Published: 11 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51475-3
Online ISBN: 978-3-031-51476-0
eBook Packages: Computer Science (R0)
