DEGNN: A Deep Learning-Based Method for Unmanned Aerial Vehicle Software Security Analysis
Figure 1. Binary function dual-embedding feature extraction.
Figure 2. Binary function similarity prediction network.
Figure 3. Model performance on the validation set.
Figure 4. ROC curves and AUC scores on the same-architecture comparison dataset.
Figure 5. ROC curves and AUC scores on the cross-architecture comparison dataset.
Figure 6. Time efficiency comparison.
Abstract
1. Introduction
- We introduce a dual-embedding scheme for functions based on CECGs and GNNs. This scheme avoids the language model pre-training that NLP-based representation methods require, and it strengthens the representation of calling information during node feature initialization.
- We propose a binary function similarity analysis method that combines a neural tensor network (NTN) [8] with node statistical matching. This method recasts binary function similarity analysis as similarity score prediction, so that binary functions are compared at both the node level and the graph level.
- We have implemented DEGNN and evaluated it in extensive experiments. In these experiments, DEGNN outperforms the compared state-of-the-art (SOTA) baselines (SAFE, jTrans, and Asteria-Pro) in both MRR and Recall@1. The code for DEGNN is publicly available at https://github.com/kidding1412/DEGNN, accessed on 1 January 2025.
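To make the NTN-based comparison in the second contribution more concrete, the following is a minimal sketch of an NTN interaction layer in the general form given in [8]: each of the k slices combines a bilinear term, a linear term over the concatenated embeddings, and a bias, passed through tanh. The embedding size, the number of slices, the random initialization, and the function names `ntn_interaction` and `random_parameters` are illustrative assumptions and are not taken from the DEGNN implementation. The sketch returns the k interaction values without any further reduction to a single score.

```python
import math
import random


def ntn_interaction(h1, h2, W, V, b):
    """Return k interaction scores between two embeddings h1 and h2 (length d each).

    W: list of k matrices of shape d x d (one bilinear slice per score).
    V: list of k rows of length 2d (linear term over the concatenation [h1; h2]).
    b: list of k biases.
    """
    d = len(h1)
    concat = h1 + h2  # [h1; h2], length 2d
    scores = []
    for W_k, V_k, b_k in zip(W, V, b):
        # Bilinear term: h1^T W_k h2
        bilinear = sum(h1[i] * W_k[i][j] * h2[j] for i in range(d) for j in range(d))
        # Linear term: V_k [h1; h2]
        linear = sum(v * x for v, x in zip(V_k, concat))
        scores.append(math.tanh(bilinear + linear + b_k))
    return scores


def random_parameters(d, k, seed=0):
    """ASSUMPTION: small uniform random weights stand in for trained parameters."""
    rng = random.Random(seed)
    W = [[[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)] for _ in range(k)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(2 * d)] for _ in range(k)]
    b = [0.0] * k
    return W, V, b


if __name__ == "__main__":
    d, k = 4, 3  # hypothetical embedding size and number of tensor slices
    W, V, b = random_parameters(d, k)
    f1 = [0.2, -0.1, 0.5, 0.3]  # hypothetical graph-level embedding of function 1
    f2 = [0.1, 0.0, 0.4, 0.3]   # hypothetical graph-level embedding of function 2
    print(ntn_interaction(f1, f2, W, V, b))
```

In a full pipeline, a vector like this would typically be passed to later layers that produce the final similarity score; that part is omitted here.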
2. Background
2.1. BCSA Technology
2.2. Cross-Architecture Challenges
2.3. Problem Definition
2.3.1. Definition 1: Function
2.3.2. Definition 2: Similarity
3. Related Work
3.1. Semantic Feature-Based Methods
3.2. Structural Feature-Based Methods
3.3. Methods Combining Semantic and Structural Features
4. DEGNN Approach
4.1. Overview
4.2. Function Dual-Embedding Feature Extraction
4.2.1. Constructing CECGs
4.2.2. Node Feature Extraction
4.2.3. Function Feature Extraction
4.3. Similarity Prediction Network
4.3.1. Neural Tensor Network
4.3.2. Node Comparison
4.3.3. Fully Connected Network
5. Experimentation and Evaluation
5.1. Dataset
5.2. Baseline
5.3. Parameter Settings and Model Training
5.4. Evaluation Metrics
5.5. Same-Architecture BCSA Tasks
5.5.1. Experiment 1: One-to-One Binary Function Similarity Matching
5.5.2. Experiment 2: Same-Architecture Binary Function Search
5.6. Cross-Architectural BCSA Tasks
5.6.1. Experiment 1: One-to-One Binary Function Similarity Matching
5.6.2. Experiment 2: Cross-Architectural Binary Function Search
5.7. Ablation Experiment
5.8. Vulnerability Search
5.9. Time Efficiency Analysis
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Conti, F.C.; Santoro, C.; Santoro, F.F. Twinflie: A Digital Twin UAV Orchestrator and Simulator. In Proceedings of the 2023 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Abu Dhabi, United Arab Emirates, 14–17 November 2023; pp. 258–263.
- D’Urso, F.; Santoro, C.; Santoro, F.F. Integrating Heterogeneous Tools for Physical Simulation of multi-Unmanned Aerial Vehicles. In Proceedings of the 19th Workshop from Objects to Agents, Palermo, Italy, 28–29 June 2018; pp. 10–15. Available online: https://ceur-ws.org/Vol-2215/paper_2.pdf (accessed on 1 January 2025).
- Qu, Y.; Dai, H.; Zhuang, Y.; Chen, J.; Dong, C.; Wu, F.; Guo, S. Decentralized Federated Learning for UAV Networks: Architecture, Challenges, and Opportunities. IEEE Netw. 2021, 35, 156–162.
- Wazid, M.; Bera, B.; Das, A.K.; Garg, S.; Niyato, D.; Hossain, M.S. Secure Communication Framework for Blockchain-Based Internet of Drones-Enabled Aerial Computing Deployment. IEEE Internet Things Mag. 2021, 4, 120–126.
- Miao, S.; Pan, Q. Risk Assessment of UAV Cyber Range Based on Bayesian–Nash Equilibrium. Drones 2024, 8, 556.
- Kim, D.; Kim, E.; Cha, S.K.; Son, S.; Kim, Y. Revisiting Binary Code Similarity Analysis Using Interpretable Feature Engineering and Lessons Learned. IEEE Trans. Softw. Eng. 2022, 49, 1661–1682.
- Qasem, A.; Debbabi, M.; Lebel, B.; Kassouf, M. Binary Function Clone Search in the Presence of Code Obfuscation and Optimization over Multi-CPU Architectures. In Proceedings of the ASIA CCS ’23: 2023 ACM Asia Conference on Computer and Communications Security, Melbourne, VIC, Australia, 10–14 July 2023; pp. 443–456.
- Socher, R.; Chen, D.; Manning, C.D.; Ng, A.Y. Reasoning with Neural Tensor Networks for Knowledge Base Completion. In Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, NV, USA, 5–8 December 2013; pp. 926–934.
- Wang, H.; Qu, W.; Katz, G.; Zhu, W.; Gao, Z.; Qiu, H.; Zhuge, J.; Zhang, C. jTrans: Jump-Aware Transformer for Binary Code Similarity Detection. In Proceedings of the ISSTA ’22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, Republic of Korea, 18–22 July 2022; pp. 1–13.
- Luo, Z.; Wang, P.; Wang, B.; Tang, Y.; Xie, W.; Zhou, X.; Liu, D.; Lu, K. VulHawk: Cross-architecture Vulnerability Detection with Entropy-based Binary Code Search. In Proceedings of the 30th Annual Network and Distributed System Security Symposium, NDSS 2023, San Diego, CA, USA, 27 February–3 March 2023.
- Synopsys, I. Heartbleed Bug. 2020. Available online: https://heartbleed.com/ (accessed on 1 January 2025).
- Zuo, F.; Li, X.; Young, P.; Luo, L.; Zeng, Q.; Zhang, Z. Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs. In Proceedings of the 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, CA, USA, 24–27 February 2019.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Massarelli, L.; Luna, G.A.D.; Petroni, F.; Baldoni, R.; Querzoni, L. SAFE: Self-Attentive Function Embeddings for Binary Similarity. In Detection of Intrusions and Malware, and Vulnerability Assessment, Proceedings of the 16th International Conference, DIMVA 2019, Gothenburg, Sweden, 19–20 June 2019, Proceedings; Perdisci, R., Maurice, C., Giacinto, G., Almgren, M., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2019; Volume 11543, pp. 309–329.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. In Proceedings of the 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, AZ, USA, 2–4 May 2013.
- Lin, Z.; Feng, M.; dos Santos, C.N.; Yu, M.; Xiang, B.; Zhou, B.; Bengio, Y. A Structured Self-Attentive Sentence Embedding. In Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017.
- Tian, D.; Jia, X.; Ma, R.; Liu, S.; Liu, W.; Hu, C. BinDeep: A Deep Learning Approach to Binary Code Similarity Detection. Expert Syst. Appl. 2021, 168, 114348.
- Lipton, Z.C.; Berkowitz, J.; Elkan, C. A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv 2015, arXiv:1506.00019.
- Collyer, J.; Watson, T.; Phillips, I. FASER: Binary Code Similarity Search through the Use of Intermediate Representations. In Proceedings of the Conference on Applied Machine Learning in Information Security, Arlington, VA, USA, 19–20 October 2023; Volume 3652, pp. 193–202.
- Radareorg. Radare2: UNIX-Like Reverse Engineering Framework and Command-Line Toolset (Version 5.9.9). GitHub. Available online: https://github.com/radareorg/radare2/releases (accessed on 1 January 2025).
- Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The Long-Document Transformer. arXiv 2020, arXiv:2004.05150.
- Xu, X.; Liu, C.; Feng, Q.; Yin, H.; Song, L.; Song, D. Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 363–376.
- Gao, J.; Yang, X.; Fu, Y.; Jiang, Y.; Sun, J. VulSeeker: A Semantic Learning Based Vulnerability Seeker for Cross-Platform Binary. In Proceedings of the 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), Montpellier, France, 3–7 September 2018; pp. 896–899.
- Dai, H.; Dai, B.; Song, L. Discriminative Embeddings of Latent Variable Models for Structured Data. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York, NY, USA, 19–24 June 2016; Volume 48, pp. 2702–2711.
- Luo, M.; Yang, C.; Gong, X.; Yu, L. FuncNet: A Euclidean Embedding Approach for Lightweight Cross-platform Binary Recognition. In Security and Privacy in Communication Networks, Proceedings of the 15th EAI International Conference, SecureComm 2019, Orlando, FL, USA, 23–25 October 2019; Chen, S., Choo, K.K.R., Fu, X., Lou, W., Mohaisen, A., Eds.; Springer: Cham, Switzerland, 2019; pp. 319–337.
- Li, Y.; Gu, C.; Dullien, T.; Vinyals, O.; Kohli, P. Graph Matching Networks for Learning the Similarity of Graph Structured Objects. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 3835–3845.
- Yang, S.; Dong, C.; Xiao, Y.; Cheng, Y.; Shi, Z.; Li, Z.; Sun, L. Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge. ACM Trans. Softw. Eng. Methodol. 2024, 33, 1:1–1:40.
- Tai, K.S.; Socher, R.; Manning, C.D. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, Beijing, China, 26–31 July 2015; Volume 1, pp. 1556–1566.
- Ding, S.H.H.; Fung, B.C.M.; Charland, P. Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 472–489.
- Le, Q.V.; Mikolov, T. Distributed Representations of Sentences and Documents. In Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21–26 June 2014; Volume 32, pp. 1188–1196.
- Massarelli, L.; Luna, G.; Petroni, F.; Querzoni, L. Investigating Graph Embedding Neural Networks with Unsupervised Features Extraction for Binary Analysis. In Proceedings of the Workshop on Binary Analysis Research (BAR) 2019, San Diego, CA, USA, 24 February 2019.
- Duan, Y.; Li, X.; Wang, J.; Yin, H. DeepBinDiff: Learning Program-Wide Code Representations for Binary Diffing. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2020, San Diego, CA, USA, 23–26 February 2020.
- Yu, Z.; Cao, R.; Tang, Q.; Nie, S.; Huang, J.; Wu, S. Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 February 2020; pp. 1145–1152.
- Yang, J.; Fu, C.; Liu, X.Y.; Yin, H.; Zhou, P. Codee: A Tensor Embedding Scheme for Binary Code Search. IEEE Trans. Softw. Eng. 2022, 48, 2224–2244.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; Volume 1, pp. 4171–4186.
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural Message Passing for Quantum Chemistry. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017; Volume 70, pp. 1263–1272.
- Wang, H.; Gao, Z.; Zhang, C.; Sun, M.; Zhou, Y.; Qiu, H.; Xiao, X. CEBin: A Cost-Effective Framework for Large-Scale Binary Code Similarity Detection. In Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024), Vienna, Austria, 16–20 September 2024.
- Wang, H.; Gao, Z.; Zhang, C.; Sha, Z.; Sun, M.; Zhou, Y.; Zhu, W.; Sun, W.; Qiu, H.; Xiao, X. CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision. In Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024), Vienna, Austria, 16–20 September 2024.
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907.
- Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. In Proceedings of the 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, 30 April–3 May 2018.
Position | Content |
---|---|
0–2 | Node ID |
3–7 | Node length |
8–10 | Constant number |
11–13 | Number of calls |
14–16 | Number of strings |
17–19 | Out-degree |
20–22 | In-degree |
23 | Compiler |
24–25 | Architecture |
26 | Bit |
27–28 | Optimization |
29 | Ret flag |
30 | First block flag |
31 | Instruction block flag |
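
The position table above can be read as a fixed 32-slot feature vector in which each field owns an inclusive range of slots. The following is a minimal sketch of filling such a vector. The slot ranges are copied from the table, but the way a value is written into its slots (zero-padded decimal digits) is purely an assumption made for illustration, since the table does not state the encoding. The names `NODE_LAYOUT`, `to_digits`, and `build_node_vector` are hypothetical.

```python
# Inclusive slot ranges copied from the position table above.
NODE_LAYOUT = {
    "node_id": (0, 2),
    "node_length": (3, 7),
    "constant_number": (8, 10),
    "number_of_calls": (11, 13),
    "number_of_strings": (14, 16),
    "out_degree": (17, 19),
    "in_degree": (20, 22),
    "compiler": (23, 23),
    "architecture": (24, 25),
    "bit": (26, 26),
    "optimization": (27, 28),
    "ret_flag": (29, 29),
    "first_block_flag": (30, 30),
    "instruction_block_flag": (31, 31),
}
VECTOR_SIZE = 32


def to_digits(value, width):
    """ASSUMPTION: encode a non-negative integer as zero-padded decimal digits."""
    text = str(value).zfill(width)
    if value < 0 or len(text) > width:
        raise ValueError(f"value {value} does not fit into {width} slots")
    return [int(ch) for ch in text]


def build_node_vector(fields):
    """Write each named field into its slot range; unspecified fields stay 0."""
    vector = [0] * VECTOR_SIZE
    for name, value in fields.items():
        start, end = NODE_LAYOUT[name]
        vector[start:end + 1] = to_digits(value, end - start + 1)
    return vector


if __name__ == "__main__":
    # Hypothetical basic block: ID 7, length 123, two calls, ends with a return.
    example = build_node_vector(
        {"node_id": 7, "node_length": 123, "number_of_calls": 2, "ret_flag": 1}
    )
    print(example)
```

Keeping the layout in a single dictionary makes it easy to check that the ranges tile all 32 positions without gaps or overlaps.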
Position | Content |
---|---|
0–2 | Node ID |
3–16 | Call hash |
17–18 | Call count in the function |
19–20 | Called count in the function |
21–22 | Called count |
23–24 | Call count in the function |
25–26 | Call count |
27–28 | External call count in the function |
29–30 | Total external call count |
31 | Call block flag |
Model | MRR (O0, O3) | MRR (O1, O3) | MRR (O2, O3) | MRR (O0, Os) | MRR (O1, Os) | MRR (O2, Os) | MRR (Avg.) | Recall@1 (O0, O3) | Recall@1 (O1, O3) | Recall@1 (O2, O3) | Recall@1 (O0, Os) | Recall@1 (O1, Os) | Recall@1 (O2, Os) | Recall@1 (Avg.) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SAFE | 0.122 | 0.252 | 0.617 | 0.155 | 0.361 | 0.385 | 0.315 | 0.091 | 0.263 | 0.609 | 0.102 | 0.283 | 0.326 | 0.279 |
jTrans | 0.481 | 0.682 | 0.749 | 0.552 | 0.683 | 0.697 | 0.641 | 0.392 | 0.603 | 0.687 | 0.458 | 0.627 | 0.688 | 0.576 |
Asteria-Pro | 0.463 | 0.650 | 0.711 | 0.527 | 0.662 | 0.681 | 0.616 | 0.301 | 0.554 | 0.627 | 0.417 | 0.591 | 0.675 | 0.528 |
DEGNN-noCECG | 0.302 | 0.463 | 0.652 | 0.331 | 0.519 | 0.541 | 0.468 | 0.224 | 0.481 | 0.528 | 0.283 | 0.426 | 0.509 | 0.409 |
DEGNN-noTNT | 0.429 | 0.587 | 0.672 | 0.481 | 0.605 | 0.614 | 0.565 | 0.287 | 0.560 | 0.608 | 0.323 | 0.419 | 0.653 | 0.475 |
DEGNN | 0.498 | 0.701 | 0.772 | 0.581 | 0.696 | 0.713 | 0.660 | 0.405 | 0.631 | 0.702 | 0.492 | 0.641 | 0.695 | 0.594 |
Model | MRR (O0, O3) | MRR (O1, O3) | MRR (O2, O3) | MRR (O0, Os) | MRR (O1, Os) | MRR (O2, Os) | MRR (Avg.) | Recall@1 (O0, O3) | Recall@1 (O1, O3) | Recall@1 (O2, O3) | Recall@1 (O0, Os) | Recall@1 (O1, Os) | Recall@1 (O2, Os) | Recall@1 (Avg.) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SAFE | 0.101 | 0.217 | 0.519 | 0.127 | 0.314 | 0.326 | 0.267 | 0.078 | 0.229 | 0.528 | 0.088 | 0.244 | 0.288 | 0.243 |
Asteria-Pro | 0.431 | 0.574 | 0.668 | 0.489 | 0.617 | 0.644 | 0.571 | 0.256 | 0.471 | 0.533 | 0.354 | 0.502 | 0.574 | 0.448 |
DEGNN-noCECG | 0.277 | 0.429 | 0.613 | 0.302 | 0.487 | 0.522 | 0.438 | 0.190 | 0.409 | 0.449 | 0.240 | 0.362 | 0.433 | 0.347 |
DEGNN-noTNT | 0.403 | 0.552 | 0.656 | 0.462 | 0.561 | 0.579 | 0.536 | 0.244 | 0.482 | 0.517 | 0.275 | 0.356 | 0.555 | 0.405 |
DEGNN | 0.471 | 0.689 | 0.752 | 0.567 | 0.663 | 0.691 | 0.639 | 0.392 | 0.607 | 0.683 | 0.477 | 0.609 | 0.687 | 0.576 |
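
The MRR and Recall@1 columns in the tables above follow the standard retrieval definitions: for each query function, the candidates are ranked by predicted similarity, MRR averages the reciprocal rank of the true match, and Recall@1 is the fraction of queries whose true match is ranked first. The sketch below illustrates these definitions only. The helper names and the toy scores are hypothetical, and breaking ties in favor of the target is an assumption.

```python
def rank_of_target(scores, target_index):
    """1-based rank of the ground-truth candidate among all candidates.

    ASSUMPTION: ties are resolved in favor of the target, so only candidates
    with a strictly higher score are counted ahead of it.
    """
    target_score = scores[target_index]
    return 1 + sum(1 for s in scores if s > target_score)


def mrr_and_recall_at_k(ranks, k=1):
    """Compute mean reciprocal rank and Recall@k from 1-based ranks."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    recall = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, recall


if __name__ == "__main__":
    # Toy example: four queries whose true matches are ranked 1, 2, 4, and 1.
    mrr, recall_at_1 = mrr_and_recall_at_k([1, 2, 4, 1], k=1)
    print(mrr, recall_at_1)  # 0.6875 0.5

    # Rank for a single query from hypothetical similarity scores.
    print(rank_of_target([0.2, 0.9, 0.5], target_index=2))  # 2
```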
Model | SAFE | Asteria-Pro | DEGNN |
---|---|---|---|
Average Recall@1 | 0.25 | 0.60 | 0.80 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Du, J.; Wei, Q.; Wang, Y.; Bai, X. DEGNN: A Deep Learning-Based Method for Unmanned Aerial Vehicle Software Security Analysis. Drones 2025, 9, 110. https://doi.org/10.3390/drones9020110