
DOI: 10.1109/ASE56229.2023.00164

Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers

Published: 26 September 2024

Abstract

Deep learning has been widely used in source code classification tasks, such as classifying code by its functionality, code authorship attribution, and vulnerability detection. Unfortunately, the black-box nature of deep learning makes it hard to interpret and understand why a classifier (i.e., classification model) makes a particular prediction on a given example. This lack of interpretability (or explainability) may have hindered the adoption of such classifiers by practitioners because it is unclear when a classifier's prediction should or should not be trusted. The lack of interpretability has motivated a number of studies in recent years. However, existing methods are neither robust nor able to cope with out-of-distribution examples. In this paper, we propose a novel method, dubbed Robin, for producing robust interpreters for a given deep learning-based code classifier. The key idea behind Robin is a novel hybrid structure that combines an interpreter with two approximators while leveraging the ideas of adversarial training and data augmentation. Experimental results show that, on average, the interpreter produced by Robin achieves 6.11% higher fidelity (evaluated on the classifier), 67.22% higher fidelity (evaluated on the approximator), and 15.87x higher robustness than the three existing interpreters we evaluated. Moreover, it is 47.31% less affected by out-of-distribution examples than LEMNA.
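The abstract describes the interpreter/approximator idea only at a high level. As a rough illustration (not the authors' implementation), the Python sketch below shows one way a fidelity-driven interpreter could be trained to mimic a target classifier's predictions from highlighted tokens; for brevity it uses a single approximator and a mixup-style mask augmentation, and all module names, sizes, and loss weightings are assumptions.

```python
# Minimal sketch, assuming PyTorch; illustrative only, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, SEQ = 5000, 64, 128, 256  # assumed vocabulary and model sizes


class Interpreter(nn.Module):
    """Scores each token; the scores act as a soft importance mask."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.score = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, tokens):                         # tokens: (B, SEQ)
        s = self.score(self.emb(tokens)).squeeze(-1)   # (B, SEQ)
        return torch.sigmoid(s)                        # importance mask in [0, 1]


class Approximator(nn.Module):
    """Predicts the target classifier's label from mask-weighted embeddings."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.clf = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, n_classes))

    def forward(self, tokens, mask):
        x = (self.emb(tokens) * mask.unsqueeze(-1)).mean(dim=1)  # (B, EMB)
        return self.clf(x)


def fidelity_loss(interp, approx, tokens, target_labels, mix_lambda=0.7):
    """Encourage the approximator, fed only the highlighted tokens, to match the
    target classifier's labels, including on mixup-style augmented masks."""
    mask = interp(tokens)
    loss = F.cross_entropy(approx(tokens, mask), target_labels)
    # Crude data augmentation: mix importance masks of shuffled examples.
    perm = torch.randperm(tokens.size(0))
    mixed_mask = mix_lambda * mask + (1 - mix_lambda) * mask[perm]
    loss = loss + F.cross_entropy(approx(tokens, mixed_mask), target_labels)
    return loss


if __name__ == "__main__":
    interp, approx = Interpreter(), Approximator()
    opt = torch.optim.Adam(list(interp.parameters()) + list(approx.parameters()), lr=1e-3)
    tokens = torch.randint(0, VOCAB, (8, SEQ))   # stand-in token ids
    labels = torch.randint(0, 10, (8,))          # stand-in classifier predictions
    loss = fidelity_loss(interp, approx, tokens, labels)
    loss.backward()
    opt.step()
    print(f"toy fidelity loss: {loss.item():.4f}")
```

In this sketch the interpreter is rewarded when the tokens it highlights are, by themselves, sufficient for the approximator to reproduce the classifier's decision; the mask mixing stands in for the data-augmentation component mentioned in the abstract.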

References

[1]
J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, and X. Liu, "A novel neural source code representation based on abstract syntax tree," in Proceedings of the 41st International Conference on Software Engineering (ICSE), QC, Canada. IEEE, 2019, pp. 783--794.
[2]
L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin, "Convolutional neural networks over tree structures for programming language processing," in Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona, USA. AAAI Press, 2016, pp. 1287--1293.
[3]
A. Caliskan-Islam, R. Harang, A. Liu, A. Narayanan, C. Voss, F. Yamaguchi, and R. Greenstadt, "De-anonymizing programmers via code stylometry," in Proceedings of the 24th USENIX Security Symposium (USENIX Security), Washington, D.C., USA, 2015, pp. 255--270.
[4]
B. Alsulami, E. Dauber, R. Harang, S. Mancoridis, and R. Greenstadt, "Source code authorship attribution using long short-term memory based networks," in Proceedings of the 22nd European Symposium on Research in Computer Security (ESORICS), Oslo, Norway, 2017, pp. 65--82.
[5]
X. Yang, G. Xu, Q. Li, Y. Guo, and M. Zhang, "Authorship attribution of source code by using back propagation neural network based on particle swarm optimization," PLoS ONE, vol. 12, no. 11, p. e0187204, 2017.
[6]
E. Bogomolov, V. Kovalenko, Y. Rebryk, A. Bacchelli, and T. Bryksin, "Authorship attribution of source code: A language-agnostic approach and applicability in software engineering," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 932--944.
[7]
M. Abuhamad, T. AbuHmed, A. Mohaisen, and D. Nyang, "Large-scale and language-oblivious code authorship identification," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, 2018, pp. 101--114.
[8]
G. Lin, J. Zhang, W. Luo, L. Pan, and Y. Xiang, "Poster: Vulnerability discovery with function representation learning from unlabeled projects," in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS), Dallas, TX, USA, 2017, pp. 2539--2541.
[9]
Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, Z. Deng, and Y. Zhong, "VulDeePecker: A deep learning-based system for vulnerability detection," in Proceedings of the 25th Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, 2018, pp. 1--15.
[10]
Z. Li, D. Zou, S. Xu, H. Jin, Y. Zhu, and Z. Chen, "SySeVR: A framework for using deep learning to detect software vulnerabilities," IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 4, pp. 2244--2258, 2022.
[11]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 2017, pp. 5998--6008.
[12]
E. Choi, M. T. Bahadori, J. Sun, J. Kulas, A. Schuetz, and W. Stewart, "Retain: An interpretable predictive model for healthcare using reverse time attention mechanism," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Barcelona, Spain, 2016, pp. 3504--3512.
[13]
M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why should I trust you?' Explaining the predictions of any classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 2016, pp. 1135--1144.
[14]
N. D. Bui, Y. Yu, and L. Jiang, "AutoFocus: Interpreting attention-based neural networks by code perturbation," in Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA, USA, 2019, pp. 38--41.
[15]
D. Zou, Y. Hu, W. Li, Y. Wu, H. Zhao, and H. Jin, "mVulPreter: A multi-granularity vulnerability detection system with interpretations," IEEE Transactions on Dependable and Secure Computing, pp. 1--12, 2022.
[16]
D. Zou, Y. Zhu, S. Xu, Z. Li, H. Jin, and H. Ye, "Interpreting deep learning-based vulnerability detector predictions based on heuristic searching," ACM Transactions on Software Engineering and Methodology, vol. 30, no. 2, pp. 1--31, 2021.
[17]
J. Cito, I. Dillig, V. Murali, and S. Chandra, "Counterfactual explanations for models of code," in Proceedings of the 44th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Pittsburgh, PA, USA, 2022, pp. 125--134.
[18]
S. Suneja, Y. Zheng, Y. Zhuang, J. A. Laredo, and A. Morari, "Probing model signal-awareness via prediction-preserving input minimization," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 945--955.
[19]
M. R. I. Rabin, V. J. Hellendoorn, and M. A. Alipour, "Understanding neural code intelligence through program simplification," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 441--452.
[20]
A. Zeller and R. Hildebrandt, "Simplifying and isolating failure-inducing input," IEEE Transactions on Software Engineering, vol. 28, no. 2, pp. 183--200, 2002.
[21]
S. Hooker, D. Erhan, P.-J. Kindermans, and B. Kim, "A benchmark for interpretability methods in deep neural networks," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 2019, pp. 9734--9745.
[22]
L. Brocki and N. C. Chung, "Evaluation of interpretability methods and perturbation artifacts in deep neural networks," arXiv preprint arXiv:2203.02928, 2022.
[23]
M. Bajaj, L. Chu, Z. Y. Xue, J. Pei, L. Wang, P. C.-H. Lam, and Y. Zhang, "Robust counterfactual explanations on graph neural networks," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Virtual Event, 2021, pp. 5644--5655.
[24]
X. Zhang, N. Wang, H. Shen, S. Ji, X. Luo, and T. Wang, "Interpretable deep learning under fire," in Proceedings of the 29th USENIX Security Symposium (USENIX Security), Virtual Event, 2020, pp. 1659--1676.
[25]
W. Guo, D. Mu, J. Xu, P. Su, G. Wang, and X. Xing, "LEMNA: Explaining deep learning based security applications," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, 2018, pp. 364--379.
[26]
J. Chen, L. Song, M. Wainwright, and M. Jordan, "Learning to explain: An information-theoretic perspective on model interpretation," in Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholmsmässan, Stockholm, Sweden, 2018, pp. 883--892.
[27]
H. Lakkaraju, N. Arsov, and O. Bastani, "Robust and stable black box explanations," in Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual Event, 2020, pp. 5628--5638.
[28]
E. La Malfa, A. Zbrzezny, R. Michelmore, N. Paoletti, and M. Kwiatkowska, "On guaranteed optimal robust explanations for NLP models," in Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI), Virtual Event, 2021, pp. 2658--2665.
[29]
H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, "mixup: Beyond empirical risk minimization," in Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 2018.
[30]
Z. Li, G. Chen, C. Chen, Y. Zou, and S. Xu, "RopGen: Towards robust code authorship attribution via automatic coding style transformation," in Proceedings of the 44th International Conference on Software Engineering (ICSE), Pittsburgh, PA, USA, 2022, pp. 1906--1918.
[31]
M. Levandowsky and D. Winter, "Distance between sets," Nature, vol. 234, no. 5323, pp. 34--35, 1971.
[32]
J. Liang, B. Bai, Y. Cao, K. Bai, and F. Wang, "Adversarial infidelity learning for model interpretation," in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), Virtual Event, 2020, pp. 286--296.
[33]
M. B. Zafar, M. Donini, D. Slack, C. Archambeau, S. Das, and K. Kenthapadi, "On the lack of robust interpretability of neural text classifiers," in Findings of the Association for Computational Linguistics (ACL/IJCNLP), Virtual Event, 2021, pp. 3730--3740.
[34]
Google Code Jam, https://codingcompetitions.withgoogle.com/codejam, 2022.
[35]
E. Quiring, A. Maier, and K. Rieck, "Misleading authorship attribution of source code using adversarial learning," in Proceedings of the 28th USENIX Security Symposium, Santa Clara, CA, USA. USENIX Association, 2019, pp. 479--496.
[36]
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. A. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: A system for large-scale machine learning," in Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Savannah, GA, USA. USENIX Association, 2016, pp. 265--283.
[37]
Y. Wang, W. Wang, S. Joty, and S. C. Hoi, "CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation," in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 2021, pp. 8696--8708.
[38]
Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou, "CodeBERT: A pre-trained model for programming and natural languages," in Proceedings of Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 2020, pp. 1536--1547.
[39]
R. Russell, L. Kim, L. Hamilton, T. Lazovich, J. Harer, O. Ozdemir, P. Ellingwood, and M. McConley, "Automated vulnerability detection in source code using deep representation learning," in Proceedings of the 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA. IEEE, 2018, pp. 757--762.
[40]
U. Alon, M. Zilberstein, O. Levy, and E. Yahav, "code2vec: Learning distributed representations of code," Proc. ACM Program. Lang., vol. 3, no. POPL, pp. 1--29, 2019.
[41]
M. Allamanis, M. Brockschmidt, and M. Khademi, "Learning to represent programs with graphs," in Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 2018.
[42]
N. Puri, P. Gupta, P. Agarwal, S. Verma, and B. Krishnamurthy, "MAGIX: Model agnostic globally interpretable explanations," arXiv preprint arXiv:1706.07160, 2017.
[43]
J. Wang, L. Gou, W. Zhang, H. Yang, and H.-W. Shen, "DeepVID: Deep visual interpretation and diagnosis for image classifiers via knowledge distillation," IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 6, pp. 2168--2180, 2019.
[44]
K. Simonyan, A. Vedaldi, and A. Zisserman, "Deep inside convolutional networks: Visualising image classification models and saliency maps," in Proceedings of the 2nd International Conference on Learning Representations (ICLR), Banff, AB, Canada, 2014.
[45]
S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 2017, pp. 4765--4774.
[46]
P. Schwab and W. Karlen, "CXPlain: Causal explanations for model interpretation under uncertainty," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 2019, pp. 10220--10230.
[47]
D. Smilkov, N. Thorat, B. Kim, F. Viégas, and M. Wattenberg, "SmoothGrad: Removing noise by adding noise," arXiv preprint arXiv:1706.03825, 2017.
[48]
L. Rieger and L. K. Hansen, "A simple defense against adversarial attacks on heatmap explanations," in Proceedings of the 2020 Workshop on Human Interpretability in Machine Learning (WHI), Virtual Event, 2020.
[49]
Z. Wang, H. Wang, S. Ramkumar, P. Mardziel, M. Fredrikson, and A. Datta, "Smoothed geometry for robust attribution," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Virtual Event, 2020, pp. 13623--13634.
[50]
X. Zhao, W. Huang, X. Huang, V. Robu, and D. Flynn, "BayLIME: Bayesian local interpretable model-agnostic explanations," in Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), Virtual Event, 2021, pp. 887--896.
[51]
Z. Zhou, G. Hooker, and F. Wang, "S-LIME: Stabilized-LIME for model explanation," in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD), Virtual Event, 2021, pp. 2429--2438.

Cited By

  • (2024) Snopy: Bridging Sample Denoising with Causal Graph Learning for Effective Vulnerability Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 606--618. DOI: 10.1145/3691620.3695057. Online publication date: 27-Oct-2024.
  • (2024) Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 493--505. DOI: 10.1145/3691620.3695021. Online publication date: 27-Oct-2024.

Published In

ASE '23: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering
November 2023
2161 pages
ISBN: 9798350329964

In-Cooperation

  • University of Luxembourg
  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. explainable AI
  2. deep learning
  3. code classification
  4. robustness

Qualifiers

  • Research-article

Conference

ASE '23

Acceptance Rates

Overall Acceptance Rate: 82 of 337 submissions (24%)
