
DOI: 10.1109/ASE56229.2023.00164

Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers

Published: 26 September 2024

Abstract

Deep learning has been widely used in source code classification tasks, such as classifying code by its functionality, code authorship attribution, and vulnerability detection. Unfortunately, the black-box nature of deep learning makes it hard to interpret and understand why a classifier (i.e., classification model) makes a particular prediction on a given example. This lack of interpretability (or explainability) may have hindered the adoption of such classifiers by practitioners because it is unclear when a classifier's prediction should or should not be trusted. The lack of interpretability has motivated a number of studies in recent years. However, existing methods are neither robust nor able to cope with out-of-distribution examples. In this paper, we propose a novel method, dubbed Robin, for producing robust interpreters for a given deep learning-based code classifier. The key idea behind Robin is a novel hybrid structure that combines an interpreter with two approximators while leveraging the ideas of adversarial training and data augmentation. Experimental results show that, on average, the interpreter produced by Robin achieves 6.11% higher fidelity (evaluated on the classifier), 67.22% higher fidelity (evaluated on the approximator), and 15.87x higher robustness than the three existing interpreters we evaluated. Moreover, it is 47.31% less affected by out-of-distribution examples than LEMNA.
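The abstract describes the interpreter/approximator idea only at a high level. As a rough illustration (not the authors' implementation), the Python sketch below shows one way a fidelity-driven interpreter could be trained to mimic a target classifier's predictions from highlighted tokens; for brevity it uses a single approximator and a mixup-style mask augmentation, and all module names, sizes, and loss weightings are assumptions.

```python
# Minimal sketch, assuming PyTorch; illustrative only, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, SEQ = 5000, 64, 128, 256  # assumed vocabulary and model sizes


class Interpreter(nn.Module):
    """Scores each token; the scores act as a soft importance mask."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.score = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, tokens):                         # tokens: (B, SEQ)
        s = self.score(self.emb(tokens)).squeeze(-1)   # (B, SEQ)
        return torch.sigmoid(s)                        # importance mask in [0, 1]


class Approximator(nn.Module):
    """Predicts the target classifier's label from mask-weighted embeddings."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.clf = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, n_classes))

    def forward(self, tokens, mask):
        x = (self.emb(tokens) * mask.unsqueeze(-1)).mean(dim=1)  # (B, EMB)
        return self.clf(x)


def fidelity_loss(interp, approx, tokens, target_labels, mix_lambda=0.7):
    """Encourage the approximator, fed only the highlighted tokens, to match the
    target classifier's labels, including on mixup-style augmented masks."""
    mask = interp(tokens)
    loss = F.cross_entropy(approx(tokens, mask), target_labels)
    # Crude data augmentation: mix importance masks of shuffled examples.
    perm = torch.randperm(tokens.size(0))
    mixed_mask = mix_lambda * mask + (1 - mix_lambda) * mask[perm]
    loss = loss + F.cross_entropy(approx(tokens, mixed_mask), target_labels)
    return loss


if __name__ == "__main__":
    interp, approx = Interpreter(), Approximator()
    opt = torch.optim.Adam(list(interp.parameters()) + list(approx.parameters()), lr=1e-3)
    tokens = torch.randint(0, VOCAB, (8, SEQ))   # stand-in token ids
    labels = torch.randint(0, 10, (8,))          # stand-in classifier predictions
    loss = fidelity_loss(interp, approx, tokens, labels)
    loss.backward()
    opt.step()
    print(f"toy fidelity loss: {loss.item():.4f}")
```

In this sketch the interpreter is rewarded when the tokens it highlights are, by themselves, sufficient for the approximator to reproduce the classifier's decision; the mask mixing stands in for the data-augmentation component mentioned in the abstract.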

References

[1]
J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, and X. Liu, "A novel neural source code representation based on abstract syntax tree," in Proceedings of the 41st International Conference on Software Engineering (ICSE), QC, Canada. IEEE, 2019, pp. 783--794.
[2]
L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin, "Convolutional neural networks over tree structures for programming language processing," in Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona, USA. AAAI Press, 2016, pp. 1287--1293.
[3]
A. Caliskan-Islam, R. Harang, A. Liu, A. Narayanan, C. Voss, F. Yamaguchi, and R. Greenstadt, "De-anonymizing programmers via code stylometry," in Proceedings of the 24th USENIX Security Symposium (USENIX Security), Washington, D.C., USA, 2015, pp. 255--270.
[4]
B. Alsulami, E. Dauber, R. Harang, S. Mancoridis, and R. Greenstadt, "Source code authorship attribution using long short-term memory based networks," in Proceedings of the 22nd European Symposium on Research in Computer Security (ESORICS), Oslo, Norway, 2017, pp. 65--82.
[5]
X. Yang, G. Xu, Q. Li, Y. Guo, and M. Zhang, "Authorship attribution of source code by using back propagation neural network based on particle swarm optimization," PLoS ONE, vol. 12, no. 11, p. e0187204, 2017.
[6]
E. Bogomolov, V. Kovalenko, Y. Rebryk, A. Bacchelli, and T. Bryksin, "Authorship attribution of source code: A language-agnostic approach and applicability in software engineering," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 932--944.
[7]
M. Abuhamad, T. AbuHmed, A. Mohaisen, and D. Nyang, "Large-scale and language-oblivious code authorship identification," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, 2018, pp. 101--114.
[8]
G. Lin, J. Zhang, W. Luo, L. Pan, and Y. Xiang, "Poster: Vulnerability discovery with function representation learning from unlabeled projects," in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS), Dallas, TX, USA, 2017, pp. 2539--2541.
[9]
Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, Z. Deng, and Y. Zhong, "VulDeePecker: A deep learning-based system for vulnerability detection," in Proceedings of the 25th Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, 2018, pp. 1--15.
[10]
Z. Li, D. Zou, S. Xu, H. Jin, Y. Zhu, and Z. Chen, "SySeVR: A framework for using deep learning to detect software vulnerabilities," IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 4, pp. 2244--2258, 2022.
[11]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 2017, pp. 5998--6008.
[12]
E. Choi, M. T. Bahadori, J. Sun, J. Kulas, A. Schuetz, and W. Stewart, "Retain: An interpretable predictive model for healthcare using reverse time attention mechanism," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Barcelona, Spain, 2016, pp. 3504--3512.
[13]
M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why should I trust you?' Explaining the predictions of any classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 2016, pp. 1135--1144.
[14]
N. D. Bui, Y. Yu, and L. Jiang, "AutoFocus: Interpreting attention-based neural networks by code perturbation," in Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA, USA, 2019, pp. 38--41.
[15]
D. Zou, Y. Hu, W. Li, Y. Wu, H. Zhao, and H. Jin, "mVulPreter: A multi-granularity vulnerability detection system with interpretations," IEEE Transactions on Dependable and Secure Computing, pp. 1--12, 2022.
[16]
D. Zou, Y. Zhu, S. Xu, Z. Li, H. Jin, and H. Ye, "Interpreting deep learning-based vulnerability detector predictions based on heuristic searching," ACM Transactions on Software Engineering and Methodology, vol. 30, no. 2, pp. 1--31, 2021.
[17]
J. Cito, I. Dillig, V. Murali, and S. Chandra, "Counterfactual explanations for models of code," in Proceedings of the 44th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Pittsburgh, PA, USA, 2022, pp. 125--134.
[18]
S. Suneja, Y. Zheng, Y. Zhuang, J. A. Laredo, and A. Morari, "Probing model signal-awareness via prediction-preserving input minimization," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 945--955.
[19]
M. R. I. Rabin, V. J. Hellendoorn, and M. A. Alipour, "Understanding neural code intelligence through program simplification," in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Athens, Greece, 2021, pp. 441--452.
[20]
A. Zeller and R. Hildebrandt, "Simplifying and isolating failure-inducing input," IEEE Transactions on Software Engineering, vol. 28, no. 2, pp. 183--200, 2002.
[21]
S. Hooker, D. Erhan, P.-J. Kindermans, and B. Kim, "A benchmark for interpretability methods in deep neural networks," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 2019, pp. 9734--9745.
[22]
L. Brocki and N. C. Chung, "Evaluation of interpretability methods and perturbation artifacts in deep neural networks," arXiv preprint arXiv:2203.02928, 2022.
[23]
M. Bajaj, L. Chu, Z. Y. Xue, J. Pei, L. Wang, P. C.-H. Lam, and Y. Zhang, "Robust counterfactual explanations on graph neural networks," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Virtual Event, 2021, pp. 5644--5655.
[24]
X. Zhang, N. Wang, H. Shen, S. Ji, X. Luo, and T. Wang, "Interpretable deep learning under fire," in Proceedings of the 29th USENIX Security Symposium (USENIX Security), Virtual Event, 2020, pp. 1659--1676.
[25]
W. Guo, D. Mu, J. Xu, P. Su, G. Wang, and X. Xing, "LEMNA: Explaining deep learning based security applications," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, 2018, pp. 364--379.
[26]
J. Chen, L. Song, M. Wainwright, and M. Jordan, "Learning to explain: An information-theoretic perspective on model interpretation," in Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholmsmässan, Stockholm, Sweden, 2018, pp. 883--892.
[27]
H. Lakkaraju, N. Arsov, and O. Bastani, "Robust and stable black box explanations," in Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual Event, 2020, pp. 5628--5638.
[28]
E. La Malfa, A. Zbrzezny, R. Michelmore, N. Paoletti, and M. Kwiatkowska, "On guaranteed optimal robust explanations for NLP models," in Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI), Virtual Event, 2021, pp. 2658--2665.
[29]
H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, "mixup: Beyond empirical risk minimization," in Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 2018.
[30]
Z. Li, G. Chen, C. Chen, Y. Zou, and S. Xu, "RopGen: Towards robust code authorship attribution via automatic coding style transformation," in Proceedings of the 44th International Conference on Software Engineering (ICSE), Pittsburgh, PA, USA, 2022, pp. 1906--1918.
[31]
M. Levandowsky and D. Winter, "Distance between sets," Nature, vol. 234, no. 5323, pp. 34--35, 1971.
[32]
J. Liang, B. Bai, Y. Cao, K. Bai, and F. Wang, "Adversarial infidelity learning for model interpretation," in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), Virtual Event, 2020, pp. 286--296.
[33]
M. B. Zafar, M. Donini, D. Slack, C. Archambeau, S. Das, and K. Kenthapadi, "On the lack of robust interpretability of neural text classifiers," in Findings of the Association for Computational Linguistics (ACL/IJCNLP), Virtual Event, 2021, pp. 3730--3740.
[34]
Google Code Jam, https://codingcompetitions.withgoogle.com/codejam, 2022.
[35]
E. Quiring, A. Maier, and K. Rieck, "Misleading authorship attribution of source code using adversarial learning," in Proceedings of the 28th USENIX Security Symposium, Santa Clara, CA, USA. USENIX Association, 2019, pp. 479--496.
[36]
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. A. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: A system for large-scale machine learning," in Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Savannah, GA, USA. USENIX Association, 2016, pp. 265--283.
[37]
Y. Wang, W. Wang, S. Joty, and S. C. Hoi, "CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation," in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 2021, pp. 8696--8708.
[38]
Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou, "CodeBERT: A pre-trained model for programming and natural languages," in Proceedings of Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Virtual Event, 2020, pp. 1536--1547.
[39]
R. Russell, L. Kim, L. Hamilton, T. Lazovich, J. Harer, O. Ozdemir, P. Ellingwood, and M. McConley, "Automated vulnerability detection in source code using deep representation learning," in Proceedings of the 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA. IEEE, 2018, pp. 757--762.
[40]
U. Alon, M. Zilberstein, O. Levy, and E. Yahav, "code2vec: Learning distributed representations of code," Proc. ACM Program. Lang., vol. 3, no. POPL, pp. 1--29, 2019.
[41]
M. Allamanis, M. Brockschmidt, and M. Khademi, "Learning to represent programs with graphs," in Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 2018.
[42]
N. Puri, P. Gupta, P. Agarwal, S. Verma, and B. Krishnamurthy, "MAGIX: Model agnostic globally interpretable explanations," arXiv preprint arXiv:1706.07160, 2017.
[43]
J. Wang, L. Gou, W. Zhang, H. Yang, and H.-W. Shen, "DeepVID: Deep visual interpretation and diagnosis for image classifiers via knowledge distillation," IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 6, pp. 2168--2180, 2019.
[44]
K. Simonyan, A. Vedaldi, and A. Zisserman, "Deep inside convolutional networks: Visualising image classification models and saliency maps," in Proceedings of the 2nd International Conference on Learning Representations (ICLR), Banff, AB, Canada, 2014.
[45]
S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 2017, pp. 4765--4774.
[46]
P. Schwab and W. Karlen, "CXPlain: Causal explanations for model interpretation under uncertainty," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 2019, pp. 10220--10230.
[47]
D. Smilkov, N. Thorat, B. Kim, F. Viégas, and M. Wattenberg, "SmoothGrad: Removing noise by adding noise," arXiv preprint arXiv:1706.03825, 2017.
[48]
L. Rieger and L. K. Hansen, "A simple defense against adversarial attacks on heatmap explanations," in Proceedings of the 2020 Workshop on Human Interpretability in Machine Learning (WHI), Virtual Event, 2020.
[49]
Z. Wang, H. Wang, S. Ramkumar, P. Mardziel, M. Fredrikson, and A. Datta, "Smoothed geometry for robust attribution," in Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), Virtual Event, 2020, pp. 13623--13634.
[50]
X. Zhao, W. Huang, X. Huang, V. Robu, and D. Flynn, "BayLIME: Bayesian local interpretable model-agnostic explanations," in Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI), Virtual Event, 2021, pp. 887--896.
[51]
Z. Zhou, G. Hooker, and F. Wang, "S-LIME: Stabilized-LIME for model explanation," in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD), Virtual Event, 2021, pp. 2429--2438.

Cited By

  • (2024) Snopy: Bridging Sample Denoising with Causal Graph Learning for Effective Vulnerability Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 606--618. DOI: 10.1145/3691620.3695057. Online publication date: 27-Oct-2024.
  • (2024) Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 493--505. DOI: 10.1145/3691620.3695021. Online publication date: 27-Oct-2024.

Published In

ASE '23: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering
November 2023
2161 pages
ISBN: 9798350329964

In-Cooperation

  • University of Luxembourg
  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. explainable AI
  2. deep learning
  3. code classification
  4. robustness

Qualifiers

  • Research-article

Conference

ASE '23

Acceptance Rates

Overall Acceptance Rate: 82 of 337 submissions (24%)
