Abstract
Given the widespread application of machine learning (ML) techniques and the intrinsic black-box nature of ML models, the need for explanations that are both sufficient and necessary for locally interpreting a model's prediction is well recognized. Existing explanation approaches, however, favor either sufficiency or necessity. To fill this gap, we propose DDImage, an approach that automatically produces post-hoc local explanations for ML-based image classifiers, where the explanations are both sufficient and necessary. The core idea behind DDImage is to discover an appropriate explanation by debugging the given input image through a series of image reductions, guided by the sufficiency and necessity properties. An evaluation of DDImage on publicly available datasets with popular classification models demonstrates its effectiveness and efficiency. Compared with three state-of-the-art approaches, DDImage produces smaller explanations that preserve both sufficiency and necessity, while also showing promising stability and efficiency. We further identify the impact of segmentation granularity, reveal the performance variance across different target models, and show that our approach is applicable across different problem domains.
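To illustrate the reduction-based idea in the abstract, the following is a minimal Python sketch, not DDImage's actual algorithm (which builds on delta debugging and differs in detail). It assumes the image has already been segmented into superpixels (e.g., via SLIC), that `predict` is a callable wrapping the black-box classifier, and that masking with a neutral fill value and a greedy one-segment-at-a-time loop are acceptable simplifications; all names here are illustrative.

```python
import numpy as np

def reduce_image(image, segments, predict, target_label, fill=127):
    """Greedy reduction sketch: drop superpixel segments as long as the
    classifier still predicts target_label on the masked image."""

    def masked(keep):
        # Replace every pixel outside the kept segments with a neutral color.
        out = np.full_like(image, fill)
        mask = np.isin(segments, list(keep))
        out[mask] = image[mask]
        return out

    kept = set(np.unique(segments))
    changed = True
    while changed:
        changed = False
        for seg in sorted(kept):
            if seg not in kept:
                continue
            trial = kept - {seg}
            # Sufficiency: the remaining segments alone must still yield
            # the original prediction.
            if trial and predict(masked(trial)) == target_label:
                kept, changed = trial, True

    # Necessity: removing the explanation from the image should change
    # the prediction.
    rest = set(np.unique(segments)) - kept
    necessary = predict(masked(rest)) != target_label
    return kept, necessary
```

A faithful implementation would replace the one-at-a-time loop with a ddmin-style partitioning of the segment set, which is what makes delta-debugging-based reduction efficient on large inputs.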
Data Availability
The source code of our implementation and the results of experiments are publicly available in the following GitHub repository: https://github.com/ymxl85/DDImage.
Acknowledgements
We sincerely appreciate the positive acknowledgment and valuable suggestions from the anonymous reviewers of our conference paper. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61802349, 62302035, 62132014, and 61972359), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY20F020021), and the Zhejiang Provincial Key Research and Development Program of China (No. 2022C01045).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Nicole Novielli, Xin Xia, and Tao Zhang
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Software Analysis, Evolution and Reengineering (SANER)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jiang, M., Tang, C., Zhang, XY. et al. DDImage: an image reduction based approach for automatically explaining black-box classifiers. Empir Software Eng 29, 129 (2024). https://doi.org/10.1007/s10664-024-10505-0