Abstract
Deep learning algorithms and deep neural networks (DNNs) have become extremely popular due to their high accuracy in complex tasks, such as image and text classification, speech understanding, document segmentation, credit scoring, and facial recognition. Because of the highly nonlinear structure of deep learning algorithms, these networks are hard to interpret; it is not clear how the models reach their conclusions, and they are therefore often considered black-box models. The poor transparency of these models is a major drawback despite their effectiveness. In addition, recent regulations such as the General Data Protection Regulation (GDPR) require that, in many cases, an explanation be provided whenever a learning model may affect a person’s life. For example, in autonomous vehicle applications, methods for visualizing, explaining, and interpreting deep learning models that analyze driver behavior and the road environment have become standard. Explainable artificial intelligence (XAI) or interpretable machine learning (IML) programs aim to provide a suite of methods and techniques that produce more explainable models while maintaining a high level of prediction accuracy [1–4]. These programs enable human users to better understand, trust, and manage the emerging generation of artificially intelligent systems [4].
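As a concrete illustration of one such post hoc, model-agnostic technique, the short Python sketch below scores a black-box classifier by how much its test accuracy drops when each input feature is randomly permuted, in the spirit of the permutation-based variable-importance measures cited in the reference list (Fisher et al.; Casalicchio et al.). The scikit-learn library, random forest model, and Iris dataset used here are illustrative assumptions and are not prescribed by the chapter.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Load a small tabular dataset and hold out a test split.
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

# Train an accurate but opaque ("black-box") model.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Post hoc, model-agnostic explanation: how much does test accuracy
# drop when each feature is randomly permuted?
result = permutation_importance(model, X_test, y_test,
                                n_repeats=20, random_state=0)
for name, score in zip(data.feature_names, result.importances_mean):
    print(f"{name}: {score:.3f}")

Features whose permutation causes the largest accuracy drop are the ones the model relies on most, yielding a global, human-readable summary of an otherwise opaque model without modifying the model itself.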
References
E. Tjoa and C. Guan. (2019). “A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI.” [Online] https://arxiv.org/pdf/1907.07374.pdf
S. Chakraborty, R. Tomsett, R. Raghavendra, D. Harborne, M. Alzantot, F. Cerutti, M. Srivastava, A. Preece, S. Julier, R. M. Rao, T. D. Kelley, D. Braines, M. Sensoy, C. J. Willis, and P. Gurram. (2017). “Interpretability of Deep Learning Models: A Survey of Results.” [Online] https://orca.cf.ac.uk/101500/1/Interpretability%20of%20Deep%20Learning%20Models%20-%20A%20Survey%20of%20Results.pdf
A. Adadi and M. Berrada. (2018). “Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI).” [Online] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8466590
F. K. Došilović, M. Brcic, and N. Hlupic. (2018). “Explainable Artificial Intelligence: A Survey.” [Online] https://www.researchgate.net/publication/325398586_Explainable_Artificial_Intelligence_A_Survey
J. M. Schoenborn and K.-D. Althoff. (2019). “Recent Trends in XAI: A Broad Overview on Current Approaches, Methodologies and Interactions.” [Online] http://gaia.fdi.ucm.es/events/xcbr/papers/XCBR-19_paper_1.pdf
B. Letham, C. Rudin, T. H. McCormick, and D. Madigan, “Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model,” Ann. Appl. Statist., vol. 9, no. 3, pp. 1350–1371, 2015. [Online] https://arxiv.org/pdf/1511.01644.pdf
R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” in Proc. 21st ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2015, pp. 1721–1730. [Online] http://people.dbmi.columbia.edu/noemie/papers/15kdd.pdf
K. Xu et al., “Show, attend and tell: Neural image caption generation with visual attention,” in Proc. Int. Conf. Mach. Learn. (ICML), 2015, pp. 1–10. [Online] https://arxiv.org/pdf/1502.03044.pdf
Z. C. Lipton, “The mythos of model interpretability,” in Proc. ICML Workshop Hum. Interpretability Mach. Learn., 2016, pp. 96–100. [Online] https://arxiv.org/pdf/1606.03490.pdf
C. Yang, A. Rangarajan, and S. Ranka. (2018). “Global model interpretation via recursive partitioning.” [Online] https://arxiv.org/pdf/1802.04253.pdf
M. A. Valenzuela-Escárcega, A. Nagesh, and M. Surdeanu. (2018). “Lightly-supervised representation learning with global interpretability.” [Online] https://arxiv.org/pdf/1805.11545.pdf
A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune, “Synthesizing the preferred inputs for neurons in neural networks via deep generator networks,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2016, pp. 3387–3395. [Online] https://arxiv.org/pdf/1605.09304.pdf
M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why should I trust you?’: Explaining the predictions of any classifier,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2016, pp. 1135–1144. [Online] https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf
M. T. Ribeiro, S. Singh, and C. Guestrin, “Anchors: High-precision model-agnostic explanations,” in Proc. AAAI Conf. Artif. Intell., 2018, pp. 1–9. [Online] https://homes.cs.washington.edu/~marcotcr/aaai18.pdf
K. Simonyan, A. Vedaldi, and A. Zisserman. (2013). “Deep inside convolutional networks: Visualising image classification models and saliency maps.” [Online] https://arxiv.org/pdf/1312.6034.pdf
M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in Proc. Eur. Conf. Comput. Vis. Zurich, Switzerland: Springer, 2014, pp. 818–833. [Online] https://cs.nyu.edu/~fergus/papers/zeilerECCV2014.pdf
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning deep features for discriminative localization,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., June 2016, pp. 2921–2929. [Online] https://arxiv.org/abs/1512.04150
M. Sundararajan, A. Taly, and Q. Yan. (2017). “Axiomatic attribution for deep networks.” [Online] https://arxiv.org/pdf/1703.01365.pdf
D. Smilkov, N. Thorat, B. Kim, F. Viégas, and M. Wattenberg. (2017). “SmoothGrad: Removing noise by adding noise.” [Online] https://arxiv.org/pdf/1706.03825.pdf
S. M. Lundberg and S. I. Lee, “A unified approach to interpreting model predictions,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 4768–4777. [Online] https://arxiv.org/pdf/1705.07874.pdf
R. Guidotti, A. Monreale, S. Ruggieri, D. Pedreschi, F. Turini, and F. Giannotti. (2018). “Local rule-based explanations of black box decision systems.” [Online] https://arxiv.org/pdf/1805.10820.pdf
D. Linsley, D. Scheibler, S. Eberhardt, and T. Serre. (2018). “Global-and-local attention networks for visual recognition.” [Online] https://arxiv.org/pdf/1805.08819.pdf
O. Bastani, C. Kim, and H. Bastani. (2017). “Interpretability via model extraction.” [Online] https://arxiv.org/pdf/1706.09773.pdf
J. J. Thiagarajan, B. Kailkhura, P. Sattigeri, and K. N. Ramamurthy. (2016). “TreeView: Peeking into deep neural networks via feature-space partitioning.” [Online] https://arxiv.org/pdf/1611.07429.pdf
D. P. Green and H. L. Kern, “Modeling heterogeneous treatment effects in large-scale experiments using Bayesian additive regression trees,” in Proc. Annu. Summer Meeting Soc. Political Methodol., 2010, pp. 1–40. [Online] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.190.3826&rep=rep1&type=pdf
J. Elith, J. Leathwick, and T. Hastie, “A working guide to boosted regression trees,” J. Animal Ecol., vol. 77, no. 4, pp. 802–813, 2008. [Online] https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.1365-2656.2008.01390.x
S. H. Welling, H. H. F. Refsgaard, P. B. Brockhoff, and L. H. Clemmensen. (2016). “Forest floor visualizations of random forests.” [Online] https://arxiv.org/pdf/1605.09196.pdf
A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin, “Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation,” J. Comput. Graph. Statist., vol. 24, no. 1, pp. 44–65, 2015. [Online] https://www.tandfonline.com/doi/abs/10.1080/10618600.2014.907095
G. Casalicchio, C. Molnar, and B. Bischl. (2018). “Visualizing the feature importance for black box models.” [Online] https://arxiv.org/pdf/1804.06620.pdf
U. Johansson, R. König, and L. Niklasson, “The truth is in there—Rule extraction from opaque models using genetic programming,” in Proc. FLAIRS Conf., 2004, pp. 658–663. [Online] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.470.4124&rep=rep1&type=pdf
T. Hailesilassie. (2017). “Rule extraction algorithm for deep neural networks: A review.” [Online] https://arxiv.org/abs/1610.05267
P. Sadowski, J. Collado, D. Whiteson, and P. Baldi, “Deep learning, dark knowledge, and dark matter,” in Proc. NIPS Workshop High-Energy Phys. Mach. Learn. (PMLR), vol. 42, 2015, pp. 81–87. [Online] http://proceedings.mlr.press/v42/sado14.pdf
S. Tan, R. Caruana, G. Hooker, and Y. Lou. (2018). “Detecting bias in black-box models using transparent model distillation.” [Online] https://arxiv.org/abs/1710.06169
Z. Che, S. Purushotham, R. Khemani, and Y. Liu. (2015). “Distilling knowledge from deep networks with applications to healthcare domain.” [Online] https://arxiv.org/abs/1512.03542
Y. Zhang and B. Wallace. (2016). “A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification.” [Online] https://arxiv.org/abs/1510.03820
P. Cortez and M. J. Embrechts, “Using sensitivity analysis and visualization techniques to open black box data mining models,” Inf. Sci., vol. 225, pp. 1–17, Mar. 2013. [Online] https://core.ac.uk/download/pdf/55616214.pdf
S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek, “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation,” PLoS ONE, vol. 10, no. 7, p. e0130140, 2015. [Online] https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140
A. Fisher, C. Rudin, and F. Dominici. (2018). “Model class reliance: Variable importance measures for any machine learning model class, from the ‘rashomon’ perspective.” [Online] https://arxiv.org/abs/1801.01489
B. Kim, R. Khanna, and O. O. Koyejo, “Examples are not enough, learn to criticize! criticism for interpretability,” in Proc. 29th Conf. Neural Inf. Process. Syst. (NIPS), 2016, pp. 2280–2288 [Online] https://papers.nips.cc/paper/6300-examples-are-not-enough-learn-to-criticize-criticism-for-interpretability.pdf
X. Yuan, P. He, Q. Zhu, and X. Li. (2017). “Adversarial examples: Attacks and defenses for deep learning.” [Online] https://arxiv.org/abs/1712.07107
Copyright information
© 2023 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Notovich, A., Chalutz-Ben Gal, H., Ben-Gal, I. (2023). Explainable Artificial Intelligence (XAI): Motivation, Terminology, and Taxonomy. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_41
DOI: https://doi.org/10.1007/978-3-031-24628-9_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24627-2
Online ISBN: 978-3-031-24628-9
eBook Packages: Mathematics and Statistics (R0)