Explainable Neural Networks: Achieving Interpretability in Neural Models

  • Survey article
  • Published in: Archives of Computational Methods in Engineering (2024)

Abstract

Data mining is the most widely used method for discovering knowledge from data. Among the many data mining tasks, classification is the one most frequently encountered in application domains such as fraud detection, disease diagnosis, and text classification. Many classification techniques, including Bayesian classifiers, decision trees, genetic algorithms, and neural networks (NNs), are available to help researchers solve problems across domains. NNs are the most frequently used classification approach because they can solve both linearly and non-linearly separable classification problems, achieve high classification accuracy on large datasets, and require comparatively little processing effort. Despite their good classification performance, NNs have a drawback that hinders their applicability in some real-world settings: they are black boxes by nature, meaning their decisions are not transparent or interpretable to humans. This limitation makes NNs unsuitable for applications that demand both high accuracy and transparency in decision-making, such as audit mining or medical diagnosis. The well-known remedy for this inherent disadvantage is to extract explainable decision rules from trained NNs. The extracted rules describe, in a human-readable format, how the network arrives at its decisions. Rule extraction is a well-established technique with a plethora of literature on the subject, yet very few papers have surveyed this literature as their primary goal. The aim of this work is therefore to provide a detailed analysis of the existing literature and to establish a framework within which existing and new researchers can conduct research in this field. The paper examines the state of the art from the perspective of algorithm design frameworks, evaluation criteria, and applications.
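The abstract describes rule extraction only at a conceptual level. As a rough illustration of the pedagogical (black-box) family of approaches surveyed in work of this kind, the sketch below trains a small feed-forward network and then fits a shallow decision tree to the network's own predictions, reading the tree's branches out as IF-THEN rules and measuring their fidelity to the network. This is a minimal sketch using scikit-learn, not an algorithm from the paper; the dataset, network size, and tree depth are arbitrary illustrative choices.

# Illustrative sketch (not from the paper): pedagogical rule extraction,
# where an interpretable surrogate mimics a trained NN's predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# A small benchmark dataset, standing in for e.g. a medical-diagnosis task.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

# 1. Train the "black box": a feed-forward neural network classifier.
nn = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
nn.fit(X_train, y_train)

# 2. Query the network for its predictions and fit an interpretable
#    surrogate (a shallow decision tree) to mimic those predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, nn.predict(X_train))

# 3. Read the tree out as IF-THEN rules and check fidelity: how often the
#    rules reproduce the network's decisions on unseen data.
print(export_text(surrogate, feature_names=list(data.feature_names)))
fidelity = (surrogate.predict(X_test) == nn.predict(X_test)).mean()
print(f"Fidelity to the NN on the test set: {fidelity:.2%}")

In this pedagogical setting the quality of the extracted rules is judged less by raw accuracy than by fidelity, i.e. how faithfully the rule set reproduces the network's behaviour; decompositional approaches, by contrast, derive rules directly from the network's weights and hidden units.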





Author information


Corresponding author

Correspondence to Manomita Chakraborty.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chakraborty, M. Explainable Neural Networks: Achieving Interpretability in Neural Models. Arch Computat Methods Eng 31, 3535–3550 (2024). https://doi.org/10.1007/s11831-024-10089-4

