
Evolving interpretable neural modularity in free-form multilayer perceptrons through connection costs

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Interpretability fosters trust when humans and artificial intelligence (AI) systems interact, and it is especially valuable for neural networks. In this paper, we analyse the emergence and value of neural modularity as it relates to interpretability. We compare the modularity evolved under different connectivity constraints using network Q-scores and examine the interpretable qualities of the resulting networks with functional subset regression. The connectivity constraints compared here include those proposed in previous research as well as several novel variations formulated to express neuron input competition and per-neuron connection variance. Networks were evolved using HyperNEAT on a free-form substrate. The results indicate that the connection costs successfully promote the evolution of neural modularity across a variety of tasks, and that the novel connection cost variations are competitive with previously explored connection costs. The interpretability assessment shows that while the evolved networks' interpretable qualities are task dependent, two of the compared connection costs deliver statistically different functional module overlap distributions. However, recovered subnetwork module accuracies remain low, highlighting key directions for future research.
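The Q-score referenced above is Newman's modularity measure: the fraction of edges falling within modules minus the fraction expected under a random degree-preserving rewiring. As an illustration only (the graph, partition, and function below are a hypothetical sketch, not the paper's code or HyperNEAT output), modularity Q for an undirected network under a fixed node partition can be computed directly from its definition:

```python
# Illustrative sketch: Newman's modularity Q for an undirected graph,
# Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j),
# where m is the edge count, k_i the degree of node i, and c_i its module.
# Graph and partition below are hypothetical, chosen to be clearly modular.

def modularity_q(edges, partition):
    m = len(edges)                      # number of undirected edges
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    adj = set()                         # symmetric adjacency lookup
    for u, v in edges:
        adj.add((u, v))
        adj.add((v, u))
    two_m = 2.0 * m
    q = 0.0
    nodes = list(degree)
    for i in nodes:
        for j in nodes:
            if partition[i] != partition[j]:
                continue                # delta(c_i, c_j) = 0
            a_ij = 1.0 if (i, j) in adj else 0.0
            q += a_ij - degree[i] * degree[j] / two_m
    return q / two_m

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(round(modularity_q(edges, partition), 3))  # prints 0.357
```

Higher Q indicates that connections concentrate within modules rather than between them; the bridged-triangles example scores well above zero, as a modular network should.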


Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


Author information


Contributions

Both authors made substantial contributions to the conception and design of the work, the interpretation of data, and the writing and revision of the manuscript. AW made substantial contributions to the acquisition and analysis of data. Both authors approved this version of the paper.

Corresponding author

Correspondence to David Vandenheever.

Ethics declarations

Conflict of interest

The authors did not receive support from any organisation for the submitted work. The authors have no competing interests to declare that are relevant to the content of this article.

Ethical and informed consent for data used

The research did not involve human participants or animals. All data were generated by the authors. We complied with ethical standards where applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 8107 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

van der Merwe, A.W., Vandenheever, D. Evolving interpretable neural modularity in free-form multilayer perceptrons through connection costs. Neural Comput & Applic 36, 1459–1476 (2024). https://doi.org/10.1007/s00521-023-09117-4

