Abstract
Interpretability fosters trust when humans and artificial intelligence (AI) systems interact, and it is particularly valuable for neural networks. In this paper, we analyse the emergence and value of neural modularity as it relates to interpretability. We compare the modularity evolved under different connectivity constraints, measured by network Q-scores, and examine the interpretable qualities of the resultant networks with functional subset regression. The connectivity constraints compared here include those proposed in previous research as well as several novel variations formulated to express competition between neuron inputs and variance in the number of connections per neuron. Networks were evolved using HyperNEAT on a free-form substrate. The results indicate that the connection costs successfully promote the evolution of neural modularity across a variety of tasks, and that the novel connection cost variations are competitive with previously explored connection costs. The interpretability assessment shows that, while the evolved networks' interpretable qualities are task dependent, two of the compared connection costs yield statistically distinct functional module overlap distributions. However, the accuracies of the recovered subnetwork modules remain low, highlighting key directions for future research.
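For readers unfamiliar with the metric, the Q-score referenced above is the standard network modularity measure: high Q means connections concentrate within modules rather than between them. The following minimal sketch, assuming Python with the networkx library, illustrates how such a score can be computed for a network's connectivity graph; it is not the authors' implementation, which analyses directed networks, and the edge format shown is hypothetical.

# Illustrative only: estimating a modularity Q-score for a network's
# connectivity graph. Assumes an undirected graph with positive edge
# weights; the paper's analysis uses a directed variant of Q.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity


def q_score(edges):
    """Return modularity Q for the best partition found greedily.

    `edges` is an iterable of (source, target, weight) tuples
    describing the network's connections.
    """
    g = nx.Graph()
    g.add_weighted_edges_from(edges)
    partition = greedy_modularity_communities(g, weight="weight")
    return modularity(g, partition, weight="weight")


# Toy network: two densely connected clusters joined by one weak edge,
# which yields a high Q (a clearly modular structure).
cluster_a = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
cluster_b = [(3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0)]
bridge = [(2, 3, 0.1)]
print(q_score(cluster_a + cluster_b + bridge))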
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Author information
Contributions
Both authors made substantial contributions to the conception or design of the work, the interpretation of the data, and the writing and revision of the manuscript. AW additionally made substantial contributions to the acquisition and analysis of the data. Both authors approved this version of the paper.
Ethics declarations
Conflict of interest
The authors did not receive support from any organisation for the submitted work. The authors have no competing interests to declare that are relevant to the content of this article.
Ethical and informed consent for data used
The research did not involve human participants or animals. All data were generated by the authors. We complied with ethical standards where applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The electronic supplementary material for this article is available online.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
van der Merwe, A.W., Vandenheever, D. Evolving interpretable neural modularity in free-form multilayer perceptrons through connection costs. Neural Comput & Applic 36, 1459–1476 (2024). https://doi.org/10.1007/s00521-023-09117-4