Abstract
Deep neural networks are successfully used for object and face recognition in images and videos. However, the current procedures are only suitable to a limited extent for practical applications, for example as a pain recognition tool in hospitals. The advantage of deep neural methods is that they can learn complex non-linear relationships between raw data and target classes without being limited to a set of hand-crafted features provided by humans. The disadvantage is that, due to the complexity of these networks, it is not possible to interpret the knowledge that is stored inside the network: they are black-box learning procedures. Explainable Artificial Intelligence (AI) approaches mitigate this problem by extracting explanations for decisions and representing them in a human-interpretable form. The aim of this paper is to investigate the explainable AI methods Layer-wise Relevance Propagation (LRP) and Local Interpretable Model-agnostic Explanations (LIME). These approaches are applied to explain how a deep neural network distinguishes facial expressions of pain from facial expressions of emotions such as happiness and disgust.
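To illustrate the core idea behind LRP, the following toy sketch applies the standard ε-rule to a small fully-connected ReLU network: the relevance of each output is redistributed backwards to the inputs in proportion to their contributions. This is a generic illustration only, not the network or implementation used in the paper; all weights and inputs are made-up values.

```python
import numpy as np

def lrp_epsilon(weights, activations, relevance_out, eps=1e-6):
    """Propagate relevance one layer backwards with the LRP epsilon-rule:
    R_j = a_j * sum_k w_jk * R_k / (z_k + eps * sign(z_k)),
    where z_k = sum_j a_j * w_jk."""
    z = activations @ weights                   # pre-activations z_k
    s = relevance_out / (z + eps * np.sign(z))  # stabilized relevance ratio
    c = s @ weights.T                           # redistribute to inputs
    return activations * c                      # element-wise: R_j

# Made-up two-layer ReLU network (4 inputs, 3 hidden units, 2 classes).
W1 = np.array([[ 1.0, -0.5,  0.2],
               [ 0.3,  0.8, -0.4],
               [-0.6,  0.1,  0.9],
               [ 0.5, -0.2,  0.3]])
W2 = np.array([[ 0.7, -0.3],
               [-0.2,  0.5],
               [ 0.4,  0.1]])

x = np.array([1.0, 2.0, -1.0, 0.5])
h = np.maximum(0.0, x @ W1)   # hidden ReLU activations
y = h @ W2                    # class scores

# Explain the winning class: its score is the initial relevance.
k = int(np.argmax(y))
R_out = np.zeros_like(y)
R_out[k] = y[k]

R_hidden = lrp_epsilon(W2, h, R_out)
R_input = lrp_epsilon(W1, x, R_hidden)

# The epsilon-rule approximately conserves total relevance per layer,
# so R_input.sum() matches the explained score y[k].
print(R_input, R_input.sum(), y[k])
```

In the image setting of the paper, the same backward pass runs through the convolutional layers, and the input-layer relevances form the pixel-wise heatmap.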
Zusammenfassung
Deep neural networks are successfully used for object and face recognition in images and videos. However, the current approaches can only be used in practice to a limited extent, for example for pain recognition. The advantage of deep learning methods is that they are able to learn complex, non-linear relationships between raw data and target classes without relying on features hand-crafted by humans. The disadvantage of these networks is that they are very complex, making it hard for humans to understand why the network arrived at its decision. Such networks are therefore also referred to as black boxes. Explainable Artificial Intelligence (AI) methods address this problem by generating explanations for decisions and presenting them to humans in an interpretable form. The aim of this article is to use the explainable AI methods Layer-wise Relevance Propagation (LRP) and Local Interpretable Model-agnostic Explanations (LIME) to explain the decisions of a deep neural network that distinguishes painful facial expressions from faces expressing happiness and disgust.
About the authors
Katharina Weitz received a Master of Science in Psychology and a Master of Science in Computing in the Humanities (Applied Computer Science) from the University of Bamberg, Germany. She is currently working at the Chair for Human-Centered Multimedia at the University of Augsburg. She is interested in machine learning topics in the field of social robotics and virtual agents. The influence of the explainability and transparency of intelligent systems on people's trust is a central point of her research. She supports a human-centered use of artificial intelligence and delves into ethical issues. In addition to her research activities, communicating research findings to the general public in the form of lectures, workshops, and exhibitions is an important concern of hers.
Teena Hassan received her Bachelor of Technology degree in Computer Science and Engineering from Cochin University of Science and Technology in Kerala, India, in 2006. After graduation, she worked as a Project Engineer in the Telecom/Datacom domain. In 2014, she received her Master of Science degree in Autonomous Systems from the Bonn-Rhein-Sieg University of Applied Sciences in Sankt Augustin. She then joined Fraunhofer IIS in Erlangen, where she conducted research on the automatic analysis of facial action units, with a special focus on modeling facial muscle motions, fusing multiple sources of facial expression information, and modeling measurement uncertainty. Her research interests include facial expression analysis, sensor noise modeling, sensor fusion, and state estimation. She is currently a Research Associate at Bielefeld University, conducting research on interaction architectures for social robots.
Ute Schmid holds a diploma in psychology and a diploma in computer science, both from Technical University Berlin (TUB), Germany. She received her doctoral degree (Dr. rer. nat.) in computer science from TUB in 1994 and her habilitation in computer science in 2002. From 1994 to 2001 she was an assistant professor (wissenschaftliche Assistentin) in the AI/Machine Learning group at the Department of Computer Science, TUB. Afterwards she worked as a lecturer (akademische Rätin) for Intelligent Systems at the Department of Mathematics and Computer Science of the University of Osnabrück. Since 2004 she has held a professorship in Applied Computer Science/Cognitive Systems at the University of Bamberg. Ute Schmid's research interests lie mainly in the domains of comprehensible machine learning, explainable AI, and high-level learning on relational data, especially inductive programming, knowledge-level learning from planning, learning structural prototypes, and analogical problem solving and learning. Further research concerns various applications of machine learning (e.g., classifier learning from medical data and for facial expressions) and empirical and experimental work on high-level cognitive processes. Ute Schmid dedicates a significant amount of her time to measures supporting women in computer science and to promoting computer science as a topic in elementary, primary, and secondary education.
Dr. Garbas received the Dipl.-Ing. and Dr.-Ing. (summa cum laude) degrees in electrical engineering from Friedrich-Alexander University Erlangen-Nuremberg, Germany, in 2004 and 2010, respectively. In 2010 he joined the Fraunhofer Institute for Integrated Circuits IIS, where he was appointed head of the Intelligent Systems group in 2011 and deputy head of the Electronic Imaging department in 2012. He is responsible for industrial and public research projects as well as software licensing in the areas of real-time computer vision, affective computing, and facial analysis.
© 2019 Walter de Gruyter GmbH, Berlin/Boston