Abstract
The recent pandemic crisis, combined with the explosive growth of Artificial Intelligence (AI) algorithms, has highlighted the potential benefits of telemedicine for decentralised, accurate and automated clinical diagnoses. One of the most common and essential diagnostic procedures is auscultation: a non-invasive, real-time and highly informative technique for assessing the state of the respiratory system. For automated auscultation analysis to be trusted in the clinical domain, the decision-making of complex models (such as Deep Learning models) must be explainable. In this context, we analyse the behaviour of a Convolutional Neural Network (CNN) in classifying the largest publicly available database of respiratory sounds, originally compiled to support the scientific challenge organised at the International Conference on Biomedical and Health Informatics (ICBHI17). It contains auscultation recordings of normal respiratory cycles and of cycles containing crackles, wheezes, or both. To capture the phonetically important features of breath sounds, the Mel-Frequency Cepstrum (MFC) was applied as a short-term power spectrum representation. The MFC allowed us to extract latent features without losing temporal information, so that each feature could be traced back to its position in the original sound. The MFCs were used as input to the proposed CNN, which classified the four above-mentioned respiratory classes with an accuracy of 72.8%. Beyond this result, the main focus of the present study was to investigate how the CNN achieved this classification. To this end, the explainable Artificial Intelligence (xAI) technique of Gradient-weighted Class Activation Mapping (Grad-CAM) was applied. Grad-CAM made it possible to visually identify the input regions most relevant to each decision, especially for the recognition of abnormal sounds, which is crucial for verifying that the CNN has learned correctly.
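As a rough illustration of the feature-extraction step described above, the following is a minimal sketch of computing MFCCs with librosa, the audio library the authors use. The file path, sampling rate and number of coefficients are assumptions for illustration; the abstract does not specify the paper's exact settings.

```python
# Minimal MFCC extraction sketch; wav_path, sr and n_mfcc are assumed values,
# not the authors' actual configuration.
import librosa


def extract_mfcc(wav_path, sr=4000, n_mfcc=13):
    """Return the MFCC matrix of one respiratory-cycle recording.

    The result keeps the temporal axis (one column per analysis frame),
    so each feature can be traced back to its position in the sound.
    """
    y, sr = librosa.load(wav_path, sr=sr)             # resample to a common rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Per-coefficient standardisation, a common (assumed) preprocessing step
    mu = mfcc.mean(axis=1, keepdims=True)
    sigma = mfcc.std(axis=1, keepdims=True) + 1e-8
    return (mfcc - mu) / sigma                        # shape: (n_mfcc, n_frames)
```

Similarly, the Grad-CAM step can be sketched in a few lines. The snippet below assumes a trained Keras CNN whose last convolutional layer is named "last_conv"; both the framework choice and the layer name are assumptions, since the abstract does not describe the implementation details.

```python
# Grad-CAM sketch for a Keras CNN; `conv_layer_name` is a hypothetical name.
import numpy as np
import tensorflow as tf


def grad_cam(model, x, class_idx, conv_layer_name="last_conv"):
    """Return a class-activation heatmap over the MFCC time-frequency plane."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x[np.newaxis, ...])
        score = preds[:, class_idx]                  # score of the target class
    grads = tape.gradient(score, conv_out)           # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))     # global-average-pooled grads
    cam = tf.reduce_sum(weights[:, tf.newaxis, tf.newaxis, :] * conv_out, axis=-1)
    cam = tf.nn.relu(cam)[0]                         # keep positive evidence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
```

Upsampled to the input resolution, such a heatmap highlights which time-frequency regions of the MFCC representation drove the prediction, which is how the regions associated with abnormal sounds can be inspected visually.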
Acknowledgments
This work was supported in part by the "iCARE" project (CUP J39J14001400007), action 10.5.12, funded within the POR FESR FSE 2014/2020 of the Calabria Region with the participation of European Community resources (FESR and FSE), of Italy and of Calabria.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lo Giudice, M., Mammone, N., Ieracitano, C., Aguglia, U., Mandic, D., Morabito, F.C. (2022). Explainable Deep Learning Classification of Respiratory Sound for Telemedicine Applications. In: Mahmud, M., Ieracitano, C., Kaiser, M.S., Mammone, N., Morabito, F.C. (eds.) Applied Intelligence and Informatics. AII 2022. Communications in Computer and Information Science, vol. 1724. Springer, Cham. https://doi.org/10.1007/978-3-031-24801-6_28
DOI: https://doi.org/10.1007/978-3-031-24801-6_28
Publisher Name: Springer, Cham
Online ISBN: 978-3-031-24801-6
eBook Packages: Computer Science, Computer Science (R0)