Abstract
Voice disorders are associated with irregular vibrations of the vocal folds. Based on the source-filter theory of speech production, these irregular vibrations can be detected non-invasively by analyzing the speech signal. In this paper we present a multiband approach for the detection of voice disorders, given that the voice source generally interacts with the vocal tract in a non-linear way. In normal phonation, and assuming sustained phonation of a vowel, the lower frequencies of speech are heavily source dependent due to the low-frequency glottal formant, while the higher frequencies are less dependent on the source signal. During abnormal phonation this still holds, but turbulent noise at the source, caused by the irregular vibration, also affects the higher frequencies. Motivated by this model, we propose a multiband approach based on a three-level discrete wavelet transformation (DWT), in which the fractal dimension (FD) of the estimated power spectrum is computed in each band. The experiments suggest that the frequency band 1–1562 Hz, which contains the lower frequencies after level 3, exhibits a significant difference between the spectra of normal and pathological subjects. With this band, a detection rate of 91.28 % is obtained using a single feature, which is higher than that of any other frequency band. Moreover, an accuracy of 92.45 % and an area under the receiver operating characteristic curve (AUC) of 95.06 % are obtained when the FDs of all levels are fused. Likewise, when the FDs of all levels are combined with 22 Multi-Dimensional Voice Program (MDVP) parameters, an improvement of 2.26 % in accuracy and 1.45 % in AUC is observed.
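As a rough illustration of the multiband idea described in the abstract, the following Python sketch splits a signal into dyadic bands with a three-level DWT and computes one fractal dimension per band from a periodogram. Several choices are assumptions that the abstract does not state: the Haar wavelet, the naive periodogram, the Katz estimator for the FD, and a sampling rate of 25 kHz (chosen because it places the level-3 approximation band at 0 to 1562.5 Hz, consistent with the reported band). The synthetic test signal and all function names are hypothetical.

```python
import cmath
import math
import random


def dwt_band_edges(fs, levels):
    """Nominal (label, f_lo, f_hi) of each sub-band of a dyadic DWT."""
    bands = []
    hi = fs / 2.0  # Nyquist frequency
    for level in range(1, levels + 1):
        mid = hi / 2.0
        bands.append((f"D{level}", mid, hi))  # detail band of this level
        hi = mid
    bands.append((f"A{levels}", 0.0, hi))  # final approximation band
    return bands


def haar_dwt(x, levels):
    """Haar DWT (assumed wavelet). Returns ([D1..Dn], An) coefficient lists."""
    approx = list(x)
    details = []
    s = math.sqrt(2.0)
    for _ in range(levels):
        n = len(approx) - len(approx) % 2  # drop a trailing odd sample
        pairs = [(approx[i], approx[i + 1]) for i in range(0, n, 2)]
        details.append([(a - b) / s for a, b in pairs])
        approx = [(a + b) / s for a, b in pairs]
    return details, approx


def power_spectrum(x):
    """One-sided periodogram via a naive O(N^2) DFT (fine for short inputs)."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def katz_fd(y):
    """Katz fractal dimension of a curve sampled at unit spacing (assumed estimator)."""
    n = len(y) - 1  # number of steps along the curve
    length = sum(math.hypot(1.0, y[i + 1] - y[i]) for i in range(n))
    extent = max(math.hypot(i, y[i] - y[0]) for i in range(1, n + 1))
    return math.log10(n) / (math.log10(n) + math.log10(extent / length))


FS = 25000.0  # assumed sampling rate in Hz; not stated in the abstract
bands = dwt_band_edges(FS, 3)  # last entry is ("A3", 0.0, 1562.5)

# Hypothetical test signal: a 150 Hz tone plus weak Gaussian noise.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 150.0 * t / FS) + 0.05 * rng.gauss(0.0, 1.0)
          for t in range(1024)]

details, approx = haar_dwt(signal, 3)
subbands = details + [approx]  # same order as `bands`: D1, D2, D3, A3

# One FD feature per band; fusing all levels gives a 4-element feature vector.
features = {label: katz_fd(power_spectrum(coeffs))
            for (label, _, _), coeffs in zip(bands, subbands)}
```

The resulting `features` dictionary holds one value per band, which mirrors the single-band and fused-band settings compared in the abstract. In practice a wavelet library and a smoothed spectral estimate would likely replace the hand-written Haar transform and naive periodogram used here.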
Acknowledgments
This project was funded by the National Plan for Science, Technology, and Innovation (MAARIFAH), King Abdulaziz City for Science and Technology, Kingdom of Saudi Arabia, Award Number (12-MED-2474-02).
Additional information
This article is part of the Topical Collection on Patient Facing Systems
Appendix
The list of files that do not contain MDVP parameters in the MEEI database:
- JFG08AN.RES
- KXH30AN.RES
- LES15AN.RES
- TAB21AN.RES
- WPB30AN.RES
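When FD features are combined with MDVP parameters, the files listed above have to be skipped. A minimal sketch of such a filter follows; the helper name `has_mdvp` and the candidate listing (including the invented entry `AAA01AN.RES`) are hypothetical and serve only to show the idea.

```python
# Files reported in the appendix as lacking MDVP parameters.
MISSING_MDVP = {
    "JFG08AN.RES",
    "KXH30AN.RES",
    "LES15AN.RES",
    "TAB21AN.RES",
    "WPB30AN.RES",
}


def has_mdvp(filename):
    """True if the file is not on the exclusion list (case-insensitive match)."""
    return filename.upper() not in MISSING_MDVP


# Hypothetical directory listing; AAA01AN.RES is an invented placeholder name.
candidates = ["JFG08AN.RES", "AAA01AN.RES", "wpb30an.res"]
usable = [name for name in candidates if has_mdvp(name)]
```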
Cite this article
Ali, Z., Elamvazuthi, I., Alsulaiman, M. et al. Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals. J Med Syst 40, 20 (2016). https://doi.org/10.1007/s10916-015-0392-2
DOI: https://doi.org/10.1007/s10916-015-0392-2