Abstract
With advances in voice signal processing, speech-to-text recognition has become an important area of research. Although some work exists for English, efforts for regional languages such as Bengali, Hindi and Gujarati are rare or have not yet started. The objective of this work is therefore to develop a method for identifying isolated Bengali letters (Swarabarna and Banjanbarna) from uttered sound. In speech processing, identifying an uttered letter consists of two major steps: speech feature extraction and feature classification. Studies show that Mel Frequency Cepstral Coefficients (MFCC) give a better representation of the human auditory system, but their performance degrades as noise increases; this degradation may be reduced by applying the Discrete Wavelet Transform (DWT). MFCC combined with DWT is therefore used as the feature in this work, termed the Mel Frequency Wavelet Transform Coefficient (MFWTC). For the experiments, a sound database was developed from utterances of 43 Bengali letters (11 Swarabarna and 32 Banjanbarna) by ten speakers, 20 times for each letter. The signals are pre-processed by removing the silent portions at both ends and then applying a pre-emphasis filter. Next, MFCC features are extracted from the pre-processed signals and refined by applying DWT to compute the MFWTC features. In addition, Zero Crossing Count (ZCC) is used independently as a feature for comparison with this method. Finally, these features are used to recognize the Bengali Barnas with different classifiers (BayesNet, NaiveBayes, IB1, LWL, Classification Via Clustering, Dagging, Multi Scheme, VFI, Conjunctive Rule, ZeroR, BFTree and Simple Cart) available in the Weka toolkit. Classification accuracy is measured using 10-fold cross-validation, giving averages of 47.61% and 62.19% for Swarabarna and Banjanbarna, respectively.
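The pre-processing and feature steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the silence threshold (0.02), the pre-emphasis coefficient (0.97), the choice of the Haar wavelet and all function names are assumptions made here for illustration, since the abstract does not specify them. MFCC extraction itself is omitted for brevity; in the described pipeline the DWT would be applied to the MFCC vector rather than to the toy values used below.

```python
import math


def trim_silence(signal, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below threshold.

    The threshold value is an assumption, not taken from the paper.
    """
    voiced = [i for i, s in enumerate(signal) if abs(s) >= threshold]
    if not voiced:
        return []
    return signal[voiced[0]:voiced[-1] + 1]


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here.
    """
    if not signal:
        return []
    rest = [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]
    return [signal[0]] + rest


def zero_crossing_count(signal):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))


def haar_dwt(values):
    """Single-level Haar DWT returning (approximation, detail) coefficients.

    The Haar wavelet is an assumption; the abstract names no wavelet family.
    """
    if len(values) % 2:
        values = values + [values[-1]]  # pad odd-length input with its last value
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


# Toy signal with near-silent samples at both ends.
raw = [0.0, 0.001, 0.5, -0.4, 0.3, -0.2, 0.005, 0.0]
trimmed = trim_silence(raw)             # [0.5, -0.4, 0.3, -0.2]
emphasized = pre_emphasis(trimmed)
zcc = zero_crossing_count(trimmed)      # 3
approx, detail = haar_dwt([1.0, 3.0, 2.0, 2.0])  # stand-in for an MFCC vector
```

The approximation coefficients keep the coarse shape of the input vector while the detail coefficients capture local differences. Which of these (or which decomposition level) feeds the classifiers is not stated in the abstract.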
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Phadikar, S., Das, P., Bhakta, I., Roy, A., Midya, S., Majumder, K. (2017). Bengali Phonetics Identification Using Wavelet Based Signal Feature. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 775. Springer, Singapore. https://doi.org/10.1007/978-981-10-6427-2_21
DOI: https://doi.org/10.1007/978-981-10-6427-2_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6426-5
Online ISBN: 978-981-10-6427-2
eBook Packages: Computer Science (R0)