Abstract
Over the last decade, technology has shaped daily life in ways that could hardly have been imagined in the last century. Safeguarding identity in the digital world is one of the primary concerns of this era, and researchers have turned to biometric security because of its many advantages. Humans can be identified by a range of biometric traits, and voice is one of them. SISU (Speaker Identification from Short Utterances) is a system proposed for identifying humans from very short voice clips. The system uses Mel Frequency Cepstral Coefficient (MFCC) based features. It was tested on a short-utterance phoneme database of 3290 clips, and among the classifiers evaluated, Random Forest gave the highest accuracy, 96.66%.
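As a rough illustration of the MFCC-based feature stage mentioned in the abstract, the following is a minimal sketch in standard-library Python. The abstract does not give the extraction settings, so the frame length (256 samples), hop (128), number of mel filters (20), number of coefficients (13), and the 8 kHz sampling rate are all assumptions. The function names are hypothetical, and averaging the frames into one vector per clip is only one plausible way to summarise a short utterance, not necessarily the authors' method. The classifier stage (Random Forest in the paper) is not shown.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Per-frame MFCCs; all default settings are assumed, not from the paper."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        # Log mel-band energies, floored to avoid log(0) on empty bands.
        log_energy = [
            math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
            for row in bank
        ]
        # DCT-II of the log energies gives the cepstral coefficients.
        features.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_coeffs)
        ])
    return features


def clip_vector(frames):
    """Average the frames into one fixed-length vector (an assumed summary)."""
    return [sum(col) / len(col) for col in zip(*frames)]


# Synthetic stand-in for a short utterance: 1024 samples of a 440 Hz tone.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
frames = mfcc(tone, sample_rate)
vector = clip_vector(frames)
print(len(frames), len(vector))
```

A fixed-length vector of this kind could then be passed to any standard classifier. In practice, a numerical library would replace the naive DFT used here for readability.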
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mukherjee, H., Dutta, M., Obaidullah, S.M., Santosh, K.C., Phadikar, S., Roy, K. (2019). SISU - A Speaker Identification System from Short Utterances. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1035. Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-1_39
DOI: https://doi.org/10.1007/978-981-13-9181-1_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9180-4
Online ISBN: 978-981-13-9181-1
eBook Packages: Computer Science, Computer Science (R0)