SISU - A Speaker Identification System from Short Utterances

Himadri Mukherjee, Moumita Dutta, Sk Md. Obaidullah, K. C. Santosh, Santanu Phadikar & Kaushik Roy

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1035)

Included in the following conference series:

International Conference on Recent Trends in Image Processing and Pattern Recognition

Abstract

Over the last decade, technology has reshaped daily life in ways that could hardly have been imagined in the last century. Safeguarding identity in the digital world has become a primary concern of this era, and researchers have turned to biometric security because of its many advantages. Humans can be identified through many biometric traits, and voice is one of them. SISU (Speaker Identification from Short Utterances) is a system proposed for identifying speakers from very short voice clips. The system uses Mel Frequency Cepstral Coefficient (MFCC) based features. It was tested on a short-utterance phoneme database of 3290 clips, and the highest accuracy among the classifiers evaluated, 96.66%, was obtained with Random Forest.
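
As an illustration of the kind of MFCC front end the abstract refers to, the following is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the sampling rate, frame length, hop size, number of mel filters, or number of coefficients, so every parameter value below is an assumption, as are the function names and the choice to average frames into one vector per clip. The classifier stage (Random Forest in the paper) is not shown.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * n_bins
        for k in range(left, center):       # rising slope
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling slope
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame (all defaults are assumed values)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log mel-band energies, floored to avoid log(0).
        log_e = [math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-10))
                 for filt in bank]
        # Unnormalized DCT-II of the log energies gives the cepstral coefficients.
        features.append([
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return features

# Synthetic 0.1 s, 440 Hz tone standing in for a short utterance (not real speech data).
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
frames = mfcc(tone)
# Assumed pooling step: average over frames to get one fixed-length vector per clip.
clip_vector = [sum(col) / len(col) for col in zip(*frames)]
```

A fixed-length vector such as `clip_vector` is the sort of input a classifier like Random Forest expects. How the paper actually pools or orders frame-level features is not stated in the abstract.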

Author information

Authors and Affiliations

Department of Computer Science, West Bengal State University, Kolkata, India
Himadri Mukherjee, Moumita Dutta & Kaushik Roy
Department of Computer Science and Engineering, Aliah University, Kolkata, India
Sk Md. Obaidullah
Department of Computer Science, The University of South Dakota, Vermillion, SD, USA
K. C. Santosh
Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, India
Santanu Phadikar

Corresponding author

Correspondence to Himadri Mukherjee.

Editor information

Editors and Affiliations

Department of Computer Science, University of South Dakota, Vermillion, SD, USA
K. C. Santosh
Solapur University, Solapur, India
Ravindra S. Hegadi

About this paper

Cite this paper

Mukherjee, H., Dutta, M., Obaidullah, S.M., Santosh, K.C., Phadikar, S., Roy, K. (2019). SISU - A Speaker Identification System from Short Utterances. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1035. Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-1_39

DOI: https://doi.org/10.1007/978-981-13-9181-1_39
Published: 20 July 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9180-4
Online ISBN: 978-981-13-9181-1
eBook Packages: Computer Science, Computer Science (R0)