Abstract
Urdu is one of the more widely spoken languages in the world, with approximately 70 million people using it in day-to-day conversation. In general, Muslims prefer to share their opinions or feedback as speech in the Urdu language. From the literature, it is evident that opinion extraction from naturalistic audio has emerged as a new field of research: automatic speech recognition is carried out with keyword-spotting approaches on the audio, and an opinion score is then computed. In this paper, the authors propose a novel framework for the extraction of sentiment from Urdu audio data. First, speech utterances are pre-processed, and then short-term features such as Mel-frequency cepstral coefficients (MFCC), spectral energy, chroma vector features, perceptual linear prediction (PLP) cepstral coefficients, and relative-spectral PLP (RASTA-PLP) features are extracted. Five mid-term features, including the mean and median, are then derived from these short-term features. In the opinion-extraction phase, the mid-term features of Urdu test utterances are compared with the mid-term features of a dictionary of words to label the opinion as positive, negative, or neutral. The originality of the work lies in analyzing the perceptual features to identify those that carry significant information for extracting sentiment from Urdu utterances. A weighted mean vector fusion technique is used to fuse the outputs of a hidden Markov model (HMM) and dynamic time warping (DTW). In the experiments, 97.1% accuracy is achieved in the sentiment analysis task on a custom Urdu corpus of 600 utterances, outperforming other state-of-the-art classifiers.
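The pipeline described above can be illustrated with a minimal sketch. The code below derives per-dimension mid-term statistics from a sequence of short-term feature vectors and then fuses two per-class score vectors (standing in for the HMM and DTW outputs) by a weighted mean. The particular five statistics beyond mean and median, the fusion weights, and the function names are illustrative assumptions, not the paper's exact choices:

```python
import statistics

def midterm_features(frames):
    """Derive five mid-term statistics (mean, median, std, min, max)
    per feature dimension from short-term feature vectors.
    Which five statistics the paper uses beyond mean and median
    is an assumption here."""
    stats = []
    for dim in zip(*frames):  # one tuple per feature dimension
        stats.extend([
            statistics.mean(dim),
            statistics.median(dim),
            statistics.pstdev(dim),
            min(dim),
            max(dim),
        ])
    return stats

def weighted_mean_fusion(hmm_scores, dtw_scores, w_hmm=0.6, w_dtw=0.4):
    """Fuse per-class scores from two classifiers by a weighted mean;
    the weights 0.6/0.4 are illustrative, not taken from the paper."""
    return [w_hmm * h + w_dtw * d for h, d in zip(hmm_scores, dtw_scores)]

# Three 2-dimensional short-term vectors -> 2 dims x 5 stats = 10 values.
feats = midterm_features([[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]])
print(len(feats))  # 10

# Fuse hypothetical per-class scores for (positive, negative, neutral).
labels = ["positive", "negative", "neutral"]
fused = weighted_mean_fusion([0.7, 0.2, 0.1], [0.5, 0.3, 0.2])
print(labels[max(range(len(fused)), key=fused.__getitem__)])  # positive
```

In practice the short-term vectors would come from an acoustic front end (MFCC, PLP, chroma, etc.) computed over overlapping windows, and the fused scores would come from the trained HMM and DTW matchers rather than hard-coded values.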
Cite this article
Shaik, R., Venkatramaphanikumar, S. Sentiment analysis with word-based Urdu speech recognition. J Ambient Intell Human Comput 13, 2511–2531 (2022). https://doi.org/10.1007/s12652-021-03460-x