International Journal of Speech Technology, Volume 26
Volume 26, Number 1, March 2023
Special Section on 'Advanced Neural Approaches and NLP in Speech Technology'
- Iwin Thanakumar Joseph Swamidason, Sravanthi Tatiparthi, V. M. Arul Xavier, C. S. C. Devadass:
Exploration of diverse intelligent approaches in speech recognition systems. 1-10
- Yuanyuan Wang, Xinliang Huang, Bingqing Li, Xiaoqing Liu, Yingying Ma, Xinjing Huang:
Spreading mechanism of Weibo public opinion phonetic representation based on the epidemic model. 11-21
- Satti R. G. Reddy, G. P. Saradhi Varma, Rajya Lakshmi Davuluri:
Optimized convolutional neural network model for plant species identification from leaf images using computer vision. 23-50
- Suman Turpati, Venkatanarayana Moram:
Implementation of robust virtual sensing algorithm in active noise control to improve silence zone. 51-62
- Jiaye Li, Jing Wang:
Digital animation multimedia information synthesis based on mixed reality framework with specialized analysis on speech data. 63-76
- Sippee Bharadwaj, Purnendu Bikash Acharjee:
Exploring human voice prosodic features and the interaction between the excitation signal and vocal tract for Assamese speech. 77-93
Special Section on 'Arabic NLP and Speech Recognition Using Deep and Machine Learning'
- Sandra Rizkallah, Amir F. Atiya, Samir I. Shaheen, Hossam ElDin Mahgoub:
ArSphere: Arabic word vectors embedded in a polar sphere. 95-111
- Ammar Farid Ghori, Aisha Waheed, Maria Waqas, Aqsa Mehmood, Syed Abbas Ali:
Acoustic modelling using deep learning for Quran recitation assistance. 113-121
- Ftoon Abu Shaqra, Rehab Duwairi, Mahmoud Al-Ayyoub:
A multi-modal deep learning system for Arabic emotion recognition. 123-139
- Mourad Loukam:
Comparison of Naïve Bayes with graph based methods for keyphrase extraction in modern standard Arabic language. 141-150
- Hasna Awwad, Majdi Sawalha, Areej Allawzi, Sane Yagi:
Building translator-oriented English-Arabic physics glossary from domain corpus. 151-162
Regular Articles
- Shubam Sachdeva, Haoyao Ruan, Ghassan Hamarneh, Dawn M. Behne, Allard Jongman, Joan A. Sereno, Yue Wang:
Plain-to-clear speech video conversion for enhanced intelligibility. 163-184
- Mahdi Barhoush, Ahmed Hallawa, Anke Schmeink:
Speaker identification and localization using shuffled MFCC features and deep learning. 185-196
- Chaitanya Jannu, Sunny Dayal Vanambathina:
Weibull and Nakagami speech priors based regularized NMF with adaptive wiener filter for speech enhancement. 197-209
- Bidhan Barai, Nibaran Das, Subhadip Basu, Mita Nasipuri:
An empirical study on analysis window functions for text-independent speaker recognition. 211-220
- Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, Uma Shanker Tiwary:
A framework for quality assessment of synthesised speech using learning-based objective evaluation. 221-243
- Elif Özen Acarbay, Nalan Özkurt:
Performance analysis of the speech enhancement application with wavelet transform domain adaptive filters. 245-258
- (Withdrawn) RETRACTED ARTICLE: Research on the application of speech recognition in computer network technology in the era of big data. 259
Volume 26, Number 2, July 2023
Special Section on 'Analysis of Speech technology and its applications'
- Orken Zh. Mamyrbayev, Dina O. Oralbekova, Keylan Alimhan, Bulbul M. Nuranbayeva:
Hybrid end-to-end model for Kazakh speech recognition. 261-270
- Mohsen Shahrokhi, Behnaz Khodadadi:
Perception of impoliteness in disagreement speech acts among Iranian upper-intermediate EFL students: a gender perspective. 271-285
- Bhuvaneshwari Jolad, Rajashri Khanai:
An approach for speech enhancement with dysarthric speech recognition using optimization based machine learning frameworks. 287-305
- Moses Effiong Ekpenyong, Odudu-Obong Uwem Udocox:
Mining speech signal patterns for robust speaker variability classification. 307-336
- Christos P. Loizou, Marios Pantzaris:
An automated speech analysis system for the detection of cognitive decline in elderly. 337-353
- Manju Ramrao Bhosle, Nagesh Kallollu Narayaswamy:
SHO based Deep Residual network and hierarchical speech features for speech enhancement. 355-370
Regular Articles
- Haihua Jiang, Bin Hu, Zhenyu Liu, Gang Wang, Lan Zhang:
A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech. 371-378
- Mohsen A. M. El-Bendary, Sabry S. Nassar:
Different attacks presence considerations: analyzing the simple and efficient self-marked algorithm performance for highly-sensitive audio signals contents verification. 379-394
- Björn Herrmann:
The perception of artificial-intelligence (AI) based synthesized speech in younger and older adults. 395-415
- Carlo Schirru, Shahla Simin, Paolo Mengoni, Alfredo Milani:
Linguistic analysis for emotion recognition: a case of Chinese speakers. 417-432
- K. Aditya Shastry:
Ensemble machine learning regression model based predictive framework for Parkinson's UPDRS motor score prediction from speech data. 433-457
- Saurabh Garg, Haoyao Ruan, Ghassan Hamarneh, Dawn M. Behne, Allard Jongman, Joan A. Sereno, Yue Wang:
Mouth2Audio: intelligible audio synthesis from videos with distinctive vowel articulation. 459-474
- Mohit Dua, Akanksha, Shelza Dua:
Noise robust automatic speech recognition: review and analysis. 475-519
- Shakeel A. Sheikh, Md. Sahidullah, Fabrice Hirsch, Slim Ouni:
Stuttering detection using speaker representations and self-supervised contextual embeddings. 521-530
- Lina Tang:
A transformer-based network for speech recognition. 531-539
- Mohammed Tellai, Lijian Gao, Qirong Mao:
An efficient speech emotion recognition based on a dual-stream CNN-transformer fusion network. 541-557
Volume 26, Number 3, September 2023
- Prabira Kumar Sethy, Millee Panigrahi, K. Vijayakumar, Santi Kumari Behera:
Machine learning based classification of EEG signal for detection of child epileptic seizure without snipping. 559-570
- S. Ramesh, S. Gomathi, S. Sasikala, Thapasimuthu Rajeswari Saravanan:
Automatic speech emotion detection using hybrid of gray wolf optimizer and naïve Bayes. 571-578
- J. V. Thomas Abraham, A. Nayeemulla Khan, A. Shahina:
A deep learning approach for robust speaker identification using chroma energy normalized statistics and mel frequency cepstral coefficients. 579-587
- K. Manjunath, G. N. Kodanda Ramaiah, M. N. Giri Prasad:
Optimal secure XOR encryption with dynamic key for effective audio steganography. 589-598
- Sajjadali Raza, Heriberto Cuayáhuitl:
A comparison of neural-based visual recognisers for speech activity detection. 599-608
- Adil Chakhtouna, Sara Sekkate, Abdellah Adib:
Speaker and gender dependencies in within/cross linguistic Speech Emotion Recognition. 609-625
- Heiner Ludwig, Thorsten Schmidt, Mathias Kühn:
Voice user interfaces in manufacturing logistics: a literature review. 627-639
- Marek B. Trawicki:
Automatic age recognition, call-type classification, and speaker identification of Zebra Finches (Taeniopygia guttata) using hidden Markov models (HMMs). 641-650
- Kodali Radha, Mohan Bansal:
Towards modeling raw speech in gender identification of children using sincNet over ERB scale. 651-663
- Eiman Alsharhan, Allan Ramsay:
Robust automatic accent identification based on the acoustic evidence. 665-680
- Ricky Mohanty, Hemanta Kumar Bhuyan, Subhendu Kumar Pani, Vinayakumar Ravi, Moez Krichen:
Bird species recognition using spiking neural network along with distance based fuzzy co-clustering. 681-694
- Rakesh Reddy Yakkati, Sreenivasa Reddy Yeduri, Rajesh Kumar Tripathy, Linga Reddy Cenkeramaddi:
Time frequency domain deep CNN for automatic background classification in speech signals. 695-706
- Jharna Agrawal, Manish Gupta, Hitendra Garg:
Monaural speech separation using WT-Conv-TasNet for hearing aids. 707-720
- A. Suresh Rao, A. Pramod Reddy, Pragathi Vulpala, K. Shwetha Rani, P. Hemalatha:
Deep learning structure for emotion prediction using MFCC from native languages. 721-733
- V. Srinivasarao:
Speech signal analysis and enhancement using combined wavelet Fourier transform with stacked deep learning architecture. 735-742
- James A. Rodger, Justin Piper:
Assessing American presidential candidates using principles of ontological engineering, word sense disambiguation, data envelope analysis and qualitative comparative analysis. 743-764
- Shivam Dwivedi, Sanjukta Ghosh, Satyam Dwivedi:
Binary classifier for identification of stammering instances in Hindi speech data. 765-774
- Mohamed Daouad, Fadoua Ataa-Allah, El Wardani Dadi:
An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture. 775-787
- Khamis A. Al-Karawi, Duraid Y. Mohammed:
Using combined features to improve speaker verification in the face of limited reverberant data. 789-799
- P. Sudhakar, K. Sreenivasa Rao, Pabitra Mitra:
Unsupervised spoken term discovery using pseudo lexical induction. 801-816
Volume 26, Number 4, December 2023
- Yingjie Zhang, Liu Liu:
Multi-task learning for X-vector based speaker recognition. 817-823
- Irshed Hussain, Pinki Roy:
A hybrid adaptive neuro-fuzzy approach for automatic spoken digit recognition. 825-832
- Ahmad A. M. Abushariah, Mohammad A. M. Abushariah, Teddy Surya Gunawan, J. Chebil, Assal A. M. Alqudah, Hua-Nong Ting, Mumtaz Begum Mustafa:
Fusion of speech and handwritten signatures biometrics for person identification. 833-850
- Yibo Duan, Yanhua Long, Yijie Li:
CI-Mix: cut instance mix for robust speaker verification. 851-857
- R. Benazir Begam, M. Palanivelan:
A speech based diagnostic method for Alzheimer disease using machine learning. 859-867
- Aluru V. N. M. Hemateja, Gopikrishnan Kondakath, Susruta Das, Mohanaprasad Kothandaraman, S. Shoba, Abhishek Pandey, Rajin Babu, Abhinav Jain:
Novel data augmentation for named entity recognition. 869-878
- Aluru V. N. M. Hemateja, Gopikrishnan Kondakath, Susruta Das, Mohanaprasad Kothandaraman, S. Shoba, Abhishek Pandey, Rajin Babu, Abhinav Jain:
Correction to: Novel data augmentation for named entity recognition. 879
- Zhor Benhafid, Sid-Ahmed Selouani, Abderrahmane Amrouche, Mohammed Sidi Yakoub:
Attention-based factorized TDNN for a noise-robust and spoof-aware speaker verification system. 881-894
- Li Li, Yanhua Long, Dongxing Xu, Yijie Li:
Boosting Character-based Mandarin ASR via Chinese Pinyin Representation. 895-902
- Ghayas Ahmed, Aadil Ahmad Lawaye:
End-to-end ASR framework for Indian-English accent: using speech CNN-based segmentation. 903-918
- Om Prakash Swain, H. Hemanth, Puneet Saran, Mohanaprasad Kothandaraman, Logesh Ravi, Hardik Sailor, K. S. Rajesh:
Robust and efficient keyword spotting using a bidirectional attention LSTM. 919-931
- Salam Nandakishor, Debadatta Pati:
Usefulness of glottal excitation source information for audio-visual speech recognition system. 933-945
- Nishant Barsainyan, Dileep Kumar Singh:
Optimized cross-corpus speech emotion recognition framework based on normalized 1D convolutional neural network with data augmentation and feature selection. 947-961
- Navneet Upadhyay:
Psychoacoustic model-driven spectral subtraction for monaural speech enhancement. 963-979
- Sudhansu Sekhar Nayak, Anand D. Darji, Prashant K. Shah:
Identification of Parkinson's disease from speech signal using machine learning approach. 981-990
- Chiron Bang, Nicholas Bogdanovic, Gali Deutsch, Oge Marques:
Machine learning for the diagnosis of Parkinson's disease using speech analysis: a systematic review. 991-998
- Kai Wang, Jingjing Liu, Yizhou Peng, Hao Huang:
Neural RAPT: deep learning-based pitch tracking with prior algorithmic knowledge instillation. 999-1015
- Rajeev Rajan, T. V. Hridya Raj:
SENet-based speech emotion recognition using synthesis-style transfer data augmentation. 1017-1030
- Gebremichael Kibret Sheferaw, Waweru Mwangi, Michael W. Kimwele, Adane Letta Mamuye:
Waveform based speech coding using nonlinear predictive techniques: a systematic review. 1031-1059
- K. S. Nataraj, Prem C. Pandey, Hirak Dasgupta:
Estimation of place of articulation of fricatives from spectral features. 1061-1078
- Khamis A. Al-Karawi:
Real-time adaptive training for forensic speaker verification in reverberation conditions. 1079-1089
- Noor D. AL-Shakarchy, Huda Rageb, Mais Saad Safoq:
Gender and age-evolution detection based on audio forensic analysis using light deep neural network. 1091-1098
- Mohammed Tellai, Qirong Mao:
CCTG-NET: Contextualized Convolutional Transformer-GRU Network for speech emotion recognition. 1099-1116
- A. Karthik, J. L. Mazher Iqbal:
An optimized convolutional neural network for speech enhancement. 1117-1129
- Suhad Al-Issa, Mahmoud Al-Ayyoub, Osama Daifallah Al-Khaleel, Nouh Sabri Elmitwally:
Building a neural speech recognizer for quranic recitations. 1131-1151
- Arvind Kumar, Sandeep Singh Solanki, Mahesh Chandra:
Effect of background Indian music on performance of speech recognition models for Hindi databases. 1153-1164
- Souha Ayadi, Zied Lachiri:
Deep neural network architectures for audio emotion recognition performed on song and speech modalities. 1165-1181