25th SPECOM 2023: Dharwad, India - Part II
- Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna:
Speech and Computer - 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part II. Lecture Notes in Computer Science 14339, Springer 2023, ISBN 978-3-031-48311-0
Industrial Speech and Language Technology
- Gauri Deshpande, Björn W. Schuller, Pallavi Deshpande, Anuradha Rajiv Joshi, S. K. Oza, Sachin Patel:
Analysing Breathing Patterns in Reading and Spontaneous Speech. 3-17
- Gnana Praveen Rajasekhar, Jahangir Alam:
Audio-Visual Speaker Verification via Joint Cross-Attention. 18-31
- Sunil Kumar Kopparapu:
A Novel Scheme to Classify Read and Spontaneous Speech. 32-45
- Pradeep Rangappa, Aditya Kiran Brahma, Venkatesh Vayyavuru, Rishi Yadav, Hemant Misra, Kasturi Karuna:
Analysis of a Hinglish ASR System's Performance for Fraud Detection. 46-58
- Veronica Khaustova, Evgeny Pyshkin, Victor Khaustov, John Blake, Natalia Bogach:
CAPTuring Accents: An Approach to Personalize Pronunciation Training for Learners with Different L1 Backgrounds. 59-70
Speech Technology for Under-Resourced Languages
- Vishwa Gupta, Gilles Boulianne:
Improvements in Language Modeling, Voice Activity Detection, and Lexicon in OpenASR21 Low Resource Languages. 73-86
- Irina S. Kipyatkova, Ildar Kagirov:
Phone Durations Modeling for Livvi-Karelian ASR. 87-99
- Sougata Mukherjee, Jagabandhu Mishra, S. R. Mahadeva Prasanna:
Significance of Indic Self-supervised Speech Representations for Indic Under-Resourced ASR. 100-113
- Achintya Kumar Sarkar, Tulika Basu, Rajib Roy, Joyanta Basu, Michael Tongbram, Yambem Jina Chanu, Priyanka Dwivedi:
Study of Various End-to-End Keyword Spotting Systems on the Bengali Language Under Low-Resource Condition. 114-126
- Ashwini Dasare, Amartya Chowdhury, Aditya Srinivas Menon, Konjengbam Anand, K. T. Deepak, S. R. M. Prasanna:
Bridging the Gap: Towards Linguistic Resource Development for the Low-Resource Lambani Language. 127-139
- Ankita, Shambhavi, Syed Shahnawazuddin:
Studying the Effect of Frame-Level Concatenation of GFCC and TS-MFCC Features on Zero-Shot Children's ASR. 140-150
- Raviraj Joshi, Nikesh Garera:
Code-Mixed Text-to-Speech Synthesis Under Low-Resource Constraints. 151-163
- Abhayjeet Singh, Anjali Jayakumar, Deekshitha G, Hitesh Kumar, Jesuraja Bandekar, Sandhya Badiger, Sathvik Udupa, Saurabh Kumar, Prasanta Kumar Ghosh:
An End-to-End TTS Model in Chhattisgarhi, a Low-Resource Indian Language. 164-172
- Abhayjeet Singh, Arjun Singh Mehta, Ashish Khuraishi K. S, Deekshitha G, Gauri Date, Jai Nanavati, Jesuraja Bandekar, Karnalius Basumatary, Karthika P, Sandhya Badiger, Sathvik Udupa, Saurabh Kumar, Prasanta Kumar Ghosh, Prashanthi V, Priyanka Pai, Raoul Nanavati, Sai Praneeth Reddy Mora, Srinivasa Raghavan K. M.:
An ASR Corpus in Chhattisgarhi, a Low Resource Indian Language. 173-181
- Ashwini Dasare, B. Lohith Reddy, A. Sai Chandra Koushik, B. Sai Raj, V. Krishna Sai Rohith, Satisha Basavaraju, K. T. Deepak:
Cross Lingual Style Transfer Using Multiscale Loss Function for Soliga: A Low Resource Tribal Language. 182-194
- Leena Dihingia, Prashant Bannulmath, Amartya Chowdhury, S. R. M. Prasanna, K. T. Deepak, Tehreem Sheikh:
Preliminary Analysis of Lambani Vowels and Vowel Classification Using Acoustic Features. 195-207
- Navneet Kaur, Prasanta Kumar Ghosh:
Curriculum Learning Based Approach for Faster Convergence of TTS Model. 208-221
- Krisangi Saikia, Shakuntala Mahanta:
Rhythm Measures and Language Endangerment: The Case of Deori. 222-230
- Swapnil Fadte, Edna Vaz Fernandes, Hanumant Redkar, Jyoti D. Pawar:
Konkani Phonetic Transcription System 1.0. 231-240
Speech Analysis and Synthesis
- Ishika Gupta, Hema A. Murthy:
E-TTS: Expressive Text-to-Speech Synthesis for Hindi Using Data Augmentation. 243-257
- Lalaram Arya, Amartya Chowdhury, S. R. Mahadeva Prasanna:
Direct Vs Cascaded Speech-to-Speech Translation Using Transformer. 258-270
- Rahul Jaiswal, Anu Priya:
Deep Learning Based Speech Quality Assessment Focusing on Noise Effects. 271-282
- Kirtana Sunil Phatnani, Hemant A. Patil:
Quantifying the Emotional Landscape of Music with Three Dimensions. 283-294
- S. Uthiraa, Hemant A. Patil:
Analysis of Mandarin vs English Language for Emotional Voice Conversion. 295-306
- Md Shahidul Alam, Abderrahim Fathan, Jahangir Alam:
Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units. 307-321
- Jhansi Mallela, Prasanth Sai Boyina, Chiranjeevi Yarra:
A Comparison of Learned Representations with Jointly Optimized VAE and DNN for Syllable Stress Detection. 322-334
- Priyanka Gupta, Rajul Acharya, Ankur T. Patil, Hemant A. Patil:
On the Asymptotic Behaviour of the Speech Signal. 335-343
- Salam Nandakishor, Debadatta Pati:
Improvement of Audio-Visual Keyword Spotting System Accuracy Using Excitation Source Feature. 344-356
- Liudmila Bukreeva, Daria Guseva, Mikhail Dolgushin, Vera Evdokimova, Vasilisa Obotnina:
Developing a Question Answering System on the Material of Holocaust Survivors' Testimonies in Russian. 357-366
- Seema Lokhandwala, Rohit Sinha, Sreeram Ganji, Balakrishna Pailla:
Decoding Asian Elephant Vocalisations: Unravelling Call Types, Context-Specific Behaviors, and Individual Identities. 367-379
- Shahid Aziz, Syed Shahnawazuddin:
Enhancing Children's Short Utterance Based ASV Using Data Augmentation Techniques and Feature Concatenation Approach. 380-394
- Shahid Aziz, Shivesh Pushp, Syed Shahnawazuddin:
Studying the Effectiveness of Data Augmentation and Frequency-Domain Linear Prediction Coefficients in Children's Speaker Verification Under Low-Resource Conditions. 395-406
- Aditya Pusuluri, Aastha Kachhi, Hemant A. Patil:
Constant-Q Based Harmonic and Pitch Features for Normal vs. Pathological Infant Cry Classification. 407-420
- Monil Charola, Siddharth Rathod, Hemant A. Patil:
Robustness of Whisper Features for Infant Cry Classification. 421-433
Speaker and Language Identification, Verification, and Diarization
- Jagabandhu Mishra, Mrinmoy Bhattacharjee, S. R. Mahadeva Prasanna:
I-MSV 2022: Indic-Multilingual and Multi-sensor Speaker Verification Challenge. 437-445
- Abderrahim Fathan, Jahangir Alam, Xiaolin Zhu:
Multi-task Learning over Mixup Variants for the Speaker Verification Task. 446-460
- Sean Monteiro, Ananya Angra, Muralikrishna H, Veena Thenkanidiyoor, Aroor Dinesh Dileep:
Exploring the Impact of Different Approaches for Spoken Dialect Identification of Konkani Language. 461-474
- Urvashi Goswami, H. Muralikrishna, Aroor Dinesh Dileep, Veena Thenkanidiyoor:
Adversarially Trained Hierarchical Attention Network for Domain-Invariant Spoken Language Identification. 475-489
- Raj Prakash Gohil, Ramya Viswanathan, Saurabh Agrawal, C. M. Vikram, Madhu R. Kamble, Kamini Sabu, M. Ali Basha Shaik, Krishna K. S. Rajesh:
Ensemble of Incremental System Enhancements for Robust Speaker Diarization in Code-Switched Real-Life Audios. 490-502
- Shivang Gupta, Kowshik Siva Sai Motepalli, Ravi Kumar, Vamshi Raghu Simha Narasinga, Mirishkar Sai Ganesh, Anil Kumar Vuppala:
Enhancing Language Identification in Indian Context Through Exploiting Learned Features with Wav2Vec2.0. 503-512
- Pavanitha Manche, Sahaja Nandyala, Jagabandhu Mishra, Gayathri Ananthanarayanan, S. R. Mahadeva Prasanna:
Design and Development of Voice OTP Authentication System. 513-528
- Kishan Pipariya, Debolina Pramanik, Puja Bharati, Sabyasachi Chandra, Shyamal Kumar Das Mandal:
End-to-End Native Language Identification Using a Modified Vision Transformer (ViT) from L2 English Speech. 529-538
- Moakala Tzudir, Rishith Sadashiv T. N., Ayush Agarwal, S. R. Mahadeva Prasanna:
Dialect Identification in Ao Using Modulation-Based Representation. 539-549
- Abderrahim Fathan, Jahangir Alam:
Self-supervised Speaker Verification Employing Augmentation Mix and Self-augmented Training-Based Clustering. 550-563