Ahmed Hussen Abdelaziz
2020 – today
- 2024
- [c28] Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H. Tewfik:
Modality Drop-Out for Multimodal Device Directed Speech Detection Using Verbal and Non-Verbal Features. ICASSP 2024: 8240-8244
- [i14] Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe:
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models. CoRR abs/2401.17230 (2024)
- [i13] Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Barry-John Theobald:
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? CoRR abs/2402.00340 (2024)
- [i12] Satyam Kumar, Sai Srujana Buddi, Utkarsh Oggy Sarawgi, Vineet Garg, Shivesh Ranjan, Ognjen Rudovic, Ahmed Hussen Abdelaziz, Saurabh Adya:
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness. CoRR abs/2406.09443 (2024)
- [i11] Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed H. Tewfik:
Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection. CoRR abs/2406.09617 (2024)
- [i10] Li-Wei Chen, Takuya Higuchi, He Bai, Ahmed Hussen Abdelaziz, Alexander Rudnicky, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald, Zakaria Aldeneh:
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models. CoRR abs/2409.10788 (2024)
- [i9] Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Li-Wei Chen, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald:
Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels. CoRR abs/2409.10791 (2024)
- 2023
- [c27] Oggi Rudovic, Wonil Chang, Vineet Garg, Pranay Dighe, Pramod Simha, Jack Berkowitz, Ahmed Hussen Abdelaziz, Sachin Kajarekar, Erik Marchi, Saurabh Adya:
Less Is More: A Unified Architecture for Device-Directed Speech Detection with Multiple Invocation Types. ICASSP 2023: 1-5
- [i8] Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H. Tewfik:
Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features. CoRR abs/2310.15261 (2023)
- 2022
- [c26] Vineet Garg, Ognjen Rudovic, Pranay Dighe, Ahmed Hussen Abdelaziz, Erik Marchi, Saurabh Adya, Chandra Dhir, Ahmed H. Tewfik:
Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models. INTERSPEECH 2022: 1258-1262
- [i7] Vineet Garg, Ognjen Rudovic, Pranay Dighe, Ahmed Hussen Abdelaziz, Erik Marchi, Saurabh Adya, Chandra Dhir, Ahmed H. Tewfik:
Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models. CoRR abs/2203.15975 (2022)
- 2021
- [c25] Nataniel Ruiz, Barry-John Theobald, Anurag Ranjan, Ahmed Hussen Abdelaziz, Nicholas Apostoloff:
MorphGAN: One-Shot Face Synthesis GAN for Detecting Recognition Bias. BMVC 2021: 348
- [c24] Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz:
On The Role of Visual Cues in Audiovisual Speech Enhancement. ICASSP 2021: 8423-8427
- [c23] Ahmed Hussen Abdelaziz, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou, Sachin Kajarekar:
Audiovisual Speech Synthesis using Tacotron2. ICMI 2021: 503-511
- 2020
- [c22] Ahmed Hussen Abdelaziz, Barry-John Theobald, Paul Dixon, Reinhard Knothe, Nicholas Apostoloff, Sachin Kajarekar:
Modality Dropout for Improved Performance-driven Talking Faces. ICMI 2020: 378-386
- [i6] Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz:
Self-supervised Learning of Visual Speech Features with Audiovisual Speech Enhancement. CoRR abs/2004.12031 (2020)
- [i5] Ahmed Hussen Abdelaziz, Barry-John Theobald, Paul Dixon, Reinhard Knothe, Nicholas Apostoloff, Sachin Kajarekar:
Modality Dropout for Improved Performance-driven Talking Faces. CoRR abs/2005.13616 (2020)
- [i4] Ahmed Hussen Abdelaziz, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou, Sachin Kajarekar:
Audiovisual Speech Synthesis using Tacotron2. CoRR abs/2008.00620 (2020)
- [i3] Nataniel Ruiz, Barry-John Theobald, Anurag Ranjan, Ahmed Hussen Abdelaziz, Nicholas Apostoloff:
MorphGAN: One-Shot Face Synthesis GAN for Detecting Recognition Bias. CoRR abs/2012.05225 (2020)
2010 – 2019
- 2019
- [c21] Ahmed Hussen Abdelaziz, Barry-John Theobald, Justin Binder, Gabriele Fanelli, Paul Dixon, Nicholas Apostoloff, Thibaut Weise, Sachin Kajarekar:
Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models. ICMI 2019: 220-225
- [i2] Ahmed Hussen Abdelaziz, Barry-John Theobald, Justin Binder, Gabriele Fanelli, Paul Dixon, Nicholas Apostoloff, Thibaut Weise, Sachin Kajarekar:
Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models. CoRR abs/1905.06860 (2019)
- [i1] Ahmed Hussen Abdelaziz, Shuo-Yiin Chang, Nelson Morgan, Erik Edwards, Dorothea Kolossa, Dan Ellis, David A. Moses, Edward F. Chang:
On Neural Phone Recognition of Mixed-Source ECoG Signals. CoRR abs/1912.05869 (2019)
- 2018
- [j3] Ahmed Hussen Abdelaziz:
Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 26(3): 475-484 (2018)
- 2017
- [c20] Ahmed Hussen Abdelaziz:
Improving acoustic modeling using audio-visual speech. ICME 2017: 1081-1086
- [c19] Ahmed Hussen Abdelaziz:
Turbo Decoders for Audio-Visual Continuous Speech Recognition. INTERSPEECH 2017: 3667-3671
- [c18] Ahmed Hussen Abdelaziz:
NTCD-TIMIT: A New Database and Baseline for Noise-Robust Audio-Visual Speech Recognition. INTERSPEECH 2017: 3752-3756
- 2016
- [b1] Ahmed Serag Eldin Hussen Abdelaziz:
Noise-robust HMM-based pattern recognition using multimodal features and observation uncertainties. Ruhr University Bochum, 2016
- [j2] Ahmed Hussen Abdelaziz, Dorothea Kolossa:
General hybrid framework for uncertainty-decoding-based automatic speech recognition systems. Speech Commun. 79: 1-13 (2016)
- [c17] Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Dorothea Kolossa:
New Insights into Turbo-Decoding-Based AVSR with Dynamic Stream Weights. ITG Symposium on Speech Communication 2016: 1-5
- [c16] Mahdie Karbasi, Ahmed Hussen Abdelaziz, Dorothea Kolossa:
Twin-HMM-based non-intrusive speech intelligibility prediction. ICASSP 2016: 624-628
- [c15] Mahdie Karbasi, Ahmed Hussen Abdelaziz, Hendrik Meutzner, Dorothea Kolossa:
Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs. INTERSPEECH 2016: 625-629
- [c14] Steffen Zeiler, Hendrik Meutzner, Ahmed Hussen Abdelaziz, Dorothea Kolossa:
Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement. INTERSPEECH 2016: 1750-1754
- [c13] Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Robert M. Nickel, Dorothea Kolossa:
Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR. INTERSPEECH 2016: 2135-2139
- 2015
- [j1] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
Learning Dynamic Stream Weights For Coupled-HMM-Based Audio-Visual Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 23(5): 863-876 (2015)
- [c12] Ahmed Hussen Abdelaziz, Shinji Watanabe, John R. Hershey, Emmanuel Vincent, Dorothea Kolossa:
Uncertainty propagation through deep neural networks. INTERSPEECH 2015: 3561-3565
- [c11] Ramón Fernandez Astudillo, Shinji Watanabe, Ahmed Hussen Abdelaziz, Dorothea Kolossa:
Robust speech processing using observation uncertainty and uncertainty propagation: session and paper overview. INTERSPEECH 2015
- 2014
- [c10] Samer Al Moubayed, Jonas Beskow, Bajibabu Bollepalli, Joakim Gustafson, Ahmed Hussen Abdelaziz, Martin Johansson, Maria Koutsombogera, José David Águas Lopes, Jekaterina Novikova, Catharine Oertel, Gabriel Skantze, Kalin Stefanov, Gül Varol:
Human-robot collaborative tutoring using multiparty multimodal spoken dialogue. HRI 2014: 112-113
- [c9] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR. ICASSP 2014: 1527-1531
- [c8] Ahmed Hussen Abdelaziz, Dorothea Kolossa:
Dynamic stream weight estimation in coupled-HMM-based audio-visual speech recognition using multilayer perceptrons. INTERSPEECH 2014: 1144-1148
- [c7] Maria Koutsombogera, Samer Al Moubayed, Bajibabu Bollepalli, Ahmed Hussen Abdelaziz, Martin Johansson, José David Águas Lopes, Jekaterina Novikova, Catharine Oertel, Kalin Stefanov, Gül Varol:
The Tutorbot Corpus - A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue. LREC 2014: 4196-4201
- 2013
- [c6] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
Twin-HMM-based audio-visual speech enhancement. ICASSP 2013: 3726-3730
- [c5] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa, Volker Leutnant, Reinhold Haeb-Umbach:
GMM-based significance decoding. ICASSP 2013: 6827-6831
- [c4] Samer Al Moubayed, Jonas Beskow, Bajibabu Bollepalli, Ahmed Hussen Abdelaziz, Martin Johansson, Maria Koutsombogera, José David Águas Lopes, Jekaterina Novikova, Catharine Oertel, Gabriel Skantze, Kalin Stefanov, Gül Varol:
Tutoring Robots - Multiparty Multimodal Social Dialogue with an Embodied Tutor. eNTERFACE 2013: 80-113
- [c3] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition. INTERSPEECH 2013: 867-871
- 2012
- [c2] Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
Audio-Visual Speech Recognition for Uncertain Acoustical Observations. ITG Conference on Speech Communication 2012: 1-4
- [c1] Ahmed Hussen Abdelaziz, Dorothea Kolossa:
Decoding of Uncertain Features Using the Posterior Distribution of the Clean Data for Robust Speech Recognition. INTERSPEECH 2012: 2634-2637
last updated on 2024-10-24 20:31 CEST by the dblp team
all metadata released as open data under CC0 1.0 license