Leda Sari
2020 – today
- 2024
- [c21] Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model. ICASSP 2024: 12201-12205
- 2023
- [c20] Tejas Jayashankar, Jilong Wu, Leda Sari, David Kant, Vimal Manohar, Qing He:
Self-Supervised Representations for Singing Voice Conversion. ICASSP 2023: 1-5
- [c19] Florian L. Kreyssig, Yangyang Shi, Jinxi Guo, Leda Sari, Abdel-rahman Mohamed, Philip C. Woodland:
Biased Self-supervised Learning for ASR. INTERSPEECH 2023: 4948-4952
- [c18] Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. NeurIPS 2023
- [i12] Philipp Klumpp, Pooja Chitkara, Leda Sari, Prashant Serai, Jilong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He:
Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition. CoRR abs/2303.00802 (2023)
- [i11] Shuo Liu, Leda Sari, Chunyang Wu, Gil Keren, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli:
Towards Selection of Text-to-speech Data to Augment ASR Training. CoRR abs/2306.00998 (2023)
- [i10] Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. CoRR abs/2306.15687 (2023)
- [i9] Roshan Sharma, Suyoun Kim, Daniel Lazar, Trang Le, Akshat Shrivastava, Kwanghoon Ahn, Piyush Kansal, Leda Sari, Ozlem Kalinli, Michael L. Seltzer:
Augmenting text for spoken language understanding with Large Language Models. CoRR abs/2309.09390 (2023)
- [i8] Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model. CoRR abs/2309.13018 (2023)
- 2022
- [j3] Heting Gao, Xiaoxuan Wang, Sunghun Kang, Rusty Mina, Dias Issa, John B. Harvill, Leda Sari, Mark Hasegawa-Johnson, Chang D. Yoo:
Seamless equal accuracy ratio for inclusive CTC speech recognition. Speech Commun. 136: 76-83 (2022)
- [c17] Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990
- [c16] Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf:
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions. ICASSP 2022: 6162-6166
- [i7] Florian L. Kreyssig, Yangyang Shi, Jinxi Guo, Leda Sari, Abdelrahman Mohamed, Philip C. Woodland:
Biased Self-supervised learning for ASR. CoRR abs/2211.02536 (2022)
- 2021
- [b1] Leda Sari:
Learning speech embeddings for speaker adaptation and speech understanding. University of Illinois Urbana-Champaign, USA, 2021
- [j2] Leda Sari, Mark Hasegawa-Johnson, Samuel Thomas:
Auxiliary Networks for Joint Speaker Adaptation and Speaker Change Detection. IEEE ACM Trans. Audio Speech Lang. Process. 29: 324-333 (2021)
- [j1] Leda Sari, Mark Hasegawa-Johnson, Chang D. Yoo:
Counterfactually Fair Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3515-3525 (2021)
- [c15] Leda Sari, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf:
A Multi-View Approach to Audio-Visual Speaker Verification. ICASSP 2021: 6194-6198
- [c14] Kiran Ramnath, Leda Sari, Mark Hasegawa-Johnson, Chang D. Yoo:
Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering. NAACL-HLT 2021: 1908-1919
- [i6] Leda Sari, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf:
A Multi-View Approach To Audio-Visual Speaker Verification. CoRR abs/2102.06291 (2021)
- [i5] Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021)
- [i4] Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf:
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions. CoRR abs/2111.09983 (2021)
- 2020
- [c13] Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR. ICASSP 2020: 7384-7388
- [c12] Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson:
Training Spoken Language Understanding Systems with Non-Parallel Speech and Text. ICASSP 2020: 8109-8113
- [c11] Leda Sari, Mark Hasegawa-Johnson:
Deep F-Measure Maximization for End-to-End Speech Understanding. INTERSPEECH 2020: 1580-1584
- [c10] Junzhe Zhu, Mark Hasegawa-Johnson, Leda Sari:
Identify Speakers in Cocktail Parties with End-to-End Attention. INTERSPEECH 2020: 3092-3096
- [i3] Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR. CoRR abs/2002.06165 (2020)
- [i2] Junzhe Zhu, Mark Hasegawa-Johnson, Leda Sari:
Identify Speakers in Cocktail Parties with End-to-End Attention. CoRR abs/2005.11408 (2020)
- [i1] Leda Sari, Mark Hasegawa-Johnson:
Deep F-measure Maximization for End-to-End Speech Understanding. CoRR abs/2008.03425 (2020)
2010 – 2019
- 2019
- [c9] Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson, Michael Picheny:
Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News. ICASSP 2019: 6286-6290
- [c8] Leda Sari, Samuel Thomas, Mark A. Hasegawa-Johnson:
Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks. INTERSPEECH 2019: 769-773
- 2018
- [c7] Leda Sari, Mark Hasegawa-Johnson, Kumaran S, Georg Stemmer, Krishnakumar N. Nair:
Speaker Adaptive Audio-Visual Fusion for the Open-Vocabulary Section of AVICAR. INTERSPEECH 2018: 3524-3528
- 2016
- [c6] Leda Sari, Murat Saraclar:
Score normalization for keyword search. SIU 2016: 761-764
- [c5] Batuhan Gündogdu, Leda Sari, Gozde Cetinkaya, Murat Saraclar:
Template-based Keyword Search with pseudo posteriorgrams. SIU 2016: 973-976
- 2015
- [c4] Leda Sari, Batuhan Gündogdu, Murat Saraçlar:
Fusion of LVCSR and posteriorgram based keyword search. INTERSPEECH 2015: 824-828
- [c3] Leda Sari, Murat Saraclar:
Discriminative training of the keyword search confusion model. SIU 2015: 1175-1178
- [c2] Leda Sari, Batuhan Gündogdu, Murat Saraclar:
Posteriorgram based approaches in keyword search. SIU 2015: 1183-1186
- 2014
- [c1] Leda Sari, Aysin Ertüzün:
Texture Defect Detection Using Independent Vector Analysis in Wavelet Domain. ICPR 2014: 1639-1644
last updated on 2024-11-15 20:35 CET by the dblp team
all metadata released as open data under CC0 1.0 license