Sefik Emre Eskimez
2020 – today
- 2024
- [j6]Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka:
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3355-3364 (2024)
- [i25]Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng:
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like. CoRR abs/2402.07383 (2024)
- [i24]Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS. CoRR abs/2406.05699 (2024)
- [i23]Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda:
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS. CoRR abs/2406.18009 (2024)
- [i22]Vidya Srinivas, Malek Itani, Tuochao Chen, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota:
Knowledge boosting during low-latency inference. CoRR abs/2407.11055 (2024)
- [i21]Tuochao Chen, Qirui Wang, Bohan Wu, Malek Itani, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota:
Target conversation extraction: Source separation using turn-taking dynamics. CoRR abs/2407.11277 (2024)
- [i20]Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech. CoRR abs/2407.12229 (2024)
- 2023
- [j5]Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng:
Improving Readability for Automatic Speech Recognition Transcription. ACM Trans. Asian Low Resour. Lang. Inf. Process. 22(5): 142:1-142:23 (2023)
- [c25]Zhuo Chen, Naoyuki Kanda, Jian Wu, Yu Wu, Xiaofei Wang, Takuya Yoshioka, Jinyu Li, Sunit Sivasankaran, Sefik Emre Eskimez:
Speech Separation with Large-Scale Self-Supervised Learning. ICASSP 2023: 1-5
- [c24]Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka:
Breaking the Trade-Off in Personalized Speech Enhancement With Cross-Task Knowledge Distillation. ICASSP 2023: 1-5
- [c23]Zirun Zhu, Hemin Yang, Min Tang, Ziyi Yang, Sefik Emre Eskimez, Huaming Wang:
Real-Time Audio-Visual End-To-End Speech Enhancement. ICASSP 2023: 1-5
- [c22]Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Pärnamaa, Huaming Wang:
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation. INTERSPEECH 2023: 1050-1054
- [i19]Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka:
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer. CoRR abs/2308.06873 (2023)
- 2022
- [j4]Sefik Emre Eskimez, You Zhang, Zhiyao Duan:
Speech Driven Talking Face Generation From a Single Image and an Emotion Condition. IEEE Trans. Multim. 24: 3480-3490 (2022)
- [c21]Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang:
One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement. ICASSP 2022: 271-275
- [c20]Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang:
Personalized Speech Enhancement: New Models and Comprehensive Evaluation. ICASSP 2022: 356-360
- [c19]Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez:
All-Neural Beamformer for Continuous Speech Separation. ICASSP 2022: 6032-6036
- [c18]Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner:
ICASSP 2022 Deep Noise Suppression Challenge. ICASSP 2022: 9271-9275
- [c17]Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang:
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation. INTERSPEECH 2022: 991-995
- [c16]Xiaofei Wang, Dongmei Wang, Naoyuki Kanda, Sefik Emre Eskimez, Takuya Yoshioka:
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation. INTERSPEECH 2022: 3814-3818
- [c15]Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei:
Separating Long-Form Speech with Group-wise Permutation Invariant Training. INTERSPEECH 2022: 5383-5387
- [i18]Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner:
ICASSP 2022 Deep Noise Suppression Challenge. CoRR abs/2202.13288 (2022)
- [i17]Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang:
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation. CoRR abs/2204.00771 (2022)
- [i16]Xiaofei Wang, Dongmei Wang, Naoyuki Kanda, Sefik Emre Eskimez, Takuya Yoshioka:
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation. CoRR abs/2204.03232 (2022)
- [i15]Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Pärnamaa, Huaming Wang:
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation with E3Net. CoRR abs/2211.02773 (2022)
- [i14]Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka:
Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation. CoRR abs/2211.02944 (2022)
- [i13]Zhuo Chen, Naoyuki Kanda, Jian Wu, Yu Wu, Xiaofei Wang, Takuya Yoshioka, Jinyu Li, Sunit Sivasankaran, Sefik Emre Eskimez:
Speech separation with large-scale self-supervised learning. CoRR abs/2211.05172 (2022)
- 2021
- [c14]Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Emre Eskimez, Liyang Lu, Hong Qu, Michael Zeng:
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-Trained Language Model. ICASSP 2021: 7578-7582
- [c13]Sefik Emre Eskimez, Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr:
One-Shot Voice Conversion with Speaker-Agnostic StarGAN. Interspeech 2021: 1334-1338
- [c12]Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka:
Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement. Interspeech 2021: 2686-2690
- [i12]Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Emre Eskimez, Liyang Lu, Hong Qu, Michael Zeng:
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model. CoRR abs/2102.11114 (2021)
- [i11]Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez:
Dynamic Gradient Aggregation for Federated Domain Adaptation. CoRR abs/2106.07578 (2021)
- [i10]Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez:
All-neural beamformer for continuous speech separation. CoRR abs/2110.06428 (2021)
- [i9]Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang:
Personalized Speech Enhancement: New Models and Comprehensive Evaluation. CoRR abs/2110.09625 (2021)
- [i8]Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang:
One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement. CoRR abs/2110.10330 (2021)
- [i7]Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei:
Separating Long-Form Speech with Group-Wise Permutation Invariant Training. CoRR abs/2110.14142 (2021)
- [i6]Ken'ichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng:
Sequence-level self-learning with multiple hypotheses. CoRR abs/2112.05826 (2021)
- 2020
- [j3]Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Noise-Resilient Training Method for Face Landmark Generation From Speech. IEEE ACM Trans. Audio Speech Lang. Process. 28: 27-38 (2020)
- [c11]Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
End-To-End Generation of Talking Faces from Noisy Speech. ICASSP 2020: 1948-1952
- [c10]Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez:
A Federated Approach in Training Acoustic Models. INTERSPEECH 2020: 981-985
- [c9]Sefik Emre Eskimez, Dimitrios Dimitriadis, Robert Gmyr, Ken'ichi Kumatani:
GAN-Based Data Generation for Speech Emotion Recognition. INTERSPEECH 2020: 3446-3450
- [c8]Ken'ichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng:
Sequence-Level Self-Learning with Multiple Hypotheses. INTERSPEECH 2020: 3775-3779
- [i5]Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng:
Improving Readability for Automatic Speech Recognition Transcription. CoRR abs/2004.04438 (2020)
- [i4]Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez:
Federated Transfer Learning with Dynamic Gradient Aggregation. CoRR abs/2008.02452 (2020)
- [i3]Sefik Emre Eskimez, You Zhang, Zhiyao Duan:
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition. CoRR abs/2008.03592 (2020)
2010 – 2019
- 2019
- [j2]Sefik Emre Eskimez, Kazuhito Koishida, Zhiyao Duan:
Adversarial Training for Speech Super-Resolution. IEEE J. Sel. Top. Signal Process. 13(2): 347-358 (2019)
- [c7]Sefik Emre Eskimez, Kazuhito Koishida:
Speech Super Resolution Generative Adversarial Network. ICASSP 2019: 3717-3721
- 2018
- [j1]Sefik Emre Eskimez, Peter Soufleris, Zhiyao Duan, Wendi B. Heinzelman:
Front-end speech enhancement for commercial speaker verification systems. Speech Commun. 99: 101-113 (2018)
- [c6]Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Generating Talking Face Landmarks from Speech. LVA/ICA 2018: 372-381
- [c5]Sefik Emre Eskimez, Zhiyao Duan, Wendi B. Heinzelman:
Unsupervised Learning Approach to Feature Analysis for Automatic Speech Emotion Recognition. ICASSP 2018: 5099-5103
- [i2]Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Generating Talking Face Landmarks from Speech. CoRR abs/1803.09803 (2018)
- 2016
- [c4]Sefik Emre Eskimez, Kenneth Imade, Na Yang, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
Emotion classification: How does an automated system compare to Naive human coders? ICASSP 2016: 2274-2278
- [c3]Sefik Emre Eskimez, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
WISE: Web-based Interactive Speech Emotion Classification. SAAIP@IJCAI 2016: 2-7
- 2015
- [i1]Sefik Emre Eskimez, Kenneth Imade, Na Yang, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
Emotion Classification: How Does an Automated System Compare to Naive Human Coders? CoRR abs/1510.06769 (2015)
- 2013
- [c2]Selim Ozel, Sefik Emre Eskimez, Kemalettin Erbatur:
Humanoid robot orientation stabilization by shoulder joint motion during locomotion. ASCC 2013: 1-6
- 2012
- [c1]Tunc Akbas, Sefik Emre Eskimez, Selim Ozel, Omer Kemal Adak, Kaan Can Fidan, Kemalettin Erbatur:
Zero Moment Point based pace reference generation for quadruped robots via preview control. AMC 2012: 1-7
last updated on 2024-11-19 21:48 CET by the dblp team
all metadata released as open data under CC0 1.0 license