Dan Su 0002
Person information
- affiliation: Tencent AI Lab, Shenzhen, China
Other persons with the same name
- Dan Su — disambiguation page
- Dan Su 0001 — City University of Hong Kong, Department of Mechanical Engineering
- Dan Su 0003 — Hong Kong University of Science and Technology
2020 – today
- 2024
- [c87]Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. ACL (1) 2024: 1764-1775
- [c86]Duzhen Zhang, Yahan Yu, Jiahua Dong, Chenxing Li, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. ACL (Findings) 2024: 12401-12430
- [c85]Yu Gu, Qiushi Zhu, Guangzhi Lei, Chao Weng, Dan Su:
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis. ICASSP 2024: 11266-11270
- [c84]Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu:
Prompt-guided Precise Audio Editing with Diffusion Models. ICML 2024
- [i62]Duzhen Zhang, Yahan Yu, Chenxing Li, Jiahua Dong, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. CoRR abs/2401.13601 (2024)
- [i61]Chong Peng, Liqiang He, Dan Su:
Fuse after Align: Improving Face-Voice Association Learning via Multimodal Encoder. CoRR abs/2404.09509 (2024)
- [i60]Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. CoRR abs/2406.00976 (2024)
- [i59]Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu:
Prompt-guided Precise Audio Editing with Diffusion Models. CoRR abs/2406.04350 (2024)
- 2023
- [c83]Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su:
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis. AAAI 2023: 13025-13033
- [c82]Lixin Cao, Jun Wang, Ben Yang, Dan Su, Dong Yu:
Trinet: Stabilizing Self-Supervised Learning From Complete or Slow Collapse. ICASSP 2023: 1-5
- [c81]Wei Xiao, Wenzhe Liu, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
Multi-mode Neural Speech Coding Based on Deep Generative Networks. INTERSPEECH 2023: 819-823
- [c80]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. INTERSPEECH 2023: 1334-1338
- [c79]Yuping Yuan, Zhao You, Shulin Feng, Dan Su, Yanchun Liang, Xiaohu Shi, Dong Yu:
Compressed MoE ASR Model Based on Knowledge Distillation and Quantization. INTERSPEECH 2023: 3337-3341
- [i58]Lixin Cao, Jun Wang, Ben Yang, Dan Su, Dong Yu:
TriNet: stabilizing self-supervised learning from complete or slow collapse. CoRR abs/2301.00656 (2023)
- [i57]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen M. Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. CoRR abs/2309.02459 (2023)
- [i56]Yu Gu, Yianrao Bian, Guangzhi Lei, Chao Weng, Dan Su:
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis. CoRR abs/2309.12792 (2023)
- [i55]Wenzhe Liu, Wei Xiao, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
A High Fidelity and Low Complexity Neural Audio Coding. CoRR abs/2310.10992 (2023)
- 2022
- [j4]Yi Lei, Shan Yang, Xinfa Zhu, Lei Xie, Dan Su:
Cross-Speaker Emotion Transfer Through Information Perturbation in Emotional Speech Synthesis. IEEE Signal Process. Lett. 29: 1948-1952 (2022)
- [c78]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards Reference-Free Cross-Speaker Style Transfer with Low-Quality Data for Expressive Speech Synthesis. ICASSP 2022: 6307-6311
- [c77]Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li:
Simple Attention Module Based Speaker Verification with Iterative Noisy Label Detection. ICASSP 2022: 6722-6726
- [c76]Zhao You, Shulin Feng, Dan Su, Dong Yu:
Speechmoe2: Mixture-of-Experts Model with Improved Routing. ICASSP 2022: 7217-7221
- [c75]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. ICASSP 2022: 7252-7256
- [c74]Naijun Zheng, Na Li, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng:
Multi-Channel Speaker Diarization Using Spatial Features for Meetings. ICASSP 2022: 7337-7341
- [c73]Dongpeng Ma, Yiwen Wang, Liqiang He, Mingjie Jin, Dan Su, Dong Yu:
DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming Dfsmn-San For Automatic Speech Recognition. ICASSP 2022: 7692-7696
- [c72]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI. ICASSP 2022: 7782-7786
- [c71]Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-Based Multi-Modal Context Modeling. ICASSP 2022: 7917-7921
- [c70]Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng:
The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 9161-9165
- [c69]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis. ICLR 2022
- [c68]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. IJCAI 2022: 4157-4163
- [c67]Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li:
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings. INTERSPEECH 2022: 1436-1440
- [c66]Liumeng Xue, Shan Yang, Na Hu, Dan Su, Lei Xie:
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers. INTERSPEECH 2022: 2548-2552
- [c65]Yi Lei, Shan Yang, Jian Cong, Lei Xie, Dan Su:
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion. INTERSPEECH 2022: 2563-2567
- [c64]Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. INTERSPEECH 2022: 2573-2577
- [c63]Yixuan Zhou, Changhe Song, Jingbei Li, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis. INTERSPEECH 2022: 5518-5522
- [c62]Qicong Xie, Shan Yang, Yi Lei, Lei Xie, Dan Su:
End-to-End Voice Conversion with Information Perturbation. ISCSLP 2022: 91-95
- [c61]Zhao You, Shulin Feng, Dan Su, Dong Yu:
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition. ISCSLP 2022: 170-174
- [c60]Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang, Dan Su:
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation. ISCSLP 2022: 319-323
- [i54]Songxiang Liu, Dan Su, Dong Yu:
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs. CoRR abs/2201.11972 (2022)
- [i53]Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng:
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge. CoRR abs/2202.01986 (2022)
- [i52]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion. CoRR abs/2202.09081 (2022)
- [i51]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis. CoRR abs/2203.13508 (2022)
- [i50]Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. CoRR abs/2204.00990 (2022)
- [i49]Zhao You, Shulin Feng, Dan Su, Dong Yu:
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition. CoRR abs/2204.03178 (2022)
- [i48]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. CoRR abs/2204.09934 (2022)
- [i47]Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang, Dan Su:
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation. CoRR abs/2206.00208 (2022)
- [i46]Qicong Xie, Shan Yang, Yi Lei, Lei Xie, Dan Su:
End-to-End Voice Conversion with Information Perturbation. CoRR abs/2206.07569 (2022)
- [i45]Liumeng Xue, Shan Yang, Na Hu, Dan Su, Lei Xie:
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers. CoRR abs/2207.00756 (2022)
- [i44]Yi Lei, Shan Yang, Jian Cong, Lei Xie, Dan Su:
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion. CoRR abs/2207.01832 (2022)
- [i43]Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li:
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings. CoRR abs/2207.05929 (2022)
- [i42]Xiaoyi Qin, Na Li, Yuke Lin, Yiwei Ding, Chao Weng, Dan Su, Ming Li:
The DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2210.05092 (2022)
- [i41]Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su:
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis. CoRR abs/2212.01546 (2022)
- 2021
- [c59]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect. AAAI 2021: 13961-13969
- [c58]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. APSIPA ASC 2021: 1433-1437
- [c57]Liqiang He, Shulin Feng, Dan Su, Dong Yu:
Latency-Controlled Neural Architecture Search for Streaming Speech Recognition. ASRU 2021: 62-67
- [c56]Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng:
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion. ASRU 2021: 741-748
- [c55]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Contrastive Separative Coding for Self-Supervised Representation Learning. ICASSP 2021: 3865-3869
- [c54]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation. ICASSP 2021: 5759-5763
- [c53]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen M. Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. ICASSP 2021: 5894-5898
- [c52]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2Net Architecture. ICASSP 2021: 6354-6358
- [c51]Naijun Zheng, Na Li, Bo Wu, Meng Yu, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng:
A Joint Training Framework of Multi-Look Separator and Speaker Embedding Extractor for Overlapped Speech. ICASSP 2021: 6698-6702
- [c50]Liqiang He, Dan Su, Dong Yu:
Learned Transferable Architectures Can Surpass Hand-Designed Architectures for Large Scale Speech Recognition. ICASSP 2021: 6788-6792
- [c49]Songxiang Liu, Yuewen Cao, Na Hu, Dan Su, Helen Meng:
Fastsvc: Fast Cross-Domain Singing Voice Conversion With Feature-Wise Linear Modulation. ICME 2021: 1-6
- [c48]Yi Chen, Shan Yang, Na Hu, Lei Xie, Dan Su:
TeNC: Low Bit-Rate Speech Coding with VQ-VAE and GAN. ICMI Companion 2021: 126-130
- [c47]Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu:
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. Interspeech 2021: 316-320
- [c46]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. Interspeech 2021: 1109-1113
- [c45]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts. Interspeech 2021: 2077-2081
- [c44]Jian Cong, Shan Yang, Lei Xie, Dan Su:
Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis. Interspeech 2021: 2182-2186
- [c43]Guoguo Chen, Shuzhou Chai, Guan-Bo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Zhao You, Zhiyong Yan:
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio. Interspeech 2021: 3670-3674
- [c42]Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su:
Controllable Context-Aware Conversational Speech Synthesis. Interspeech 2021: 4658-4662
- [c41]Yuewen Cao, Songxiang Liu, Shiyin Kang, Na Hu, Peng Liu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Exploring Cross-lingual Singing Voice Synthesis Using Speech Data. ISCSLP 2021: 1-5
- [c40]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks. SLT 2021: 801-808
- [i40]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks. CoRR abs/2101.05014 (2021)
- [i39]Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su:
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention. CoRR abs/2102.06431 (2021)
- [i38]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Contrastive Separative Coding for Self-supervised Representation Learning. CoRR abs/2103.00816 (2021)
- [i37]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation. CoRR abs/2103.00819 (2021)
- [i36]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect. CoRR abs/2103.01461 (2021)
- [i35]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. CoRR abs/2103.16849 (2021)
- [i34]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts. CoRR abs/2105.03036 (2021)
- [i33]Liqiang He, Shulin Feng, Dan Su, Dong Yu:
Latency-Controlled Neural Architecture Search for Streaming Speech Recognition. CoRR abs/2105.03643 (2021)
- [i32]Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng:
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion. CoRR abs/2105.13871 (2021)
- [i31]Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu:
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. CoRR abs/2106.04275 (2021)
- [i30]Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis. CoRR abs/2106.06233 (2021)
- [i29]Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan:
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio. CoRR abs/2106.06909 (2021)
- [i28]Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su:
Controllable Context-aware Conversational Speech Synthesis. CoRR abs/2106.10828 (2021)
- [i27]Jian Cong, Shan Yang, Lei Xie, Dan Su:
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis. CoRR abs/2106.10831 (2021)
- [i26]Max W. Y. Lam, Jun Wang, Rongjie Huang, Dan Su, Dong Yu:
Bilateral Denoising Diffusion Models. CoRR abs/2108.11514 (2021)
- [i25]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis. CoRR abs/2109.03439 (2021)
- [i24]Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li:
Simple Attention Module based Speaker Verification with Iterative noisy label detection. CoRR abs/2110.06534 (2021)
- [i23]Songxiang Liu, Dan Su, Dong Yu:
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning. CoRR abs/2111.07218 (2021)
- [i22]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE2: Mixture-of-Experts Model with Improved Routing. CoRR abs/2111.11831 (2021)
- [i21]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI. CoRR abs/2112.02498 (2021)
- 2020
- [j3]Shan Yang, Heng Lu, Shiyin Kang, Liumeng Xue, Jinba Xiao, Dan Su, Lei Xie, Dong Yu:
On the localness modeling for the self-attention based end-to-end speech synthesis. Neural Networks 125: 121-130 (2020)
- [j2]Weiwei Lin, Man-Wai Mak, Na Li, Dan Su, Dong Yu:
A Framework for Adapting DNN Speaker Embedding Across Languages. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2810-2822 (2020)
- [c39]Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
End-To-End Accent Conversion Without Using Native Utterances. ICASSP 2020: 6289-6293
- [c38]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Mixup-breakdown: A Consistency Training Method for Improving Generalization of Speech Separation Models. ICASSP 2020: 6374-6378
- [c37]Weiwei Lin, Man-Wai Mak, Na Li, Dan Su, Dong Yu:
Multi-Level Deep Neural Network Adaptation for Speaker Verification Using MMD and Consistency Regularization. ICASSP 2020: 6839-6843
- [c36]Xuan Ji, Meng Yu, Chunlei Zhang, Dan Su, Tao Yu, Xiaoyu Liu, Dong Yu:
Speaker-Aware Target Speaker Enhancement by Jointly Learning with Speaker Embedding Extraction. ICASSP 2020: 7294-7298
- [c35]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning. ICASSP 2020: 7319-7323
- [c34]Xuan Ji, Meng Yu, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu:
Integration of Multi-Look Beamformers for Multi-Channel Keyword Spotting. ICASSP 2020: 7464-7468
- [c33]Yuewen Cao, Songxiang Liu, Xixin Wu, Shiyin Kang, Peng Liu, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora. ICASSP 2020: 7619-7623
- [c32]Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu:
Dfsmn-San with Persistent Memory Model for Automatic Speech Recognition. ICASSP 2020: 7704-7708
- [c31]Yiheng Huang, Jinchuan Tian, Lei Han, Guangsen Wang, Xingcheng Song, Dan Su, Dong Yu:
A Random Gossip BMUF Process for Neural Language Modeling. ICASSP 2020: 7959-7963
- [c30]Meng Yu, Xuan Ji, Bo Wu, Dan Su, Dong Yu:
End-to-End Multi-Look Keyword Spotting. INTERSPEECH 2020: 66-70
- [c29]Xingcheng Song, Zhiyong Wu, Yiheng Huang, Dan Su, Helen Meng:
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition. INTERSPEECH 2020: 581-585
- [c28]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. INTERSPEECH 2020: 1540-1544
- [c27]Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu:
DurIAN: Duration Informed Attention Network for Speech Synthesis. INTERSPEECH 2020: 2027-2031
- [c26]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Recognition of Overlapped Speech. INTERSPEECH 2020: 3496-3500
- [c25]Xingchen Song, Guangsen Wang, Yiheng Huang, Zhiyong Wu, Dan Su, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks. INTERSPEECH 2020: 3765-3769
- [c24]Songxiang Liu, Yuewen Cao, Shiyin Kang, Na Hu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Transferring Source Style in Non-Parallel Voice Conversion. INTERSPEECH 2020: 4721-4725
- [i20]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning. CoRR abs/2003.03927 (2020)
- [i19]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-visual Multi-channel Recognition of Overlapped Speech. CoRR abs/2005.08571 (2020)
- [i18]Meng Yu, Xuan Ji, Bo Wu, Dan Su, Dong Yu:
End-to-End Multi-Look Keyword Spotting. CoRR abs/2005.10386 (2020)
- [i17]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. CoRR abs/2006.06186 (2020)
- [i16]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. CoRR abs/2006.11610 (2020)
- [i15]Liqiang He, Dan Su, Dong Yu:
Learned Transferable Architectures Can Surpass Hand-Designed Architectures for Large Scale Speech Recognition. CoRR abs/2008.11589 (2020)
- [i14]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2net Architecture. CoRR abs/2010.15006 (2020)
- [i13]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. CoRR abs/2010.15025 (2020)
- [i12]Haohan Guo, Heng Lu, Na Hu, Chunlei Zhang, Shan Yang, Lei Xie, Dan Su, Dong Yu:
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training. CoRR abs/2012.01837 (2020)
2010 – 2019
- 2019
- [c23]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Prosodic Structure Prediction using Deep Self-attention Neural Network. APSIPA 2019: 320-324
- [c22]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Automatic Prosodic Structure Labeling using DNN-BGRU-CRF Hybrid Neural Network. APSIPA 2019: 1234-1238
- [c21]Junyi Peng, Yuexian Zou, Na Li, Deyi Tuo, Dan Su, Meng Yu, Chunlei Zhang, Dong Yu:
Syllable-Dependent Discriminative Learning for Small Footprint Text-Dependent Speaker Verification. ASRU 2019: 350-357
- [c20]Bo Wu, Meng Yu, Lianwu Chen, Mingjie Jin, Dan Su, Dong Yu:
Improving Speech Enhancement with Phonetic Embedding Features. ASRU 2019: 645-651
- [c19]Lianwu Chen, Meng Yu, Dan Su, Dong Yu:
Multi-band PIT and Model Integration for Improved Multi-channel Speech Separation. ICASSP 2019: 705-709
- [c18]Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie:
Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System. ICASSP 2019: 5631-5635
- [c17]Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu:
Learning Discriminative Features in Sequence Training without Requiring Framewise Labelled Data. ICASSP 2019: 5696-5700
- [c16]Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie:
Investigating End-to-end Speech Recognition for Mandarin-english Code-switching. ICASSP 2019: 6056-6060
- [c15]Rongjin Li, Na Li, Deyi Tuo, Meng Yu, Dan Su, Dong Yu:
Boundary Discriminative Large Margin Cosine Loss for Text-independent Speaker Verification. ICASSP 2019: 6321-6325
- [c14]Zhao You, Dan Su, Dong Yu:
Teach an All-rounder with Experts in Different Domains. ICASSP 2019: 6425-6429
- [c13]Yong Xu, Chao Weng, Like Hui, Jianming Liu, Meng Yu, Dan Su, Dong Yu:
Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust Asr. ICASSP 2019: 6745-6749
- [c12]Mu Wang, Xixin Wu, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Guangzhi Li, Dan Su, Dong Yu, Helen Meng:
Quasi-fully Convolutional Neural Network with Variational Inference for Speech Synthesis. ICASSP 2019: 7060-7064
- [c11]Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. INTERSPEECH 2019: 2090-2094
- [c10]Max W. Y. Lam, Jun Wang, Xunying Liu, Helen Meng, Dan Su, Dong Yu:
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition. INTERSPEECH 2019: 2778-2782
- [c9]Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. INTERSPEECH 2019: 4290-4294
- [i11]Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
End-to-End Multi-Channel Speech Separation. CoRR abs/1905.06286 (2019)
- [i10]Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu:
Learning discriminative features in sequence training without requiring framewise labelled data. CoRR abs/1905.06907 (2019)
- [i9]Zhao You, Dan Su, Dong Yu:
Teach an all-rounder with experts in different domains. CoRR abs/1907.05698 (2019)
- [i8]Yiheng Huang, Liqiang He, Lei Han, Guangsen Wang, Dan Su:
Phrase-Level Class based Language Model for Mandarin Smart Speaker Query Recognition. CoRR abs/1909.00556 (2019)
- [i7]Peng Liu, Xixin Wu, Shiyin Kang, Guangzhi Li, Dan Su, Dong Yu:
Maximizing Mutual Information for Tacotron. CoRR abs/1909.01145 (2019)
- [i6]Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu:
DurIAN: Duration Informed Attention Network For Multimodal Synthesis. CoRR abs/1909.01700 (2019)
- [i5]Yiheng Huang, Jinchuan Tian, Lei Han, Guangsen Wang, Xingcheng Song, Dan Su, Dong Yu:
A Random Gossip BMUF Process for Neural Language Modeling. CoRR abs/1909.09010 (2019)
- [i4]Xingcheng Song, Guangsen Wang, Zhiyong Wu, Yiheng Huang, Dan Su, Dong Yu, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks. CoRR abs/1910.10387 (2019)
- [i3]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Mixup-breakdown: a consistency training method for improving generalization of speech separation models. CoRR abs/1910.13253 (2019)
- [i2]Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu:
DFSMN-SAN with Persistent Memory Model for Automatic Speech Recognition. CoRR abs/1910.13282 (2019)
- 2018
- [c8]Lianwu Chen, Meng Yu, Yanmin Qian, Dan Su, Dong Yu:
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation. INTERSPEECH 2018: 302-306
- [c7]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures. INTERSPEECH 2018: 307-311
- [c6]Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu:
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition. INTERSPEECH 2018: 761-765
- [c5]Na Li, Deyi Tuo, Dan Su, Zhifeng Li, Dong Yu:
Deep Discriminative Embeddings for Duration Robust Speaker Verification. INTERSPEECH 2018: 2262-2266
- [c4]Meng Yu, Xuan Ji, Yi Gao, Lianwu Chen, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu:
Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection. INTERSPEECH 2018: 2613-2617
- [c3]Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis. INTERSPEECH 2018: 3072-3076
- [c2]Mu Wang, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Speech Super-Resolution Using Parallel WaveNet. ISCSLP 2018: 260-264
- [c1]Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Peidong Wang, Chengzhu Yu, Dan Su, Dong Yu:
Improving Attention-Based End-to-End ASR Systems with Sequence-Based Loss Functions. SLT 2018: 353-360
- [i1]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures. CoRR abs/1807.08974 (2018)
- 2017
- [j1]Kuppusamy Kanagaraj, Kangjie Lin, Wanhua Wu, Guowei Gao, Zhihui Zhong, Dan Su, Cheng Yang:
Chiral Buckybowl Molecules. Symmetry 9(9): 174 (2017)
last updated on 2024-10-17 20:31 CEST by the dblp team
all metadata released as open data under CC0 1.0 license