Zhihao Du
2020 – today
- 2024
- [j3]Kexin He, Yao Sun, Shuang Xiao, Xiuli Zhang, Zhihao Du, Yanping Zhang:
Effects of High-Load Bench Press Training with Different Blood Flow Restriction Pressurization Strategies on the Degree of Muscle Activation in the Upper Limbs of Bodybuilders. Sensors 24(2): 605 (2024)
- [j2]Shuang Cui, Zhihao Du, Nannan Wang, Xiuli Zhang, Zongquan Li, Yanping Zhang, Liang Wang:
Assessing the Post-Activation Performance Enhancement of Upper Limbs in Basketball Athletes: A Sensor-Based Study of Rapid Stretch Compound and Blood Flow Restriction Training. Sensors 24(14): 4439 (2024)
- [c22]Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng:
FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec. ICASSP 2024: 591-595
- [i21]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024)
- [i20]Keyu An, Qian Chen, Chong Deng, Zhihao Du, Changfeng Gao, Zhifu Gao, Yue Gu, Ting He, Hangrui Hu, Kai Hu, Shengpeng Ji, Yabin Li, Zerui Li, Heng Lu, Haoneng Luo, Xiang Lv, Bin Ma, Ziyang Ma, Chongjia Ni, Changhe Song, Jiaqi Shi, Xian Shi, Hao Wang, Wen Wang, Yuxuan Wang, Zhangyu Xiao, Zhijie Yan, Yexin Yang, Bin Zhang, Qinglin Zhang, Shiliang Zhang, Nan Zhao, Siqi Zheng:
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs. CoRR abs/2407.04051 (2024)
- [i19]Zhihao Du, Qian Chen, Shiliang Zhang, Kai Hu, Heng Lu, Yexin Yang, Hangrui Hu, Siqi Zheng, Yue Gu, Ziyang Ma, Zhifu Gao, Zhijie Yan:
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens. CoRR abs/2407.05407 (2024)
- [i18]Xin Zhang, Xiang Lyu, Zhihao Du, Qian Chen, Dong Zhang, Hangrui Hu, Chaohong Tan, Tianyu Zhao, Yuxuan Wang, Bin Zhang, Heng Lu, Yaqian Zhou, Xipeng Qiu:
IntrinsicVoice: Empowering LLMs with Intrinsic Real-time Voice Interaction Abilities. CoRR abs/2410.08035 (2024)
- 2023
- [c21]Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai:
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings. APSIPA ASC 2023: 1943-1948
- [c20]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
Sa-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR. ASRU 2023: 1-7
- [c19]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. ASRU 2023: 1-8
- [c18]Jiaming Wang, Zhihao Du, Shiliang Zhang:
TOLD: a Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. ICASSP 2023: 1-5
- [c17]Zhihao Du, Yike Li, Chao Chen, Zheng Wang:
AttenTPU: Tensor Processor for Attention Mechanism with Fine-Grained Padding. ICTA 2023: 101-102
- [c16]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. INTERSPEECH 2023: 411-415
- [c15]Yue Gu, Zhihao Du, Shiliang Zhang, Qian Chen, Jiqing Han:
Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition. INTERSPEECH 2023: 1249-1253
- [c14]Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Shiliang Zhang:
FunASR: A Fundamental End-to-End Speech Recognition Toolkit. INTERSPEECH 2023: 1593-1597
- [i17]Jiaming Wang, Zhihao Du, Shiliang Zhang:
TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. CoRR abs/2303.05397 (2023)
- [i16]Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang:
FunASR: A Fundamental End-to-End Speech Recognition Toolkit. CoRR abs/2305.11013 (2023)
- [i15]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. CoRR abs/2305.12459 (2023)
- [i14]Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng:
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec. CoRR abs/2309.07405 (2023)
- [i13]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The second multi-channel multi-party meeting transcription challenge (M2MeT) 2.0): A benchmark for speaker-attributed ASR. CoRR abs/2309.13573 (2023)
- [i12]Jiaming Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang:
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT. CoRR abs/2310.04673 (2023)
- [i11]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR. CoRR abs/2310.04863 (2023)
- 2022
- [c13]Zhihao Du, Shiliang Zhang, Siqi Zheng, Zhi-Jie Yan:
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis. EMNLP 2022: 7458-7469
- [c12]Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
M2Met: The Icassp 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 6167-6171
- [c11]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. ICASSP 2022: 9156-9160
- [c10]Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Lei Xie:
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings. INTERSPEECH 2022: 560-564
- [c9]Yuxiao Lin, Zhihao Du, Shiliang Zhang, Fan Yu, Zhou Zhao, Fei Wu:
Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR. ISCSLP 2022: 150-154
- [c8]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie:
MFCCA:Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario. SLT 2022: 144-151
- [i10]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. CoRR abs/2202.03647 (2022)
- [i9]Zhihao Du, Shiliang Zhang, Siqi Zheng, Zhijie Yan:
Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios. CoRR abs/2203.09767 (2022)
- [i8]Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Lei Xie:
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings. CoRR abs/2203.16834 (2022)
- [i7]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie:
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario. CoRR abs/2210.05265 (2022)
- [i6]Zhihao Du, Shiliang Zhang, Siqi Zheng, Zhijie Yan:
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis. CoRR abs/2211.10243 (2022)
- 2021
- [c7]Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du:
Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers. ICASSP 2021: 101-105
- [i5]Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. CoRR abs/2110.07393 (2021)
- [i4]Zhihao Du, Shiliang Zhang, Siqi Zheng, Weilong Huang, Ming Lei:
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information. CoRR abs/2111.13694 (2021)
- 2020
- [j1]Zhihao Du, Xueliang Zhang, Jiqing Han:
A Joint Framework of Denoising Autoencoder and Generative Vocoder for Monaural Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1493-1505 (2020)
- [c6]Zhihao Du, Ming Lei, Jiqing Han, Shiliang Zhang:
Pan: Phoneme-Aware Network for Monaural Speech Enhancement. ICASSP 2020: 6634-6638
- [c5]Yue Gu, Zhihao Du, Hui Zhang, Xueliang Zhang:
An Efficient Joint Training Framework for Robust Small-Footprint Keyword Spotting. ICONIP (1) 2020: 12-23
- [c4]Zhihao Du, Jiqing Han, Xueliang Zhang:
Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition. INTERSPEECH 2020: 309-313
- [c3]Zhihao Du, Ming Lei, Jiqing Han, Shiliang Zhang:
Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement. INTERSPEECH 2020: 3271-3275
2010 – 2019
- 2019
- [c2]Zhihao Du, Xueliang Zhang, Jiqing Han:
Investigation of Monaural Front-End Processing for Robust Speech Recognition Without Retraining or Joint-Training. APSIPA 2019: 249-254
- [c1]Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du:
Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events. INTERSPEECH 2019: 3860-3864
- [i3]Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du:
Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events. CoRR abs/1904.05204 (2019)
- [i2]Yue Gu, Zhihao Du, Hui Zhang, Xueliang Zhang:
A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting. CoRR abs/1906.08415 (2019)
- 2018
- [i1]Zhihao Du, Xueliang Zhang, Jiqing Han:
Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training. CoRR abs/1810.09067 (2018)
last updated on 2024-11-19 20:50 CET by the dblp team
all metadata released as open data under CC0 1.0 license