Yong Xu 0004
Person information
- affiliation: Tencent America LLC, Seattle, USA
- affiliation (former): University of Surrey, Centre for Vision, Speech and Signal Processing, Guildford, UK
- affiliation (PhD 2015): University of Science and Technology of China, Hefei, China
Other persons with the same name
- Yong Xu — disambiguation page
- Yong Xu 0001 — Harbin Institute of Technology, Shenzhen Graduate School, Bio-Computing Research Centre, China (and 1 more)
- Yong Xu 0002 (aka: Eric Yong Xu) — The Chinese University of Hong Kong, Institute of Future Cities, Hong Kong
- Yong Xu 0003 — Guangdong University of Technology, School of Automation, Guangzhou, China (and 3 more)
- Yong Xu 0005 — Zhejiang University, Institute of Cyber-Systems and Control, National Laboratory of Industrial Control Technology, Yuquan Campus, Hangzhou, China
- Yong Xu 0006 — University of Paris-Saclay, France
- Yong Xu 0007 — South China University of Technology, School of Computer Science and Engineering, Guangzhou, China (and 1 more)
- Yong Xu 0008 — Fudan University, Shanghai, China
- Yong Xu 0009 — Xiamen University, School of Film, Department of Digital Media, Xiamen, China (and 6 more)
- Yong Xu 0010 — Microsoft Research, Beijing, China
- Yong Xu 0011 — Northwestern Polytechnical University, Department of Mathematics and Statistics, Xi'an, China
- Yong Xu 0012 — Tokyo University of Agriculture and Technology, Division of Advanced Information Technology and Computer Science, Institute of Engineering, Japan (and 1 more)
- Yong Xu 0013 — Beihang University, Institute of Unmanned Systems, Key Laboratory of Advanced Technology of Intelligent Unmanned Flight Systems, School of Electronic and Information Engineering, Beijing, China
- Yong Xu 0014 — State Grid Hunan Integrated Energy Service Company Ltd., Technology Research and Development Center, Changsha, China (and 1 more)
- Yong Xu 0015 — University of Cincinnati, Wireless and Mobile Networking Laboratory, OH, USA
2020 – today
- 2024
- [j14] Yang Liu, Yong Xu, Peipei Wu, Wenwu Wang: Labelled Non-Zero Diffusion Particle Flow SMC-PHD Filtering for Multi-Speaker Tracking. IEEE Trans. Multim. 26: 2544-2559 (2024)
- [c59] Jinzheng Zhao, Xinyuan Qian, Yong Xu, Haohe Liu, Yin Cao, Davide Berghi, Wenwu Wang: Text-Queried Target Sound Event Localization. EUSIPCO 2024: 261-265
- [c58] Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu: SPATIALCODEC: Neural Spatial Speech Coding. ICASSP 2024: 1131-1135
- [c57] Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu: uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models. ICASSP 2024: 7125-7129
- [i49] Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu: Advancing Multi-talker ASR Performance with Large Language Models. CoRR abs/2408.17431 (2024)
- [i48] Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey: LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization. CoRR abs/2409.00819 (2024)
- [i47] Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu: EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer. CoRR abs/2409.10819 (2024)
- 2023
- [c56] Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu: Neuralecho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement. ASRU 2023: 1-8
- [c55] Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu: Deep Neural Mel-Subband Beamformer for in-Car Speech Separation. ICASSP 2023: 1-5
- [c54] Yong Xu, Vinay Kothapally, Meng Yu, Shixiong Zhang, Dong Yu: Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation. INTERSPEECH 2023: 5117-5121
- [i46] Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu: SpatialCodec: Neural Spatial Speech Coding. CoRR abs/2309.07432 (2023)
- [i45] Jinzheng Zhao, Yong Xu, Xinyuan Qian, Wenwu Wang: Audio Visual Speaker Localization from EgoCentric Views. CoRR abs/2309.16308 (2023)
- [i44] Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu: uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models. CoRR abs/2310.00900 (2023)
- [i43] Jinzheng Zhao, Yong Xu, Xinyuan Qian, Davide Berghi, Peipei Wu, Meng Cui, Jianyuan Sun, Philip J. B. Jackson, Wenwu Wang: Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions. CoRR abs/2310.14778 (2023)
- 2022
- [c53] Jinzheng Zhao, Peipei Wu, Shidrokh Goudarzi, Xubo Liu, Jianyuan Sun, Yong Xu, Wenwu Wang: Visually Assisted Self-supervised Audio Speaker Localization and Tracking. EUSIPCO 2022: 787-791
- [c52] Jinzheng Zhao, Peipei Wu, Xubo Liu, Yong Xu, Lyudmila Mihaylova, Simon J. Godsill, Wenwu Wang: Audio-Visual Tracking of Multiple Speakers Via a PMBM Filter. ICASSP 2022: 5068-5072
- [c51] Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu: Joint Neural AEC and Beamforming with Double-Talk Detection. INTERSPEECH 2022: 2528-2532
- [c50] Jinzheng Zhao, Peipei Wu, Xubo Liu, Shidrokh Goudarzi, Haohe Liu, Yong Xu, Wenwu Wang: Audio Visual Multi-Speaker Tracking with Improved GCF and PMBM Filter. INTERSPEECH 2022: 3704-3708
- [c49] Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu: EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers. SLT 2022: 480-487
- [i42] Yushi Ueda, Soumi Maiti, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu: EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers. CoRR abs/2203.17068 (2022)
- [i41] Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu: NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement. CoRR abs/2205.10401 (2022)
- [i40] Qiuqiang Kong, Shilei Liu, Junjie Shi, Xuzhou Ye, Yin Cao, Qiaoxi Zhu, Yong Xu, Yuxuan Wang: Neural Sound Field Decomposition with Super-resolution of Sound Direction. CoRR abs/2210.12345 (2022)
- [i39] Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu: Deep Neural Mel-Subband Beamformer for In-car Speech Separation. CoRR abs/2211.12590 (2022)
- 2021
- [j13] Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen: An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1368-1396 (2021)
- [j12] Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu: Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3526-3540 (2021)
- [c48] Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu: ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation. ICASSP 2021: 6089-6093
- [c47] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu: Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. ICASSP 2021: 8433-8437
- [c46] Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu: TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. Interspeech 2021: 1109-1113
- [c45] Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu: MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation. Interspeech 2021: 1119-1123
- [c44] Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu: MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. Interspeech 2021: 2142-2146
- [c43] Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Dong Yu: Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation. Interspeech 2021: 3076-3080
- [c42] Jianming Liu, Meng Yu, Yong Xu, Chao Weng, Shi-Xiong Zhang, Lianwu Chen, Dong Yu: Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising. SLT 2021: 766-770
- [c41] Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel: WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. SLT 2021: 817-824
- [i38] Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu: Generalized RNN beamformer for target speech separation. CoRR abs/2101.01280 (2021)
- [i37] Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu: TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. CoRR abs/2103.16849 (2021)
- [i36] Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu: MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. CoRR abs/2104.01227 (2021)
- [i35] Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu: MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation. CoRR abs/2104.08450 (2021)
- [i34] Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu: Joint AEC AND Beamforming with Double-Talk Detection using RNN-Transformer. CoRR abs/2111.04904 (2021)
- 2020
- [j11] Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu: Multi-Modal Multi-Channel Target Speech Separation. IEEE J. Sel. Top. Signal Process. 14(3): 530-541 (2020)
- [j10] Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu: Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network. IEEE J. Sel. Top. Signal Process. 14(3): 542-553 (2020)
- [j9] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: Sound Event Detection of Weakly Labelled Data With CNN-Transformer and Automatic Threshold Optimization. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2450-2460 (2020)
- [c40] Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang: Self-Supervised Learning for Audio-Visual Speaker Diarization. ICASSP 2020: 4367-4371
- [c39] Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, Dong Yu: Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives. ICASSP 2020: 7299-7303
- [c38] Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu: Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning. ICASSP 2020: 7319-7323
- [c37] Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu: Neural Spatio-Temporal Beamformer for Target Speech Separation. INTERSPEECH 2020: 56-60
- [c36] Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng: Audio-Visual Multi-Channel Recognition of Overlapped Speech. INTERSPEECH 2020: 3496-3500
- [i33] Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang: Self-supervised learning for audio-visual speaker diarization. CoRR abs/2002.05314 (2020)
- [i32] Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu: Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning. CoRR abs/2003.03927 (2020)
- [i31] Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu: Multi-modal Multi-channel Target Speech Separation. CoRR abs/2003.07032 (2020)
- [i30] Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu: Neural Spatio-Temporal Beamformer for Target Speech Separation. CoRR abs/2005.03889 (2020)
- [i29] Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng: Audio-visual Multi-channel Recognition of Overlapped Speech. CoRR abs/2005.08571 (2020)
- [i28] Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen: An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. CoRR abs/2008.09586 (2020)
- [i27] Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu: Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. CoRR abs/2011.00091 (2020)
- [i26] Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel: WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. CoRR abs/2011.09162 (2020)
- [i25] Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu: Multi-channel Multi-frame ADL-MVDR for Target Speech Separation. CoRR abs/2012.13442 (2020)
2010 – 2019
- 2019
- [j8] Qiuqiang Kong, Yong Xu, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley: Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data. IEEE ACM Trans. Audio Speech Lang. Process. 27(4): 777-787 (2019)
- [j7] Qiuqiang Kong, Changsong Yu, Yong Xu, Turab Iqbal, Wenwu Wang, Mark D. Plumbley: Weakly Labelled AudioSet Tagging With Attention Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 27(11): 1791-1802 (2019)
- [c35] Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu: Time Domain Audio Visual Speech Separation. ASRU 2019: 667-673
- [c34] Qiuqiang Kong, Yong Xu, Turab Iqbal, Yin Cao, Wenwu Wang, Mark D. Plumbley: Acoustic Scene Generation with Conditional Samplernn. ICASSP 2019: 925-929
- [c33] Yong Xu, Chao Weng, Like Hui, Jianming Liu, Meng Yu, Dan Su, Dong Yu: Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust Asr. ICASSP 2019: 6745-6749
- [c32] Xiang Hao, Changhao Shan, Yong Xu, Sining Sun, Lei Xie: An Attention-based Neural Network Approach for Single Channel Speech Enhancement. ICASSP 2019: 6895-6899
- [c31] Qiuqiang Kong, Yong Xu, Philip J. B. Jackson, Wenwu Wang, Mark D. Plumbley: Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks. IJCAI 2019: 2747-2753
- [c30] Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu: Improved Speaker-Dependent Separation for CHiME-5 Challenge. INTERSPEECH 2019: 466-470
- [c29] Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu: Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. INTERSPEECH 2019: 4290-4294
- [c28] Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu: A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation. INTERSPEECH 2019: 4574-4578
- [i24] Qiuqiang Kong, Changsong Yu, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley: Weakly labelled AudioSet Classification with Attention Neural Networks. CoRR abs/1903.00765 (2019)
- [i23] Qiuqiang Kong, Yin Cao, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley: Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems. CoRR abs/1904.03476 (2019)
- [i22] Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu: Time Domain Audio Visual Speech Separation. CoRR abs/1904.03760 (2019)
- [i21] Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu: Improved Speaker-Dependent Separation for CHiME-5 Challenge. CoRR abs/1904.03792 (2019)
- [i20] Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu: End-to-End Multi-Channel Speech Separation. CoRR abs/1905.06286 (2019)
- [i19] Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu: A comprehensive study of speech separation: spectrogram vs waveform separation. CoRR abs/1905.07497 (2019)
- [i18] Qiuqiang Kong, Yong Xu, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley: Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks. CoRR abs/1906.07552 (2019)
- [i17] Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu: Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network. CoRR abs/1909.07352 (2019)
- [i16] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: Sound Event Detection of Weakly Labelled Data with CNN-Transformer and Automatic Threshold Optimization. CoRR abs/1912.04761 (2019)
- [i15] Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu: A Unified Framework for Speech Separation. CoRR abs/1912.07814 (2019)
- 2018
- [j6] Lei Sun, Jun Du, Zhipeng Xie, Yong Xu: Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition. J. Signal Process. Syst. 90(7): 975-983 (2018)
- [c27] Qiuqiang Kong, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley: DCASE 2018 Challenge Surrey cross-task convolutional neural network baseline. DCASE 2018: 217-221
- [c26] Turab Iqbal, Yong Xu, Qiuqiang Kong, Wenwu Wang: Capsule Routing for Sound Event Detection. EUSIPCO 2018: 2255-2259
- [c25] Tijs Duel, David M. Frohlich, Christian Kroos, Yong Xu, Philip J. B. Jackson, Mark D. Plumbley: Supporting Audiography: Design of a System for Sentimental Sound Recording, Classification and Playback. HCI (28) 2018: 24-31
- [c24] Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang: Improving Reverberant Speech Separation with Binaural Cues Using Temporal Context and Convolutional Neural Networks. LVA/ICA 2018: 361-371
- [c23] Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley: Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network. ICASSP 2018: 121-125
- [c22] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: Audio Set Classification with Attention Model: A Probabilistic Perspective. ICASSP 2018: 316-320
- [c21] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data. ICASSP 2018: 321-325
- [c20] Qingju Liu, Yong Xu, Philip J. B. Jackson, Wenwu Wang, Philip Coleman: Iterative Deep Neural Networks for Speaker-Independent Binaural Blind Speech Separation. ICASSP 2018: 541-545
- [i14] Qiuqiang Kong, Yong Xu, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley: Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data. CoRR abs/1804.04715 (2018)
- [i13] Turab Iqbal, Yong Xu, Qiuqiang Kong, Wenwu Wang: Capsule Routing for Sound Event Detection. CoRR abs/1806.04699 (2018)
- [i12] Qiuqiang Kong, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley: DCASE 2018 Challenge baseline with convolutional neural networks. CoRR abs/1808.00773 (2018)
- 2017
- [j5] Jun Du, Yong Xu: Hierarchical deep neural network for multivariate regression. Pattern Recognit. 63: 149-157 (2017)
- [j4] Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley: Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging. IEEE ACM Trans. Audio Speech Lang. Process. 25(6): 1230-1241 (2017)
- [c19] Qiuqiang Kong, Yong Xu, Mark D. Plumbley: Joint detection and classification convolutional neural network on weakly labelled bird audio detection. EUSIPCO 2017: 1749-1753
- [c18] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: A joint detection-classification model for audio tagging of weakly labelled data. ICASSP 2017: 641-645
- [c17] Qiang Huang, Yong Xu, Philip J. B. Jackson, Wenwu Wang, Mark D. Plumbley: Fast tagging of natural sounds using marginal co-regularization. ICASSP 2017: 2991-2995
- [c16] Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Convolutional gated recurrent neural network incorporating spatial features for audio tagging. IJCNN 2017: 3461-3466
- [c15] Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Attention and Localization Based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging. INTERSPEECH 2017: 3083-3087
- [c14] Alfredo Zermini, Qingju Liu, Yong Xu, Mark D. Plumbley, Dave Betts, Wenwu Wang: Binaural and log-power spectra features with deep neural networks for speech-noise separation. MMSP 2017: 1-6
- [i11] Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging. CoRR abs/1702.07787 (2017)
- [i10] Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Attention and Localization based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging. CoRR abs/1703.06052 (2017)
- [i9] Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee: Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement. CoRR abs/1703.07172 (2017)
- [i8] Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley: Surrey-cvssp system for DCASE2017 challenge task4. CoRR abs/1709.00551 (2017)
- [i7] Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley: Large-scale weakly supervised audio classification using gated convolutional neural network. CoRR abs/1710.00343 (2017)
- [i6] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: Audio Set classification with attention model: A probabilistic perspective. CoRR abs/1711.00927 (2017)
- [i5] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: A joint separation-classification model for sound event detection of weakly labelled data. CoRR abs/1711.03037 (2017)
- 2016
- [j3] Tian Gao, Jun Du, Yong Xu, Cong Liu, Li-Rong Dai, Chin-Hui Lee: Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition. EURASIP J. Adv. Signal Process. 2016: 86 (2016)
- [c13] Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley: Fully DNN-Based Multi-Label Regression for Audio Tagging. DCASE 2016: 105-109
- [c12] Yong Xu, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Hierarchical Learning for DNN-Based Acoustic Scene Classification. DCASE 2016: 110-114
- [c11] Zhipeng Xie, Jun Du, Ian McLoughlin, Yong Xu, Feng Ma, Haikun Wang: Deep neural network for robust speech recognition with auxiliary features from laser-Doppler vibrometer sensor. ISCSLP 2016: 1-5
- [i4] Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley: Fully DNN-based Multi-label regression for audio tagging. CoRR abs/1606.07695 (2016)
- [i3] Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley: Fully Deep Neural Networks Incorporating Unsupervised Feature Learning for Audio Tagging. CoRR abs/1607.03681 (2016)
- [i2] Yong Xu, Qiang Huang, Wenwu Wang, Mark D. Plumbley: Hierachical learning for DNN-based acoustic scene classification. CoRR abs/1607.03682 (2016)
- [i1] Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley: A Joint Detection-Classification Model for Audio Tagging of Weakly Labelled Data. CoRR abs/1610.01797 (2016)
- 2015
- [j2] Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee: A Regression Approach to Speech Enhancement Based on Deep Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 23(1): 7-19 (2015)
- [c10] Tian Gao, Jun Du, Yong Xu, Cong Liu, Li-Rong Dai, Chin-Hui Lee: Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments. LVA/ICA 2015: 75-82
- [c9] Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee: Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement. INTERSPEECH 2015: 1508-1512
- [c8] Kehuang Li, Zhen Huang, Yong Xu, Chin-Hui Lee: DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech. INTERSPEECH 2015: 2578-2582
- 2014
- [j1] Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee: An Experimental Study on Speech Enhancement Based on Deep Neural Networks. IEEE Signal Process. Lett. 21(1): 65-68 (2014)
- [c7] Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee: Global variance equalization for improving deep neural network based speech enhancement. ChinaSIP 2014: 71-75
- [c6] Jun Du, Qing Wang, Tian Gao, Yong Xu, Li-Rong Dai, Chin-Hui Lee: Robust speech recognition with speech enhanced deep neural networks. INTERSPEECH 2014: 616-620
- [c5] Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee: Dynamic noise aware training for speech enhancement based on deep neural networks. INTERSPEECH 2014: 2670-2674
- [c4] Yanhui Tu, Jun Du, Yong Xu, Li-Rong Dai, Chin-Hui Lee: Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers. ISCSLP 2014: 250-254
- [c3] Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee: Cross-language transfer learning for deep neural network based speech enhancement. ISCSLP 2014: 336-340
- 2012
- [c2] Yong Xu, Wu Guo, Shan Su, Li-Rong Dai: Spoken term detection for OOV terms based on triphone confusion matrix. ISCSLP 2012: 98-102
- [c1] Yong Xu, Wu Guo, Li-Rong Dai: A hybrid fragment / syllable-based system for improved OOV term detection. ISCSLP 2012: 378-382
last updated on 2024-11-20 21:58 CET by the dblp team
all metadata released as open data under CC0 1.0 license