Zhiyong WU (吴志勇)
Associate Professor, Tsinghua University
Verified email at sz.tsinghua.edu.cn
Title
Cited by
Year
A review of deep learning based speech synthesis
Y Ning, S He, Z Wu, C Xing, L Zhang
Applied Sciences 9 (19), 4050, 2019
Cited by: 166, Year: 2019
Speech emotion recognition using capsule networks
X Wu, S Liu, Y Cao, X Li, J Yu, D Dai, X Ma, S Hu, Z Wu, X Liu, H Meng
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 118, Year: 2019
Emotion recognition from variable-length speech segments using deep learning on spectrograms
X Ma, Z Wu, J Jia, M Xu, H Meng, L Cai
INTERSPEECH 2018, 3683-3687, 2018
Cited by: 98, Year: 2018
Multi-level fusion of audio and visual features for speaker identification
Z Wu, L Cai, H Meng
Advances in Biometrics, 493-499, 2006
Cited by: 92, Year: 2006
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification
Y Zhang, Z Lv, H Wu, S Zhang, P Hu, Z Wu, H Lee, H Meng
Proc. Interspeech 2022, 306-310, 2022
Cited by: 91, Year: 2022
A deep recurrent approach for acoustic-to-articulatory inversion
P Liu, Q Yu, Z Wu, S Kang, H Meng, L Cai
2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015
Cited by: 86, Year: 2015
Question detection from acoustic features using recurrent neural network with gated recurrent unit
Y Tang, Y Huang, Z Wu, H Meng, M Xu, L Cai
2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016
Cited by: 81, Year: 2016
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition
R Li, Z Wu, J Jia, S Zhao, H Meng
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 78, Year: 2019
FullSubNet+: Channel Attention Fullsubnet with Complex Spectrograms for Speech Enhancement
J Chen, Z Wang, D Tuo, Z Wu, S Kang, H Meng
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 72, Year: 2022
Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks
K Li, S Mao, X Li, Z Wu, H Meng
Speech Communication 96, 28-36, 2018
Cited by: 63, Year: 2018
Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar
Z Wu, S Zhang, L Cai, HM Meng
INTERSPEECH, 1802-1805, 2006
Cited by: 61, Year: 2006
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition
X Cai, D Dai, Z Wu, X Li, J Li, H Meng
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 59, Year: 2021
Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition
D Dai, Z Wu, R Li, X Wu, J Jia, H Meng
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 58, Year: 2019
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks
X Song, G Wang, Y Huang, Z Wu, D Su, H Meng
Proc. Interspeech 2020, 3765-3769, 2020
Cited by: 55, Year: 2020
Modelling high-dimensional sequences with LSTM-RTRBM: application to polyphonic music generation
Q Lyu, Z Wu, J Zhu, H Meng
Proceedings of the 24th International Conference on Artificial Intelligence …, 2015
Cited by: 51, Year: 2015
Towards Multi-Scale Style Control for Expressive Speech Synthesis
X Li, C Song, J Li, Z Wu, J Jia, H Meng
Proc. Interspeech 2021, 4673-4677, 2021
Cited by: 46, Year: 2021
Towards Discriminative Representation Learning for Speech Emotion Recognition
R Li, Z Wu, J Jia, Y Bu, S Zhao, H Meng
Proceedings of the 28th International Joint Conference on Artificial …, 2019
Cited by: 46, Year: 2019
One-Shot Voice Conversion with Global Speaker Embeddings
H Lu, Z Wu, D Dai, R Li, S Kang, J Jia, H Meng
Proc. Interspeech 2019, 669-673, 2019
Cited by: 44, Year: 2019
End-to-end Code-switched TTS with Mix of Monolingual Recordings
Y Cao, X Wu, S Liu, J Yu, X Li, Z Wu, X Liu, H Meng
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 43, Year: 2019
Head and facial gestures synthesis using PAD model for an expressive talking avatar
J Jia, Z Wu, S Zhang, HM Meng, L Cai
Multimedia Tools and Applications 73 (1), 439-461, 2014
Cited by: 43, Year: 2014