Opencpop: A high-quality open source Chinese popular song corpus for singing voice synthesis Y Wang, X Wang, P Zhu, J Wu, H Li, H Xue, Y Zhang, L Xie, M Bi arXiv preprint arXiv:2201.07429, 2022 | 61 | 2022 |
VISinger: Variational inference with adversarial learning for end-to-end singing voice synthesis Y Zhang, J Cong, H Xue, L Xie, P Zhu, M Bi ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 53 | 2022 |
PromptStyle: Controllable style transfer for text-to-speech with natural language descriptions G Liu, Y Zhang, Y Lei, Y Chen, R Wang, Z Li, L Xie arXiv preprint arXiv:2305.19522, 2023 | 16 | 2023 |
Multi-speaker expressive speech synthesis via multiple factors decoupling X Zhu, Y Lei, K Song, Y Zhang, T Li, L Xie ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP K Song, Y Zhang, Y Lei, J Cong, H Li, L Xie, G He, J Bai ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
Learn2Sing 2.0: Diffusion and mutual information-based target speaker SVS by learning from singing teacher H Xue, X Wang, Y Zhang, L Xie, P Zhu, M Bi arXiv preprint arXiv:2203.16408, 2022 | 8 | 2022 |
VISinger 2: High-fidelity end-to-end singing voice synthesis enhanced by digital signal processing synthesizer Y Zhang, H Xue, H Li, L Xie, T Guo, R Zhang, C Gong arXiv preprint arXiv:2211.02903, 2022 | 7 | 2022 |
AccentSpeech: learning accent from crowd-sourced data for target speaker TTS with accents Y Zhang, Z Wang, P Yang, H Sun, Z Wang, L Xie 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 6 | 2022 |
PromptSpeaker: Speaker Generation Based on Text Descriptions Y Zhang, G Liu, Y Lei, Y Chen, H Yin, L Xie, Z Li 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 3 | 2023 |
Accent-VITS: accent transfer for end-to-end TTS L Ma, Y Zhang, X Zhu, Y Lei, Z Ning, P Zhu, L Xie National Conference on Man-Machine Speech Communication, 203-214, 2023 | 1 | 2023 |
The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task K Song, P Chen, Y Cao, K Wei, Y Zhang, L Xie, N Jiang, G Zhao arXiv preprint arXiv:2307.04630, 2023 | 1 | 2023 |
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS K Song, J Cong, X Wang, Y Zhang, L Xie, N Jiang, H Wu 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 1 | 2022 |
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation K Song, H Xue, X Wang, J Cong, Y Zhang, L Xie, B Yang, X Zhang, D Su 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 1 | 2022 |
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer X Zhu, Y Lei, T Li, Y Zhang, H Zhou, H Lu, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | | 2024 |