UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts
Xiang Li, Zhi-Qi Cheng, Jun-Yan He, Junyao Chen, Xiaomao Fan, Xiaojiang Peng, Alexander G. Hauptmann (2025). "UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts." 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).