Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts

Published in ICASSP, 2025

The paper introduces UMETTS, a multimodal emotional text-to-speech (E-TTS) framework that leverages emotional cues from text, audio, and visual inputs. The proposed system incorporates an Emotion Prompt Alignment Module (EP-Align) and an Emotion Embedding-Induced TTS Module (EMI-TTS) to generate expressive and emotionally resonant speech.

Recommended citation: Xiang Li, Zhi-Qi Cheng, Jun-Yan He, Junyao Chen, Xiaomao Fan, Xiaojiang Peng, Alexander G. Hauptmann. (2025). "UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts." 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Download Paper

Let Your Features Tell The Differences: Understanding Graph Convolution By Feature Splitting

Published in ICLR, 2025

This paper proposes a new metric to identify GNN-favored and GNN-disfavored features and uses topological feature selection to fuse these features into GNNs, significantly improving GNN performance without hyper-parameter tuning.

Recommended citation: Yilun Zheng, Xiang Li, Sitao Luan, Xiaojiang Peng, and Lihui Chen. (2025). "Let your features tell the differences: Understanding graph convolution by feature splitting." The Thirteenth International Conference on Learning Representations.
Download Paper

Semi-Supervised Multimodal Emotion Recognition with Expression MAE

Published in ACMMM, 2023

The Multimodal Emotion Recognition (MER 2023) challenge aims to recognize emotion with audio, language, and visual signals, facilitating innovative technologies of affective computing. This paper presents our submission approach on the Semi-Supervised Learning Sub-Challenge (MER-SEMI).

Recommended citation: Zebang Cheng, Yuxiang Lin, Zhaoru Chen, Xiang Li, Shuyi Mao, Fan Zhang, Daijun Ding, Bowen Zhang, Xiaojiang Peng. (2023). "Semi-supervised multimodal emotion recognition with expression mae." Proceedings of the 31st ACM International Conference on Multimedia.
Download Paper

Signatured Fingermark Recognition Based on Deep Residual Network

Published in CCBR, 2021

Traditional minutiae-based fingerprint recognition methods have shown great success on high-quality fingerprint images. However, their accuracy drops significantly for signatured fingermarks on contracts. This paper proposes a signatured fingermark recognition method based on deep learning.

Recommended citation: Yongliang Zhang, Qiuyi Zhang, Jiali Zou, Weize Zhang, Xiang Li, Mengting Chen, Yufan Lv. (2021). "Signatured Fingermark Recognition Based on Deep Residual Network." Biometric Recognition: 15th Chinese Conference (CCBR 2021).
Download Paper