Short Bio

I am a first-year Master’s student at the School of Electronic and Computer Engineering, Peking University, advised by Prof. Li Yuan and Prof. Yonghong Tian. I am also fortunate to work closely with Prof. Daquan Zhou. I received my B.E. degree from the University of Chinese Academy of Sciences (UCAS) in 2025, where I was named an Outstanding Graduate.

My research interests primarily lie in video generation for real-world applications.

News

2025.05: 🎉🎉 I was selected as an Outstanding Graduate of Beijing, and my undergraduate thesis was named an Outstanding Graduation Paper of UCAS.
2024.11: 🎉🎉 I won the China National Scholarship and the First-Level Scholarship of UCAS!
2024.07: 🎉🎉 One paper was accepted to ECCV 2024.

Publications

Omni-Grid: Taming Image-based Unified Multimodal Models to Video

Juncheng Ma, Yuelin Li, Yufan Deng, Zhenyu Tang, Dongdong Yu, Li Yuan, Changhu Wang, Daquan Zhou†, Yonghong Tian†

Under Review

We present a cost-efficient approach to adapt image-based UMMs to the video domain while preserving their original strengths. Our model surpasses previous video-based UMMs on VBench, while also exhibiting zero-shot multimodal interleaved abilities in video, including CoT reasoning and text rendering.

FastAvatar: Accelerating Portrait Animation via Memory-Adaptive Caching

Juncheng Ma, Yuxuan Du, Yanan Sun, Zhening Xing, Changlin Li, Zhenyu Tang, Bo Li, Peng-Tao Jiang, Li Yuan, Daquan Zhou†, Yonghong Tian†

Under Review

We propose SyncCache, a training-free, audio-aware, and memory-adaptive caching strategy tailored for audio-driven portrait animation. It delivers a 4.12× speedup on Hunyuan-Avatar and 3.75× on Wan-S2V with negligible degradation in visual quality or audio alignment.

Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation

Juncheng Ma, Peiwen Sun, Yaoting Wang, Di Hu

We rethink audio-visual semantic segmentation from a new perspective and propose a progressive two-stage training strategy to strengthen audio-visual alignment.

Experiences

Research Intern, Shanghai AI Laboratory, Shanghai, China, 2024.07-2024.11

Research on video generation, especially audio-driven portrait animation.

Supervised by Dr. Yanan Sun and Dr. Yanhong Zeng.

Visiting Student, Renmin University of China, Beijing, China, 2023.10-2024.03

Research on multimodal learning, especially audio-visual segmentation, which aims to segment the sound sources in a video according to its corresponding audio.

One paper, Stepping Stones, was accepted to ECCV 2024.

Supervised by Prof. Di Hu.

Honors and Awards

2025, Outstanding Graduation Paper of UCAS.
2025, Outstanding Graduate of Beijing.
2024, China National Scholarship.
2024, First-Level Scholarship of UCAS.

Education

2025.09 - present, Master, Peking University.
2021.09 - 2025.06, Undergraduate, University of Chinese Academy of Sciences.