Hi!
I am Yujie Wei (卫昱杰), a third-year Ph.D. student at Fudan University, advised by Prof. Hongming Shan. I received my Bachelor's degree in Software Engineering from Sichuan University, advised by Prof. Yi Zhang.
My research interests include 2D/3D Generative Models (especially Video Generation) and Self-Supervised Learning.
News
- 2024.09: EvolveDirector is accepted by NeurIPS 2024. Congrats to Rui.
- 2024.02: DreamVideo, InstructVideo, and HiGen are accepted by CVPR 2024. Honored to have collaborated on these promising projects.
- 2023.08: Emo-DNA is accepted by ACM MM 2023. Congrats to Jiaxin.
- 2023.07: OnPro is accepted by ICCV 2023.
- 2023.02: Temporal Modeling Matters is accepted by ICASSP 2023. Congrats to Jiaxin.
Publications
Video Generation
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan
- DreamVideo-2 is the first zero-shot (tuning-free) framework that generates customized videos with specified subjects and motion trajectories.
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan
- DreamVideo is the first method that generates customized videos from a few static images of the desired subject and a few videos of target motion.
InstructVideo: Instructing Video Diffusion Models with Human Feedback
Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni
- InstructVideo is the first research attempt that instructs video diffusion models with human feedback.
Hierarchical Spatio-Temporal Decoupling for Text-to-Video Generation
Zhiwu Qing, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yujie Wei, Yingya Zhang, Changxin Gao, Nong Sang
- HiGen is a method that improves T2V performance by decoupling the spatial and temporal factors at both the structure level and the content level.
Image Generation
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, Yuchao Gu, Lingmin Ran, Xiang Wang, Zhangjie Wu, Junhao Zhang, Yingya Zhang, Mike Zheng Shou
- EvolveDirector explores the feasibility of training a text-to-image generation model comparable to advanced models using publicly available resources.
Continual Learning
Online Prototype Learning for Online Continual Learning
Yujie Wei, Jiaxin Ye, Zhizhong Huang, Junping Zhang, Hongming Shan
- OnPro is the first work to identify shortcut learning as the key limiting factor for online continual learning, offering new insights into why online learning models fail to generalize well.
Speech Emotion Recognition
ACM MM 2023
Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Recognition, Jiaxin Ye, Yujie Wei, Xin-Cheng Wen, Chenglong Ma, Zhizhong Huang, Kunhong Liu, Hongming Shan.
ICASSP 2023
Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition, Jiaxin Ye, Xin-Cheng Wen, Yujie Wei, Yong Xu, Kunhong Liu, Hongming Shan.
Honors and Awards
- 2022.09 Fudan University Zhicheng Freshman Second Prize Scholarship (Top 5%)
- 2022.05 Outstanding Graduates of Sichuan Province and Sichuan University
- 2021.10 National Scholarship (Top 1%)
- 2020.10 The First Prize Scholarship (Top 3%)
- 2020.05 Sichuan University Top 100 Student Leaders
- 2019.10 National Scholarship (Top 1%)
Academic Service
- Conference Reviewer: ICLR 2025, CVPR 2025.
Invited Talks
- 2024.11, Customized Image & Video Generation, 3D视觉工坊 & 3DCV & 计算机视觉工坊 | [Link] | [Video]
Education
- 2022.09 - 2027.06 (now), Ph.D., Fudan University, Shanghai, China.
- 2018.09 - 2022.06, Bachelor of Software Engineering, Sichuan University, Chengdu, China.