Home
Greetings! I am a Senior Research Scientist at Google DeepMind working on Gemini and Veo. I (co-)lead the Veo Reinforcement Learning effort and Gemini-native Video Generation, and am a core contributor to Gemini Image Generation (Nano Banana), Post-Training, Omni, and Vision. I obtained my Ph.D. and M.S. from the School of Computer Science at Carnegie Mellon University. I graduated summa cum laude from Peking University, majoring in Computer Science and Economics. My research interests center on multi-modal foundation models, especially for video generation.
News
- See the latest updates on my LinkedIn, such as Veo 3.1, Nano Banana, Gemini 2.5, and Deep Think.
- [07/2024] VideoPoet received the Best Paper Award at ICML 2024. Watch the talk.
- [04/2024] Invited talks at HK-SH AI Forum, NYU, Caltech, HKUST, ICT CAS, Adobe, ByteDance, Baidu, etc.
- [12/2023] Introducing VideoPoet, a large language model for zero-shot video generation, enabled by the MAGVIT-v2 tokenizer.