Home

Greetings! I am a research scientist at Google DeepMind. I obtained my Ph.D. and M.S. from the Carnegie Mellon University School of Computer Science. I graduated summa cum laude from Peking University, majoring in Computer Science and Economics. My research interests center on multi-modal foundation models, especially for video generation.

News

  • [04/2024] Invited talks at HK-SH AI Forum, NYU, Caltech, HKUST, ICT CAS, Adobe, ByteDance, Baidu, etc.
  • [12/2023] Introducing VideoPoet, a large language model for zero-shot video generation, enabled by the MAGVIT-v2 tokenizer.
  • [12/2023] Introducing W.A.L.T, a latent video diffusion transformer, enabled by MAGVIT-v2.

Selected Publications

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dan Kondratyuk*, Lijun Yu*, Xiuye Gu*, José Lezama*, Jonathan Huang*, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold*, Lu Jiang* (*Equal contribution). In ICML, 2024

Selected Talks

Selected Repos