I’m still exploring what the best generative formulation and 3D representation might be, and how they can be applied to virtual reality, robotics, and other practical scenarios.
We reformulate novel-view synthesis as a structured inpainting task.
CogNVS is a video diffusion model for dynamic novel-view synthesis, trained in a self-supervised manner using only 2D videos!
Given a modal (visible) object sequence in a video,
our two-stage method generates its amodal (visible + invisible) masks and RGB content via video diffusion.