I am a PhD student at Harvard University, advised by Prof. Todd Zickler.
I also work closely with Prof. Ko Nishino and have spent two wonderful summers in Kyoto.
Email: xinranhan [at] g [dot] harvard [dot] edu
Previously, I graduated from the University of Pennsylvania majoring in Mathematics and Computer Science.
During my undergrad, I was fortunate to work with Prof. Jianbo Shi and Prof. Dan Roth.
My current research interests include computer vision, generative models and multi-modal learning.
Specifically, I enjoy building models that combine learning-based approaches with physics modeling and inspiration
from the human visual system. I'm also interested in the intersection of vision with language and art.
Some papers are highlighted.
We introduce derivative representation alignment (dREPA) for image-to-video generation and show it improves
subject consistency and leads to better generalization across artistic styles.
We show that a novel pixel-space video diffusion model trained from scratch estimates accurate
shape and material from short videos, and also produces diverse shape and material samples for
ambiguous input images.
We present a bottom-up, patch-based diffusion model for monocular shape from shading that produces multimodal outputs,
similar to multistable perception in humans.
We present new theoretical insight into the equivalence of multi-task and single-task learning
for stationary kernels and develop MPHD for model pre-training on heterogeneous domains.
We present a neural model for inferring a curvature field from shading images that is invariant to lighting and texture variations,
drawing on perceptual insights and mathematical derivations.