I am a PhD student at Harvard University, advised by Prof. Todd Zickler.
I also work closely with Prof. Ko Nishino and have spent two wonderful summers in Kyoto.
Email: xinranhan [at] g [dot] harvard [dot] edu
Previously, I graduated from the University of Pennsylvania majoring in Mathematics and Computer Science.
During my undergrad, I was fortunate to work with Prof. Jianbo Shi
and Prof. Dan Roth.
My current research interests include computer vision, generative models and multi-modal learning.
Specifically, I enjoy building models that combine learning-based approaches with physics modeling and inspiration
from the human visual system. I'm also interested in the intersection of vision with language and art.
Some papers are highlighted.
We show that applying derivative representation alignment (dREPA) to an image-to-video generation model improves subject consistency
and leads to better generalization to different artistic styles.
We introduce a pixel-space video diffusion backbone with hybrid local–global attention that, from just a few frames of an object in motion,
simultaneously estimates plausible shape and material.
We present a bottom-up, patch-based diffusion model for monocular shape from shading that produces multimodal outputs,
similar to multistable perception in humans.
We present new theoretical insight into the equivalence of multi-task and single-task learning
for stationary kernels and develop MPHD for model pre-training on heterogeneous domains.
We present a neural model for inferring a curvature field from shading images that is invariant under lighting and texture variations,
drawing on perceptual insights and mathematical derivations.