Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM

Human-LRM predicts human Neural Radiance Fields (NeRF) from a single image in a feed-forward manner.
(Face blurred for anonymity.)


Overview

Comparisons to SoTA on In-the-Wild Images

Input Image
Human-LRM
SIFU (CVPR 24)
GTA (NeurIPS 23)

High Resolution Figures from the Paper




Fig. 1. We present Human-LRM, a template-free large reconstruction model for feed-forward 3D human digitalization from a single image. Trained on a large dataset comprising multi-view captures and 3D scans, our model generalizes across a broader range of scenarios. Guided by dense novel views generated by a conditional diffusion model, our model can generate high-fidelity full-body humans from a single image.






Fig. 2. Comparison of Human-LRM with SoTA single-view human reconstruction methods on in-the-wild images. Compared to volumetric reconstruction methods, our method generalizes better to challenging poses (a) and predicts higher-fidelity appearance (b). Compared to generalizable human NeRF methods (c), our results achieve much better geometry quality.






Fig. 4: Geometry and appearance comparison with PIFu, GTA and SIFU on in-the-wild images.






Fig. 5: Comparison of our single-view reconstruction model to previous volumetric reconstruction methods: PIFu, PIFuHD, ECON, LRM, GTA, and SIFU. All models are trained on THuman 2.0. For each example, we show the geometry (colored by vertex normals) from 4 views.
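
Coloring geometry by vertex normals, as mentioned in the caption above, is commonly done by remapping each unit normal from [-1, 1] to [0, 1] per channel. The sketch below assumes that convention; the function names `normal_to_rgb` and `color_vertices` are hypothetical, and the rendering setup actually used for the figure may differ.

```python
import math


def normal_to_rgb(normal):
    """Map a 3D normal to an RGB triple in [0, 1] (assumed convention)."""
    x, y, z = normal
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        # Degenerate normal: fall back to mid-gray.
        return (0.5, 0.5, 0.5)
    # Normalize, then shift and scale each component from [-1, 1] to [0, 1].
    return tuple(0.5 * (c / length) + 0.5 for c in (x, y, z))


def color_vertices(normals):
    """Apply the mapping to a list of per-vertex normals."""
    return [normal_to_rgb(n) for n in normals]
```

Under this mapping, a normal pointing along +z becomes a bluish color and one pointing along -x loses its red component, which is why such renderings make surface orientation easy to compare across methods.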






Fig. 6: Novel-view rendering results on HuMMan v1.0.






Fig. 8: Example novel-view results after each stage. Results for Stage I and Stage III are mesh renderings. Results for Stage II are diffusion model outputs (i.e., images).






Figure S2. Depth comparison to HDNet, ZoeDepth, and DPT. Red indicates regions that are closer.
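
A depth visualization in which red marks nearby regions can be approximated with a simple linear color ramp. The sketch below is only an illustration of that idea: the function name `depth_to_rgb`, the `near` and `far` bounds, and the red-to-blue ramp are all assumptions, and the colormap actually used for the figure is not specified here.

```python
def depth_to_rgb(depth, near, far):
    """Map a depth value to RGB so that near is red and far is blue.

    `near` and `far` are hypothetical clipping bounds chosen by the caller.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    # Normalize depth to [0, 1] and clamp values outside the bounds.
    t = (depth - near) / (far - near)
    t = min(max(t, 0.0), 1.0)
    # Linear blend: t = 0 gives pure red (close), t = 1 gives pure blue (far).
    return (1.0 - t, 0.0, t)
```

For example, `depth_to_rgb(1.0, near=1.0, far=3.0)` yields pure red, while values at or beyond `far` yield pure blue.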






Figure S3. Normal comparison to HDNet.

Additional Geometry Comparisons on In-the-Wild Images



Additional Results on In-the-Wild Videos





BibTeX

@article{humanlrm2023,
  author  = {Zhenzhen Weng and Jingyuan Liu and Hao Tan and Zhan Xu and Yang Zhou and Serena Yeung-Levy and Jimei Yang},
  title   = {Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM},
  journal = {Preprint},
  year    = {2023},
}