Neural 3D Video Synthesis from Multi-view Video

CVPR 2022 (oral)

Tianye Li1,3,*    Mira Slavcheva1,*    Michael Zollhoefer1   
Simon Green1    Christoph Lassner1    Changil Kim2    Tanner Schmidt1   
Steven Lovegrove1    Michael Goesele1    Richard Newcombe1    Zhaoyang Lv1
1Reality Labs Research
2Meta
3University of Southern California
*equal contributions
[Paper]
[arXiv]
[Sup. Mat.]
[Data]


Abstract

We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of our approach is a novel time-conditioned neural radiance field that represents scene dynamics using a set of compact latent codes. To exploit the fact that changes between adjacent frames of a video are typically small and locally consistent, we propose two novel strategies for efficient training of our neural network: 1) an efficient hierarchical training scheme, and 2) an importance sampling strategy that selects the next rays for training based on the temporal variation of the input videos. In combination, these two strategies significantly boost training speed, lead to fast convergence, and enable high-quality results. Our learned representation is highly compact: it represents a 10-second, 30 FPS multi-view video recording from 18 cameras with a model size of just 28 MB. We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes. An extensive qualitative and quantitative evaluation shows that our approach outperforms the current state of the art. Project website: https://neural-3d-video.github.io.
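The two components named above, a radiance field conditioned on per-frame latent codes and ray selection driven by temporal variation, can be illustrated with short sketches. Below is a minimal PyTorch sketch of a time-conditioned radiance field: a standard NeRF-style MLP that, instead of taking time as an explicit input, looks up a compact learned latent code per frame. All names, layer sizes, and the latent dimension are illustrative assumptions rather than the paper's exact configuration, and the view-direction branch of a full NeRF is omitted for brevity.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=10):
    # Standard NeRF-style sinusoidal encoding of 3D positions.
    freqs = 2.0 ** torch.arange(num_freqs, device=x.device)
    angles = x[..., None] * freqs                       # (N, 3, num_freqs)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.flatten(start_dim=-2)                    # (N, 3 * 2 * num_freqs)

class TimeConditionedNeRF(nn.Module):
    """Hypothetical sketch: a NeRF MLP conditioned on per-frame latent codes."""
    def __init__(self, num_frames, latent_dim=64, hidden=256, num_freqs=10):
        super().__init__()
        # One compact latent code per video frame, optimized jointly with
        # the network weights; scene dynamics live in these codes.
        self.latents = nn.Embedding(num_frames, latent_dim)
        self.num_freqs = num_freqs
        in_dim = 3 * 2 * num_freqs + latent_dim
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                       # RGB color + volume density
        )

    def forward(self, xyz, frame_ids):
        z = self.latents(frame_ids)                     # (N, latent_dim)
        h = torch.cat([positional_encoding(xyz, self.num_freqs), z], dim=-1)
        out = self.mlp(h)
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3])
        return rgb, sigma
```

The importance-sampling idea can be sketched in the same spirit: score each pixel by how much it changes over time and draw training rays proportionally, so static background pixels are revisited less often. The specific weighting heuristic below (mean absolute frame difference plus a floor) is an assumption for illustration, not the paper's exact scheme.

```python
def ray_sampling_weights(video, floor=1e-3):
    """video: (T, H, W, 3) tensor of frames in [0, 1].
    Returns a flat (H * W,) distribution over pixels, weighted by
    temporal variation; `floor` keeps static pixels occasionally sampled."""
    diff = (video[1:] - video[:-1]).abs().mean(dim=(0, 3))  # (H, W)
    weights = (diff + floor).flatten()
    return weights / weights.sum()

# Usage: draw a batch of ray indices biased toward dynamic regions.
# pixel_ids = torch.multinomial(ray_sampling_weights(video), num_samples=4096)
```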


Video



Acknowledgements

We thank Rick Szeliski, Anton Kaplanyan, Brian Cabral, Zhao Dong, and Samir Aroudj for providing feedback on this project, Daniel Andersen for helping with the photorealism evaluation, and Joey Conrad and Shobhit Verma for designing and building our capture rig. The website template is borrowed from here.