We show our real-time demo applied to several scenes at 512x512 pixel resolution.
Datasets. We use two scenes from each of the Technicolor, Google Immersive, Shiny, Stanford, and DoNeRF datasets.
Details. We use our Small model for the Technicolor and Shiny scenes so that the rendering frame rate exceeds 40 FPS, demonstrating that we can achieve very high visual quality at a high frame rate, even without custom CUDA code. For the remaining datasets, we use our full model, which still provides real-time inference.
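As a rough illustration of how frame rates of this kind can be measured, the sketch below times repeated render calls and averages over them; `model.render(camera)` is a hypothetical stand-in for a viewer's per-frame render function, not our actual API.

```python
import time

def measure_fps(model, cameras, warmup=10):
    """Estimate rendering frame rate by timing repeated render calls.

    `model.render(camera)` is a hypothetical interface used only for
    illustration; any per-frame render function could be substituted.
    """
    # Warm-up renders so one-time costs (shader compilation, caching)
    # do not skew the measurement.
    for camera in cameras[:warmup]:
        model.render(camera)

    start = time.perf_counter()
    for camera in cameras:
        model.render(camera)
    elapsed = time.perf_counter() - start

    return len(cameras) / elapsed
```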
We show free view synthesis results on all scenes from the Technicolor dataset.
We show free view synthesis results on all scenes from the Neural 3D Video dataset.
Note that the attached renderings are 50-frame video clips of the original full-length sequences, due to supplemental space constraints (though all of our quantitative metrics are calculated using the full-length videos).
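For concreteness, a minimal sketch of how a per-frame metric such as PSNR can be averaged over every frame of a full-length sequence is given below; the frame arrays and the simple PSNR helper are illustrative assumptions, not our exact evaluation code.

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    # Peak signal-to-noise ratio for images with values in [0, max_val].
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / max(mse, 1e-12))

def full_video_psnr(pred_frames, gt_frames):
    # Average per-frame PSNR over the entire full-length sequence,
    # not just the 50-frame clips included in the supplement.
    return float(np.mean([psnr(p, g) for p, g in zip(pred_frames, gt_frames)]))
```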
We show free view synthesis results on all scenes from the Google Immersive dataset.
As above, the attached renderings are 50-frame clips of the original full-length sequences due to supplemental space constraints; all quantitative metrics are again computed on the full-length videos.
We show free view synthesis results on four highly view-dependent scenes containing both challenging reflections and refractions.
Datasets. We use two scenes from the Shiny dataset, CD and Lab, and two from the Stanford dataset, Tarot (Small) and Tarot (Large).
We provide qualitative comparisons to other static view synthesis methods on highly view-dependent scenes.
Datasets. We use two scenes from the Shiny dataset: CD and Lab.
Baselines. We reproduce reflections and refractions more faithfully than both NeRF and NeX, while performing comparably to Light Field Neural Rendering [Suhail et al. 2022] in terms of quality.
Our method can also produce renderings of challenging content at real-time rates (see our demo), whereas Light Field Neural Rendering [Suhail et al. 2022] takes 60 seconds to render an 800x800 image on a V100 GPU.
We provide qualitative comparisons to Google Immersive Light Field Video [Broxton et al. 2020].
We take screenshots from Google Immersive's high-resolution web viewer, as it provides no easy way to render specific viewpoints.
Our method provides comparable visual quality at a lower memory cost, while also training in hours rather than days.