1) Around 100-150 if memory serves. This scene is part of the mip-NeRF 360 benchmark, which you can download from the corresponding project website: https://jonbarron.info/mipnerf360/
2) Between 12 and 48 hours, depending on the scene. We train on 8x V100s or 16x A100s.
3) The time for preparing assets is included in 2). I don't have a breakdown for you, but it's something like 50/50.
4) Nope! A keen hacker might be able to do this themselves by editing the JavaScript code. Open your browser's DevTools and have a look -- the code is all there!
> Do you need position data to go along with the photos or just the photos?
Short answer: Yes.
Long answer: Yes, but it can typically be derived from the images themselves. Structure-from-motion methods are used to estimate lens and position information for each photo in the training set. These are then used by Zip-NeRF (our teacher) and SMERF (our model) during training.
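To make "lens and position information" concrete, here's a minimal sketch of the per-photo data structure-from-motion recovers: intrinsics (focal length and principal point, i.e. the "lens") and extrinsics (rotation and translation, i.e. the "position"), and how together they map a 3D point to a pixel. This is a generic pinhole-camera illustration, not code from the SMERF or Zip-NeRF codebases; all names are made up.

```python
def project(point_world, R, t, f, cx, cy):
    """Project a 3D world point into pixel coordinates for one camera.

    R: 3x3 rotation (world -> camera), t: translation -- the "position" data.
    f: focal length in pixels, (cx, cy): principal point -- the "lens" data.
    """
    # World -> camera coordinates: p_cam = R @ p_world + t
    p_cam = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
             for i in range(3)]
    x, y, z = p_cam
    # Perspective divide, then apply the intrinsics.
    return (f * x / z + cx, f * y / z + cy)

# Identity pose: camera at the origin looking straight down +z.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0, 0, 0]

# A point straight ahead of the camera lands on the principal point.
print(project([0.0, 0.0, 2.0], R, t, f=500.0, cx=320.0, cy=240.0))
# -> (320.0, 240.0)
```

Structure-from-motion recovers one (R, t) per photo and shared (or per-camera) intrinsics by finding the values that make matched feature points project consistently across images; NeRF-style methods then consume those poses directly.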