
> Plus, you've got the very large challenge of learning a rich, high-quality 3D representation from a very small pool of 3D data. The volume of 3D data is just so small, compared to the volumes generative models really need to begin to shine.

Isn’t the entire aim of world models (at least, in this particular case) to learn a very high quality 3D representation from 2D video data? My point is that if you manage to train a navigable world model for a particular location, that model has managed to fit a very high quality 3D representation of that location. There’s lots of research dealing with NeRFs that demonstrates how you can extract these 3D scenes as meshes once a model has managed to fit them. (NeRFs are another great example of learning a high quality 3D representation from sparse 2D data.)
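For anyone unfamiliar with how NeRFs fit a 3D representation from 2D views: the core trick is differentiable volume rendering, where densities and colors sampled along a camera ray are alpha-composited into a pixel color, and the whole thing is optimized against real 2D images. A minimal sketch of that compositing step (simplified and in NumPy; a real NeRF would use a learned MLP and a deep learning framework):

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, NeRF-style.

    densities: (N,) nonnegative volume densities at each sample
    colors:    (N, 3) RGB color at each sample
    deltas:    (N,) distance between consecutive samples
    Returns the rendered (3,) pixel color.
    """
    # Opacity contributed by each sample segment
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light surviving up to each sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * trans
    # Weighted sum of sample colors gives the pixel color
    return (weights[:, None] * colors).sum(axis=0)
```

Because this rendering is differentiable, gradients from a 2D photometric loss flow back into the 3D density field, which is what lets sparse 2D data pin down 3D structure.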

>That said, our belief is that model-imagined experiences are going to become a totally new form of storytelling, and that these experiences might not be free to be as weird and whacky as they could because of heuristics or limitations in existing 3D engines. This is our focus, and why the model is video-in and video-out.

There’s a lot of focus in the material on your site on the models learning physics by training on real-world video. Wouldn’t that imply that you’re trying to converge on a physically accurate world model? I imagine that would make weirdness and wackiness rather difficult.

> To be clear, we don't yet know what shape these new experiences will take. I'm hoping we can avoid an awkward initial phase where these experiences resemble traditional game mechanics too much (although we have much to learn from them), and just fast-forward to enabling totally new experiences that just aren't feasible with existing technologies and budgets. Let's see!

I see! Do you have any ideas about the kinds of experiences that you would want to see or experience personally? For me it’s hard to imagine anything that substantially deviates from navigating and interacting with a 3D engine, especially given it seems like you want your world models to converge to be physically realistic. Maybe you could prompt it to warp to another scene?



> wouldn’t that imply that you’re trying to converge on a physically accurate world model?

I'm not the CEO or associated with them at all, but yes, this is what most of these "world model" researchers are aiming for. As a researcher myself, I do not think this is the way to develop a world model, and I'm fairly certain that this cannot be done through observations alone. I explain more in my response to the CEO[0]. This is a common issue in the way much ML experimentation is done: you simply cannot rely on benchmarks to get you to AGI. Scaling of parameters and data only goes so far. If you're seeing slowing advancements, it is likely due to over-reliance on benchmarks and under-reliance on what those benchmarks intend to measure. But this is a much longer conversation (I think I made a long comment about it recently, which I can dig up).

[0] https://news.ycombinator.com/item?id=44147777



