Hacker News | maxloh's comments

From Allen AI's Discord:

Introducing *MolmoSpaces*: A large-scale, fully open platform + benchmark for embodied AI research

The next wave of AI will act in the physical world, but building robots that generalize across new environments rather than simply replaying learned behaviors requires far more diverse training data than exists today. That's where MolmoSpaces comes in.

MolmoSpaces brings together 230k+ indoor scenes, 130k+ object models, and 42M annotated robotic grasps into a single open ecosystem built on two foundations:

◘ Objaverse, one of the largest open collections of 3D objects

◘ Our THOR family of interactive simulation environments

MolmoSpaces is grounded in physics simulation with validated physical parameters tuned for realistic robotics manipulation, and includes a trajectory-generation pipeline for reproducible embodied AI demonstrations and imitation learning at scale. All assets, scenes, and tools are open and modular – provided in MJCF with USD conversion for cross-simulator portability – so you can plug in new embodiments, regenerate grasps, and run experiments across MuJoCo, ManiSkill, and NVIDIA Isaac Lab/Sim.

MolmoSpaces supports teleoperation via mobile platforms like Teledex, so you can collect demonstrations right from your phone, compatible with embodiment setups including DROID and CAP with no extra configuration needed.

We're also releasing *MolmoSpaces-Bench*, a new benchmark for evaluating generalist policies under systematic, controlled variation. Researchers can isolate individual factors – object properties, layouts, task complexity, lighting, dynamics, instruction phrasing, and more – across thousands of realistic scenes.

Explore MolmoSpaces today and start building—we can't wait to see what the community does with it:

Blog: https://allenai.org/blog/molmospaces

Demo: https://molmospaces.allen.ai/

Code: https://github.com/allenai/molmospaces

Data: https://huggingface.co/datasets/allenai/molmospaces

Paper: http://allenai.org/papers/molmospaces



From Allen AI's Discord:

Introducing *How2Everything*—an open framework for benchmarking & improving how LLMs generate step-by-step procedures.

LLMs constantly produce instructions for everything from filing taxes to plans for AI agents, but improving this capability is challenging. Outputs can sound fluent while describing steps that don't actually work, surface-level metrics miss critical mistakes like omitted prerequisites or contradictory instructions, and manual verification doesn't scale.

How2Everything closes this gap with a practical loop: mine real procedures from the web → benchmark LLM outputs → detect critical failures (missing steps, wrong order, omissions) → use that signal to train better models.

It has three main components:

*How2Mine*—a pipeline that extracts & standardizes procedures from web pages covering 14 topics

*How2Bench*—a 7,000-procedure benchmark built from How2Mine

*How2Score*—an evaluation protocol powered by How2Judge, an open 8B judge model trained to flag critical failures

How2Judge agrees with human judgments ~80% of the time and is cheap enough for large-scale eval, making it practical as both a benchmark scorer and an RL reward signal.
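For readers unfamiliar with the metric: an agreement rate like the ~80% figure above is usually just the fraction of examples where the judge's verdict matches the human verdict. A minimal sketch, assuming binary verdicts and made-up labels; `agreement_rate` is a hypothetical helper and not part of the How2Everything code, and the paper may define agreement differently:

```python
def agreement_rate(judge_labels, human_labels):
    """Fraction of items where the judge's verdict matches the human verdict."""
    if len(judge_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not judge_labels:
        raise ValueError("need at least one labeled item")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(judge_labels)


# Made-up verdicts (assumption): True = "has a critical failure", False = "no critical failure".
judge = [True, False, True, True, False, False, True, False, True, False]
human = [True, False, False, True, False, True, True, False, True, False]

rate = agreement_rate(judge, human)
print(f"agreement: {rate:.0%}")
```

Raw agreement does not correct for chance, so a chance-corrected statistic may be worth reporting alongside it when the labels are imbalanced.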

RL training with How2Score yields >10-point gains on Qwen3 4B, Qwen3 8B, and Olmo 3 7B Think, with no regressions across 12 standard benchmarks covering knowledge, reasoning, chat, math, and code. How2Bench also scales cleanly, remaining informative from early 1B pretraining checkpoints through frontier LLMs. And we stress-tested two shortcut explanations (format compliance and memorization); neither accounts for the improvements, pointing to real gains in procedure generation.

The full How2Everything framework, including How2Judge, is available now.

Blog: https://allenai.org/blog/how2everything

Paper: https://arxiv.org/pdf/2602.08808

Code: https://github.com/lilakk/how2everything

HF: https://huggingface.co/collections/how2everything/how2everyt...


Scaling horizontally is significantly cheaper than the additional engineering cost required to build these applications in statically typed languages, especially in developed nations like France.

The real bottleneck lies on the database side, but it is rare for an average organization to actually hit its limits. Don't think at Microsoft scale if you aren't them.


Server costs actually matter quite a bit at the scales of the incumbents in this space. Also, speed can be an important part of UX. Scaling horizontally won’t help if the engine itself is slow enough that there is noticeable lag even with just a single document getting edited by a dozen people.

I don't really get it.

While these functions sound useful, the logic seems simple enough that you could just prompt an LLM with the documentation and get a working codebase without actually subscribing to the hosted service.

Where is the commercial value of it?


One thing I’ve learned over the years is that people don't necessarily vote for the "best" candidate. Instead, they vote for the candidate who is "least bad" and will do the least damage to their interests. It is always a matter of compromise.

As a counter-example, you cannot expect an LGBT person to vote for a right-wing conservative who advocates against their own rights, even if that candidate makes the "right call" on every other issue.


>As a counter-example, you cannot expect an LGBT person to vote for a right-wing conservative who advocates against their own rights, even if that candidate makes the "right call" on every other issue.

I can't think of a candidate that fits this description.


Yeah, isn’t the point of extreme right to make the wrong calls for the majority of the population?

Historically, the right side was pro-monarchy. Then, you go extreme.


Terms like left and right only have meaning in one place at one time. So just because European conservatives 100 years ago believed something doesn't mean American conservatives today believe in that thing. That's why political scientists have terms like socialist, fascist, libertarian, etc. That's how the US right (libertarian) is basically nothing like the right in Europe (conservative). That's because the basic axis of difference in the US is larger vs. smaller government, while in Europe it is completely different, as both sides like larger government. I have tried to explain this to many Europeans over the years; somehow you are all allergic to understanding it. It's probably the only thing you all have in common.

The French Revolution, when left and right started, was 237 years ago by the way.

I agree that the scale differs across countries. American politics would sit entirely on the right of a common European left/right scale.

If nobody understands, have you considered that the other thing they all have in common is the explanation you are giving them?


> That's how the US right (libertarian)

Except that the US right is not libertarian. If you ask them to describe themselves they often give that impression, but if you look at how they actually govern, libertarian is definitely not it.



I wonder how the current events in Greenland will impact the safety and sovereignty of Taiwan.

The US is Taiwan’s most important military ally, even if that relationship remains unofficial. It is also the most critical power in the First Island Chain. If the US stopped being a global superpower, countries like Japan and South Korea might not be willing to aid in defending Taiwan on their own.


> I wonder how the current events in Greenland will impact the safety and sovereignty of Taiwan.

That was my thought as well. It's dangerous rhetoric being displayed by the USA: "We need this land for our security." What if other powers start using the same rhetoric? Russia already did it with Ukraine, and China might say "We need Taiwan for our security." Where does it stop? Ultimately it leads absolutely nowhere good.


Diplomatic relationships are rarely about justice, because they are almost always about power and influence.

In fact, the US and its allies have been the only major powers advocating for a "rules-based international order." On the other side, you have Russia annexing Crimea in 2014, and China building artificial islands in the South China Sea to forcefully claim territory that isn't theirs under international law. Not to mention that all authoritarian states, by their very nature, are a clear violation of the UN Universal Declaration of Human Rights, which defines democracy and freedom of speech as basic human rights.

But at the same time, the US doesn't need a moral justification to sanction China over AI hardware. It is, as always, about power and influence.

The worrying part is that the US is losing its global influence by threatening an ally over Greenland. If they ever resort to military measures, they would lose all influence over the EU, and that would leave Taiwan in a very dangerous spot.


China already claims Taiwan, and has for decades; the only thing keeping it practically separate is uncertainty over the outcome in various dimensions if China tries to take it militarily. I don't think there's any doubt that if they were sure they could take it relatively bloodlessly and without significant repercussion, they would do so immediately.


The US has recognized Taiwan as part of China since the '70s, though its position is quite ambiguous! I found this document by the US Congress that explains the history behind the rather bizarre situation Taiwan finds itself in today: https://www.congress.gov/crs-product/IF12503


Nope. The US One China Policy (not to be confused with China's One China Principle) only "acknowledges" China's claim over Taiwan. The wording is intended to be vague so that each side can interpret the meaning according to their own interests (like China claiming "acknowledge" actually means "recognize").


You're agreeing with what I said. "Acknowledges" can be understood as "recognizes" but like I said, it's ambiguous intentionally (as you agreed).


> The US recognizes Taiwan as part of China ... though its position is quite ambiguous!

I wouldn't describe that position using the word "recognize". It is more accurate to use the official term "acknowledge" instead.


You're right, of course. What I'm saying is what happens if anyone with any lethal force proclaims they need territory which isn't theirs for their own security. Dangerous rhetoric and extremely dangerous precedent if this plays out.


Consider the following: Trump has tried again and again to make business deals with dictators, regardless of the previous outcomes. And since he is in a steep mental decline, he is not likely to change his ways fundamentally. He has also repeatedly expressed dissatisfaction with having to protect "others" with the US military, at least for free as he sees it. He has repeatedly tried to break NATO and break support for Ukraine.

I think it is likely that he wants to stop protecting Taiwan, give it up to China and then expect to make a deal with China to buy stuff manufactured on the island with money, afterwards. It would be totally in character for him and match his actual actions across the world.


True. Taiwan is an important ally, unofficially. The folks the US is feuding with right now are also allies, but officially. As are Japan and South Korea. It can't be encouraging.


The situation with Taiwan will explode because Putinism is being normalized. Welcome to the dark era.


How do current events affect the US being a global superpower?


IMO, China will get back Taiwan without firing a single shot. The US is slowly de-risking itself from it and will eventually make Taiwan redundant. After seeing how the US is "helping" Ukraine, will the Taiwanese think fighting an all-out war with allies like this is worth it? China doesn't have the same genocidal intentions Russia has towards Ukraine, so there are fewer reasons for people to fight it out.

Edit: would love some arguments instead of downvotes


> will the Taiwanese think fighting an all-out war with allies like this is worth it?

What example do you know of a democratic country collectively "accepting" invasion by a dictatorship because being free is "not worth it"?

I can't really come up with anything.


Asking for an example is ill-posed, given that democracies are rather young constructs compared to the wider human history. Mind you, I am rooting for Taiwan, but I would expect something like what happened in Hong Kong rather than all-out war if the USA rug pulls Taiwan when it comes to support. Europe has already signaled that they won't do anything when it comes to Taiwan.


Maybe if Xi dies and the next guy is more reasonable. A lot of the animosity towards China is a result of Xi's authoritarian turn a decade or so ago...


That's true, we'll see if China is able to play the long game


The problem for the Taiwanese (I am one) is ideological: they see themselves as too socially different from mainland China. Relying on US support, or on TSMC (another popular and absurd bit of copium), as a security guarantee is not realistic, and any Taiwanese can see this now. Absent other ways to secure its self-determination, Taiwan is stuck walking a thin line between a crazy eagle and a very possessive panda.


I 100% agree with what you say, no discussion on that. My argument is that, if/when push comes to shove, Taiwanese leadership will pick the peace option given past US behaviour.


Taiwan is a completely different situation with other priorities. It's on the other side of the globe and just one more remote interest, like Israel. It's there not to directly improve the US's security, like Greenland does, but to suppress China's.


Even after migrating to ES modules, jQuery is still somewhat bloated. It is 27 kB (minified + gzipped) [0]. In comparison, Preact is only 4.7 kB [1].

[0]: https://bundlephobia.com/package/jquery@4.0.0

[1]: https://bundlephobia.com/package/preact@10.28.2
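For anyone who wants to check "minified + gzipped" numbers locally, here is a rough sketch using only Python's standard library. It runs on an in-memory stand-in payload rather than the real jQuery or Preact bundles, `gzipped_size` is a hypothetical helper, and Bundlephobia's exact compression settings may differ:

```python
import gzip


def gzipped_size(data: bytes, level: int = 9) -> int:
    """Return the number of bytes `data` occupies after gzip compression."""
    return len(gzip.compress(data, compresslevel=level))


# Stand-in payload (assumption); in practice, read the minified bundle from disk,
# e.g. open(path, "rb").read().
payload = b"function add(a,b){return a+b}" * 200
raw = len(payload)
packed = gzipped_size(payload)
print(f"raw: {raw} B, gzipped: {packed} B")

# Ratio of the two sizes quoted in the comment above (27 kB vs 4.7 kB).
ratio = 27 / 4.7
print(f"jQuery is about {ratio:.1f}x the size of Preact")
```

The stand-in payload is highly repetitive, so it compresses far better than a real bundle would; only the measuring approach carries over.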


> Preact is only 4.7 kB

Is there some outlier place where people using virtual DOM frameworks don't also include 100-200kb of "ecosystem" in addition to the framework?

I suppose anything is possible, but I've never actually seen it. I have seen jQuery only sites. You get a lot for ~27kB.


I use Preact for a very lean build for a front-end that lives in a small embedded MCU flash ROM. Gzipped, the whole front-end is about 25KB, including SVG images baked into the Preact gzip file. I'm very careful about the libraries I include and their impact on the overall payload size.

I had started with a simple front-end that used jQuery to quickly prototype the device controls, but quickly exceeded my goal of keeping the front-end under 40KB total gzipped. The problem is needing more than just jQuery: we also needed jQueryUI to help with the front-end, or else had to build out similarly complex components ourselves. And as soon as the jQuery code became non-trivial, it was clear that Preact made much more sense to use. Our payload is quite a bit smaller than the jQuery prototype was.


I do that when I need to make a simple SPA. Plain Vue plus a few tiny add-ons of my own.


Look at Deno + Fresh, which is based on Preact. You can do a lot with Preact alone.


jQuery does a lot more though, and includes support for older browsers.


Officially they state they only support the two latest versions of Chrome. But considering their support of IE11, that's actually a lot.


> includes support for older browsers

Which is entirely the issue. Supporting a browser for the 10 users who will update jQuery in 2025 is insane.


Breaking backwards compatibility to turn 27kb into less because of "bloat" makes less sense to me.


It is definitely more than 10 users.


12


What is your plan after MinIO enters maintenance mode?


We're looking at alternatives, I've made some previous comments on that front. Sadly MinIO was the only option with sufficient performance for this particular situation. Thankfully we're not using any MinIO-specific features, so at least the migration path away is clear.


Ceph. The answer is always Ceph.

