yowlingcat's comments

It's a very concerning future. I would love to live in a world where we could simply stop them from doing that, but for the moment, the best hedge appears to be the Chinese open weight models, which can't be put back in the box and which provide the valuable market function of commodifying the knowledge encoded in these models (knowledge that, in turn, was not created by the frontier labs in the first place).

It goes the other way around as well. DeepSeek has made quite a few innovations that the US labs were lacking (DSA being the most notable). It's also not clear to me whether distilled outputs are just one more ingredient in the recipe or the whole "frozen dinner," so to speak. I have no evidence either way, but my guess is the former.

Does this support clicking on a peon a bunch of times until it says "Me not that kind of Orc!"?

Just came across SST the other day and it looks very interesting. It appears to be built on Pulumi, which raises the question for me of why it exists: structurally it doesn't seem all that different in capabilities. Perhaps it's a more opinionated subset with better ergonomics. Is that correct, or is the reason different for you?

What I don't understand is why Gemini is not #1, other than that Google has no economic reason to have the same fire under its ass as Anthropic and OpenAI. Or maybe they are correctly assessing that getting to "good enough" and out-building everyone on infrastructure is more valuable; they have their TPUs as a bet on the future and a search monopoly today printing nearly endless free cash flow. Perhaps Gemini is advancing at exactly the right rate for them.

I guess there is one thing Gemini is objectively better at than either: long context, and by what seems like an order of magnitude. What boggles my mind is why Gemini is still not as good at coding as the open weight frontier models. If they got to parity there while keeping their existing long context and strong token pricing, they'd be able to take over the coding market. Are they just biding their time to make their move? Hard to discern.


Something that's been on my mind recently: what if gen AI coding tools are ultimately attention casinos in the same way social media is? You burn through tons of tokens and you pay per token; it feels productive and engaging, but ultimately the more you try and fail, the more money the vendor makes. Their revealed (though never stated) economic goal may be to keep you in the "goldilocks zone": making enough progress not to give up, but not so much that you one-shot to the end state without issues.

I'm not saying that they can actually do that per se; switching costs are so low that a vendor doing worse than an existing competitor would lose that volume. Nor am I saying they are deliberately bilking folks -- I think it would be hard to do that without folks cottoning on.

But I did see an interesting thread on Twitter that had me pondering [1]. Basically, Claude Code experimented with RAG approaches before settling on the simple iterative grep they use now. The RAG approach was, in their words, brittle and hard to get right, and just brute-forcing it with grep was easier to use effectively. But Cursor took the other approach and made semantic search work for them, which made me wonder about the intrinsic token economics for both firms. Cursor is incentivized to minimize token usage to increase the spread over its fixed seat pricing. But for Claude, iterative grep bloating token usage doesn't hurt them -- it increases gross tokens purchased -- so there is little incentive to find a more efficient approach.
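To make the token-economics point concrete, here is a minimal sketch of what an iterative-grep retrieval loop looks like (helper names and the identifier-mining heuristic are my own invention; the real Claude Code implementation is not public). Note how every round's raw grep hits would be fed back into the model's context, so token usage grows with each widening pass:

```python
import re
import subprocess


def grep_repo(pattern, repo_dir="."):
    """One retrieval step: plain text search, no embeddings or index.
    Returns matching lines as 'path:line:text' strings."""
    result = subprocess.run(
        ["grep", "-rn", "-I", pattern, repo_dir],
        capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def iterative_search(seed_terms, repo_dir=".", rounds=3):
    """Iteratively widen the search: each round greps for terms, then
    mines the hits for new identifiers to grep next round. In an agent,
    every hit accumulated here is context the model must re-read --
    which is exactly the token bloat discussed above."""
    seen, frontier = set(), list(seed_terms)
    for _ in range(rounds):
        next_terms = []
        for term in frontier:
            for hit in grep_repo(term, repo_dir):
                if hit not in seen:
                    seen.add(hit)
                    # naive: treat any 4+ char word in the hit as a lead
                    next_terms += re.findall(r"[A-Za-z_]\w{3,}", hit)
        frontier = [t for t in set(next_terms) if t not in seed_terms][:10]
        if not frontier:
            break
    return sorted(seen)
```

A RAG pipeline would instead embed and index the repo once, then answer each query with a handful of top-k chunks -- fewer tokens per question, but with the indexing brittleness the thread describes.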

I am sure there are many instances of this out there, but it does make me wonder whether it will be economic incentives rather than technical limitations that eventually put an upper limit on closed weight LLM vendors like OpenAI and Anthropic. Too early to tell, IMO.

[1] https://x.com/antoine_chaffin/status/2018069651532787936


Well, the first time I got really excited about an LLM was when it told me "yes, if you give me your game ideas and we iterate together, I can handle 100% of the coding." Lies, pure lies.

I agree with your point, and it is on that point that I disagree with GP. These open weight models, ultimately constructed from thousands of years of accumulated human knowledge, are now freely available to all of humanity. To me that is the real marvel and a true gift.


I don't believe Antigravity or Cursor work well with pluggable models. It seems to be impossible with Antigravity, and with Cursor, while you can change the OpenAI-compatible API endpoint to one of your choice, not all features may work as expected.
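For context on what "changing the OpenAI-compatible endpoint" amounts to: any client speaking the OpenAI chat-completions wire format just targets a different base URL. A minimal sketch (the URL, model name, and key below are placeholders, and this builds the request without sending it):

```python
import json


def chat_request(base_url, model, prompt, api_key="sk-placeholder"):
    """Build (not send) an OpenAI-compatible chat completion request.
    Any backend serving this wire format -- a local llama.cpp server,
    vLLM, Ollama, etc. -- can sit behind base_url."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body


# Pointing a tool at a hypothetical local server instead of api.openai.com:
url, headers, body = chat_request(
    "http://localhost:8080/v1", "my-local-model", "hello")
```

The catch with tools like Cursor is that features layered on top of this wire format (tab completion, semantic indexing) may call vendor-specific endpoints the substitute backend doesn't serve, which is why not everything works after the swap.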

My recommendation would be to use other tools built to better support pluggable model backends. If you're looking for a Claude Code alternative, I've been liking OpenCode lately, and if you're looking for a Cursor alternative, I've heard great things about Roo/Cline/KiloCode, although I personally still use Continue out of habit.


Claude Code Router


It may be worth taking a look at LFM [1]. I haven't had the need to use it so far (I run on Apple silicon day to day, so my dailies are usually the 30B+ MoEs), but I've heard good things from folks using it as a daily driver on their phones. YMMV.

[1] https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct


Great stuff and very timely. I just started getting into opencode, and while I'm hugely optimistic about its capabilities and can use it personally without too much sweat, I was left wanting something a bit more batteries-included to give to my non-technical colleagues so we can collaborate. This looks to be exactly what we were looking for, so I'm looking forward to giving it a spin!


Yeah! I feel like until we figure out the correct UX for non-technical people, the right way is a sort of hybrid: you set it up on a remote server (if you know opencode, you know openwork), non-technical people do a one-time setup to connect to the remote, and from then on you can easily extend capabilities.


This is the approach I've taken with Open WebUI. It's a great piece of software for exposing a shared GPT-style interface, though of course that's pretty primitive in the grand scheme of things compared to something like this. But I completely agree with what you're suggesting, and I think it's the only practical way to get a multidisciplinary team collaborating with this kind of tool.

