The LLM paradigm will never lead to AGI, and attaching something short of AGI to all of your personal data and files, then setting it free whilst you sleep, is about as dumb as anything I can imagine.
The frontend will remain a requirement because you cannot trust LLMs to not hallucinate. Literally cannot. The "Claw" phenomenon is essentially a marketing craze for a headless AI browser that has filesystem access. I don't even trust my current browser with filesystem access. I don't trust the AI browsers when I can see what they're doing because they click faster than I can process what they're doing. If they're stopping to ask my permission, what's the point?
Mark my words: eventually this will be an absolute disaster for every single person who connects these things to anything of meaning.
Could a malicious claw sidechannel this by creating a localhost service and calling that with the signed micropayment, to get the decrypted contents of the wallet or anything?
Yeah, even for simple things, it's surprisingly hard to write a correct spec. Or more to the point, it's surprisingly easy to write an incorrect spec and think it's correct, even under scrutiny, and so it turns out that you've proved the wrong thing.
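A concrete (hypothetical, not from the thread) illustration of that failure mode: a sort "spec" that only requires the output to be sorted, forgetting to require that it be a permutation of the input. A trivially wrong implementation then satisfies the spec perfectly.

```python
def is_sorted(xs):
    """The (incomplete) spec: every element is <= the next one."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def broken_sort(xs):
    """Satisfies the spec above while being obviously wrong:
    it discards the input entirely."""
    return []

# The incomplete spec accepts the broken implementation,
# because it never says the output must contain the input's elements.
assert is_sorted(broken_sort([3, 1, 2]))
```

You'd have "proved" your sort correct against this spec, and proved the wrong thing.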
This isn't to say it's useless; sometimes it helps you think about the problem more concretely and document it using known standards. But I'm not super bullish on "proofs" being the thing that keeps AI in line. First, like I said, they're easy to specify incorrectly, and second, they become incredibly hard to prove beyond a certain level of complexity. But I'll be interested to watch the space evolve.
(Note I'm bullish on AI+Lean for math. It's just the "provably safe AI" or "provably correct PRs" that I'm more skeptical of).
>But I'm not super bullish on "proofs" being the thing that keeps AI in line.
But do we have anything that works better than some form of formal specification?
We have to tell the AI what to do and we have to check whether it has done that. The only way to achieve that is for a person who knows the full context of the business problem and feels a social/legal/moral obligation not to cheat to write a formal spec.
Code review, tests, a planning step to make sure it's approaching things the right way, enough experience to understand the right size problems to give it, metrics that can detect potential problems, etc. Same as with a junior engineer.
If you want something fully automated, then I think more investment in automating and improving these capabilities is the way to go. If you want something fully automated and 100% provably bug free, I just don't think that's ever going to be a reality.
Formal specs are cryptic beyond even a small level of complexity, so it's hard to tell if you're even proving the right thing. And proving that an implementation meets those specs blows up even faster, to the point that a lot of stuff ends up being formally unprovable. It's also extremely fragile: one line code change or a small refactor or optimization can completely invalidate hundreds of proofs. AI doesn't change any of that.
So that's why I'm not really bullish on that approach. Maybe there will be some very specific cases where it becomes useful, but for general business logic, I don't see it having useful impact.
I wonder how the internet would have been different if claws had existed beforehand.
I keep thinking something simpler like Gopher (an early-90s web protocol) might have been sufficient, maybe even optimal, with little need to evolve into HTML or REST. Agents might be better able to navigate step-by-step menus and questionnaires than RPCs meant to support GUIs and apps, especially LLMs with smaller contexts that couldn't reliably parse a whole API doc. I wonder if things will start heading more in that direction as user-side agents become the more common way to interact with services.
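As a toy sketch of what that interaction model might look like (all names here, `Menu`, `select`, the sample site, are invented, not a real protocol): an agent walks numbered choices step by step instead of parsing HTML or a full API doc.

```python
from dataclasses import dataclass, field

@dataclass
class Menu:
    """A Gopher-style node: a title plus numbered choices,
    each leading to another Menu or to a final text document."""
    title: str
    choices: dict = field(default_factory=dict)  # label -> Menu or str

    def render(self):
        """Render the menu as plain numbered text, easy for a small model."""
        lines = [self.title]
        for i, label in enumerate(self.choices, 1):
            lines.append(f"{i}. {label}")
        return "\n".join(lines)

    def select(self, n):
        """Follow choice n (1-based), returning the next Menu or a document."""
        label = list(self.choices)[n - 1]
        return self.choices[label]

# A tiny site an agent could navigate one small decision at a time.
site = Menu("News service", {
    "Today's headlines": "Markets flat; weather mild.",
    "Archive": Menu("Archive", {"2024": "Nothing notable."}),
})
```

Each step only requires the agent to read one short menu and emit one number, which is exactly the kind of narrow context a smaller model handles well.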
I would love to subscribe to / pay for services that are just APIs, then have my agent organize them however I want.
Imagine YouTube, Gmail, Hacker News, Chase bank, WhatsApp, and the electric company all being just APIs.
You can interact how you want. The agent can display the content the way you choose.
Incumbent companies will fight tooth and nail to avoid this future. Because it's a future without monopoly power. Users could more easily switch between services.
Tech would be less profitable but more valuable.
It's the future we can choose right now by making products that compete with this mindset.
Biggest question I have is: maybe... just maybe... LLMs would have had sufficient intelligence to handle micropayments. Maybe we might not have gone down the mass-advertising "you are the product" path?
Like, somehow I could tell my agent that I have a $20-a-month budget for entertainment and a $50-a-month budget for news, and it would just figure out how to negotiate with the nytimes and netflix and spotify (or what would have been their equivalents), which is fine. But it would also be able to negotiate with an individual band that wants to sell its music directly, or an indie game that does not want to pay the Steam tax.
I don't know, just a "histories that might have been" thought.
If I can get videos from YouTube or Rumble or FloxyFlib or your mom's personal server in her closet, and search them all at once, with the front-end interface being my LLM or some personalized interface that excels in its transparency, that would definitely hurt Google's brand.
I don't exactly mean APIs. (We largely have that with REST). I mean a Gopher-like protocol that's more menu based, and question-response based, than API-based.
Yesterday the history of the IMG tag came up, prompting a wander down memory lane. It reminded me that in 1992-ish, before the `www.foo` convention, I'd create DNS pairs, foo-www and foo-http: one for humans, and one to sling sexps.
I remember seeing the CGI (serve a URL from a script) proposal posted and thinking it was so bad (e.g. the ~256-character URL limit) that no one would use it, so I didn't need to worry about it. Oops. "Oh, here's a spec. Don't see another one. We'll implement the spec," says everyone. And "no one is serving long URLs, so our browser needn't support them." So no big query URLs during that flexible early period where practices were gelling. Regret.
Not the person you're responding to, but I think they mean sexps as in S-expressions [1]. These are used in all kinds of programming, and they have been used inside protocols for markup, as in the email protocol IMAP.
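For readers who haven't met them: an S-expression is just a parenthesized, whitespace-separated nested list. A minimal toy parser (a sketch for illustration, not IMAP's actual grammar) makes the structure obvious:

```python
def parse_sexp(text):
    """Parse a single S-expression like "(a (b c))" into nested lists."""
    # Pad parentheses with spaces so split() tokenizes everything.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        token = tokens[pos]
        if token == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1  # skip the closing ")"
        return token, pos + 1  # a bare atom

    expr, _ = read(0)
    return expr

# parse_sexp("(a (b c))") yields ["a", ["b", "c"]]
```

The appeal for protocols is that a complete recursive reader fits in a couple dozen lines, with no markup-specific grammar needed.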
This sounds very plausible. Arguably MCPs are already a step in that direction: give the LLMs a way to use services that is text-based and easy for them. Agents that look at your screen and click on menus are a cool but clumsy and very expensive intermediate step.
When I use Telegram to talk to the OpenClaw instance on my spare Mac, I am already choosing a new interface over whatever was built by the designers of the apps it is using. Why keep the human-facing version as is? Why not make an agent-first interface (which will not involve having to "see" windows), plus a validation interface for the human minder?
Any website could in theory provide API access. But websites generally do not want this: remember the Google Search API? Agents will run into similar restrictions in some cases, just as APIs did. It is not a technical problem, imo, but an incentives one.
The rules have changed though. They blocked api access because it helped competitors more than end users. With claws, end users are going to be the ones demanding it.
I think it means the front end will be a dead end in a year or two.
"End users" currently being people spending hundreds or thousands of dollars to set up custom, brittle workflows: a few thousand people globally, in total.
Let's not make this into something it's not. Personally, I lost all trust in Karpathy with his hyping of Clawdbot as some sci-fi future when all it was was people prompting LLMs to go write Reddit posts.
That's literally not possible, would be my take. But of course that's just intuition.
The dataset used to train LLMs was scraped from the internet. The data was there mainly because of the user expansion driven by the www, and the telco infrastructure laid during and after the dot-com boom that enabled those users to access the web in the first place.
The data labeling that underpins the actual training, done by masses of laborers on websites, could not have been scaled as massively and cheaply without the www scaled globally on affordable telecom infrastructure.
Interesting to compare to 2008. At least here, I think we're building something? Whereas then, it was pure, unabashed, siphoning as much as possible out of the financial system from the average American into the pockets of a privileged, self-righteous few, followed by an immediate burning down and parachute out of the whole thing once the cracks started to form.
This reminds me of the vacuum substory in Mrs. Frisby and the Rats of NIMH, except vacuums replaced by AI.
Basically: nobody wants AI, but soon everyone needs AI to sort through all the garbage being generated by AI. Eventually you spend so much time managing your AI that you have no time for anything else, your town has built extra power generators just to support all the AI, and your stuff is more disorganized than before AI was ever invented.
I want to say this is even more true at the C-suite level. Great, you're all-in on racing to the lowest common denominator AI-generated most-likely-next-token as your corporate vision, and want your engineering teams to behave likewise.
At least this CEO gets it. Hopefully more will start to follow.
And the irony is it tries to make you feel like a genius while you're using it. No matter how dull your idea is, it's "absolutely the right next thing to be doing!"
You can prompt it to stop doing that, and to behave exactly how you need. My prompts say: "no flattery, no follow-up questions, PhD-level discourse, concise and succinct responses, include grounding," etc.