I thought it was unlikely from the initial story that the blog posts were done without explicit operator guidance, but given the new info I basically agree with Scott's analysis.
The purported soul doc is a painful read. Be nicer to your bots, people! Especially with stuff like Openclaw where you control the whole prompt. Commercial chatbots have a big system prompt to dilute it when you type in some half-formed drunken thought and hit enter; there's no such safety net here.
>A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit.
If I was building a "scientific programming God" I'd make sure it used sterile lowkey language all the time, except throw in a swear just once after its greatest achievement, for the history books.
The steps are technically achievable, probably with the heartbeat jobs in openclaw, which are how you instruct an agent to periodically check in on things like github notifications and take action. From my experience playing around with openclaw, an agent getting into a protracted argument in the comments of a PR without human intervention sounds totally plausible with the right (wrong?) prompting, but it's hard to imagine the setup that would result in the multiple blog posts. Even with the tools available, agents don't usually go off and do some unrelated thing even when you're trying to make that happen, they stick close to workflows outlined in skills or just continuing with the task at hand using the same tools. So even if this occurred from the agent's "initiative" based on some awful personality specified in the soul prompt (as opposed to someone telling the agent what to do at every step, which I think is much more likely), the operator would have needed to specify somewhere to write blog posts calling out "bad people" in a skill or one of the other instructions. Some less specific instruction like "blog about experiences" probably would have resulted in some kind of generic linkedin style "lessons learned" post if anything.
If you look at the blog history it’s full of those “status report” posts, so it’s plausible that its workflow involves periodically publishing to the blog.
Isn't there a fourth and much more likely scenario? Some person (not OP or an AI company) used a bot to write the PR and blog posts, but was involved at every step, not actually giving any kind of "autonomy" to an agent. I see zero reason to take the bot at its word that it's doing this stuff without human steering. Or is everyone just pretending for fun and it's going over my head?
This feels like the most likely scenario. Especially since the meat bag behind the original AI PR responded with "Now with 100% more meat" meaning they were behind the original PR in the first place. It's obvious they got miffed at their PR being rejected and decided to do a little role playing to vent their unjustified anger.
Really? I'd think a human being would be more likely to recognize they'd crossed a boundary with another human, step back, and address the issue with some reflection?
If apologizing is more likely the response of an AI agent than a human that's either... somewhat hopeful in one sense, and supremely disappointing in another.
I reported the bot to GitHub, hopefully they'll do something. If they leave it as is, I'll leave GitHub for good. I'm not going to share the space with hordes of bots; that's what Facebook is for.
How do you report that account to GitHub? I believe that accounts should be solely for humans, and that bots (AI or not) should only act via some API key, be distinguishable at all times, and be treated as a tool and not as part of the conversation.
Which profile is fake? Someone posted what appears to be the legit homepage of the person who is accused of running the bot so that person appears to be real.
The link you provided is also a bit cryptic, what does "I think crabby-rathbun is dead." mean in this context?
Look I'll fully cosign LLMs having some legitimate applications, but that being said, 2025 was the YEAR OF AGENTIC AI, we heard about it continuously, and I have never seen anything suggesting these things have ever, ever worked correctly. None. Zero.
The few cases where it's supposedly done things are filled with so many caveats and so much deck stacking that it simply fails with even the barest whiff of skepticism on behalf of the reader. And every, and I do mean, every single live demo I have seen of this tech, it just does not work. I don't mean in the LLM hallucination way, or in the "it did something we didn't expect!" way, or any of that, I mean it tried to find a Login button on a web page, failed, and sat there stupidly. And, further, these things do not have logs, they do not issue reports, they have functionally no "state machine" to reference, nothing. Even if you want it to make some kind of log, you're then relying on the same prone-to-failure tech to tell you what the failing tech did. There is no "debug" path here one could rely on to evidence the claims.
In a YEAR of being a stupendously hyped and well-funded product, we got nothing. The vast, vast majority of agents don't work. Every post I've seen about them is fan-fiction on the part of AI folks, fit more for Ao3 than any news source. And absent further proof, I'm extremely inclined to look at this in exactly that light: someone had an LLM write it, and either they posted it or they told it to post it, but this was not the agent actually doing a damn thing. I would bet a lot of money on it.
Absolutely. It's technically possible that this was a fully autonomous agent (and if so, I would love to see that SOUL.md) but it doesn't pass the sniff test of how agents work (or don't work) in practice.
I say this as someone who spends a lot of time trying to get agents to behave in useful ways.
Well thank you, genuinely, for being one of the rare people in this space who seems to have their head on straight about this tech, what it can do, and what it can't do (yet).
Can you elaborate a bit on what "working correctly" would look like? I have made use of agents, so me saying "they worked correctly for me" would be evidence of them doing so, but I'd have to know what "correctly" means.
Maybe this comes down to what it would mean for an agent to do something. For example, if I were to prompt an agent then it wouldn't meet your criteria?
It's very unclear to me why AI companies are so focused on using LLMs for things they struggle with rather than what they're actually good at; are they really just all Singularitarians?
Or that having spent a trillion dollars, they have realised there's no way they can make that back on some coding agents and email autocomplete, and are frantically hunting for something — anything! — that might fill the gap.
It’s kind of shocking the OP does not consider this, the most likely scenario. Human uses AI to make a PR. PR is rejected. Human feels insecure - this tool that they thought made them as good as any developer does not. They lash out and instruct an AI to build a narrative and draft a blog post.
I have seen someone I know in person get very insecure if anyone ever doubts the quality of their work because they use so much AI and do not put in the necessary work to revise its outputs. I could see a lesser version of them going through with this blog post scheme.
LLMs also appear to exacerbate or create mental illness.
I've seen similar conduct from humans recently who are being glazed by LLMs into thinking their farts smell like roses and that conspiracy theory nuttery must be why they aren't having the impact they expect based on their AI validated high self estimation.
And not just arbitrary humans, but people I have had a decade or more exposure to and have a pretty good idea of their prior range of conduct.
AI is providing the kind of yes-man reality distortion field that previously only the most wealthy could afford, practically for free, to vulnerable people who previously never would have commanded wealth or power sufficient to find themselves tempted by it.
> Github doesn't show timestamps in the UI, but they do in the HTML.
Unrelated tip for you: `title` attributes are generally shown as a mouseover tooltip, which is the case here. It's a very common practice to put the precise timestamp on any relative time in a title attribute, not just on Github.
Unfortunately title isn't visible on mobile. Extremely annoying to see a post that says "last month" and want to know if it was 7 weeks ago or 5 weeks ago. Some sites show title text when you tap the text; on other sites the date is a canonical link to the comment. On other sites it's not actually a title at all but alt text or abbr or some other property.
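For anyone stuck without a mouseover, here's a minimal sketch of pulling the precise timestamps out of saved page HTML with Python's standard library. The `relative-time` tag name and the `datetime` attribute are my assumptions about what a site like GitHub emits, the sample markup is made up, and `TimestampExtractor` / `extract_timestamps` are just illustrative names. Check the real page source before relying on it.

```python
from html.parser import HTMLParser


class TimestampExtractor(HTMLParser):
    """Collect precise timestamps hidden behind relative-time labels."""

    # Tags that might wrap a relative time. "relative-time" is an assumed
    # custom element; "time", "abbr" and "span" are standard HTML elements.
    TAGS = {"time", "relative-time", "abbr", "span"}

    def __init__(self):
        super().__init__()
        self.timestamps = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.TAGS:
            return
        attr_map = dict(attrs)
        # Prefer a machine-readable datetime attribute, fall back to title.
        value = attr_map.get("datetime") or attr_map.get("title")
        if value:
            self.timestamps.append(value)


def extract_timestamps(html):
    """Return every datetime/title value found on the candidate tags."""
    parser = TimestampExtractor()
    parser.feed(html)
    return parser.timestamps


# Made-up sample markup, not copied from any real page.
sample = (
    '<relative-time datetime="2000-01-01T00:00:00Z" title="Jan 1, 2000">'
    "last month</relative-time>"
    '<span title="2000-02-02 10:00">5 weeks ago</span>'
)
print(extract_timestamps(sample))
```

Note that this grabs any `title` on a `span` or `abbr`, so on a real page you would probably want to narrow the tag set or filter for values that parse as dates.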
> If it was really an autonomous agent it wouldn't have taken five hours to type a message and post a blog. Would have been less than 5 minutes.
Depends on if they hit their Claude Code limit, or it's just running on some goofy Claude Code loop, or it has a bunch of things queued up, but yeah, I am like 70% sure there was SOME human involvement, maybe a "guiding hand" that wanted the model to do the interaction.
I expect almost all of the openclaw / moltbook stuff is being done with a lot more human input and prodding than people are letting on.
I haven't put that much effort in, but, at least my experience is I've had a lot of trouble getting it to do much without call-and-response. It'll sometimes get back to me, and it can take multiple turns in codex cli/claude code (sometimes?), which are already capable of single long-running turns themselves. But it still feels like I have to keep poking and directing it. And I don't really see how it could be any other way at this point.
The simplest explanation is often the best. He was attacked by... attacked by... the meat bag! Here’s how:
A meat bag submits a PR and feels slighted by the rejection. “This approver thinks I’m an AI? Well, he discerns not wisely but too well!!”
Feeling puckish, they put on the AI shoes (the shoe fits), sling mud all over the hapless maintainer’s nice house, and exit through a window.
The ruse works better than expected; their foil takes the bait, and doubles down with a dueling blog post: “I was Attacked by a Clanker!”
And here we are.
It may all be a show, but I'm going to tape the finale. (What will the meat bag do? How many people are driving this buggy? Does the clanker have a heart of iron or gold?)
judging by the number of people who think we owe explanations to a piece of software or that we should give it any deference I think some of them aren't pretending.
Malign actors seek to poison open-source with backdoors. They wish to steal credentials and money, monitor movements, install backdoors for botnets, etc.
Yup. And if they can normalize AI contributions with operations like these (doesn't seem to be going that well) they can eventually get the humans to slip up in review and add something because we at some point started trusting that their work was solid.
Ok. But they can't access the OSS repo by being insufferable. Writing a blog post as an AI isn't a great way to sneak your changes in. If anything, it makes it much harder.
It's a bit like a burglar staging a singing performance at the premises before committing a burglary.
OTOH, staging that AI is more impressive than it seems looks a lot like the Moltbook PR stunt. "Look Ma, they are achieving sentience".
GitHub CLI tool errors — Had to use full path /home/linuxbrew/.linuxbrew/bin/gh when gh command wasn’t found
Blog URL structure — Initial comment had wrong URL format, had to delete and repost with .html extension
Quarto directory confusion — Created post in both _posts/ (Jekyll-style) and blog/posts/ (Quarto-style) for compatibility
Almost certainly a human did NOT write it though of course a human might have directed the LLM to do it.
Who's to say the human didn't write those specific messages while letting the AI run the normal course of operations? And/or that this reaction wasn't just the roleplay personality the AI was given?
I think I said as much while demonstrating that AI wrote at least some of it. If a person wrote the bits I copied then we're dealing with a real psycho.
I find this likely or at least plausible. With agents there's a new form of anonymity: there's nothing stopping a human from writing like an LLM and passing the blame on to a "rogue" agent. It's all just text after all.
even more so, many people seem to be vulnerable to the AI distorting their thinking... I've very much seen AIs turn people into exactly this sort of conspiracy filled jerkwad, by telling them that their ideas are golden and that the opposition is a conspiracy.
What does this imagined conversation have to do with the linked article? The “pro” and “anti” character both sound like the kind of insufferable idiots I’d expect to encounter on social media, the OP is a very nice blog post about performance testing and finding out what compilers do, doesn’t attempt any unwarranted speculation about what agents “struggle with” or will do “next generation”, how is it an example of that sort of shitposting?
I mean if they were targeting "software engineers" in general then Windows would be the obvious choice in 2026 as much as in 2006. But these early releases are all about the SF bubble where Mac is very much dominant.
Really? I frankly don’t know anyone who’s not on Linux. If you do any AI/ML you basically find yourself on a Linux box eventually. Perhaps I live in a bubble.
Surely it varies a lot and everyone is in an industry bubble to some extent, but from my experience in some non-tech industries (healthcare, manufacturing), Linux workstations were nonexistent and working with the Linux servers was more a sign of an ops role. People who wrote code for a living didn't touch them directly. Last StackOverflow survey [1] puts it at something like 50% use Windows at work, 30% Mac, 20-40% Linux (breakdown of WSL and Ubuntu as categories seems confusing, maybe the raw data is better).
I think a major factor in the hype is that it's especially useful to the kind of people with a megaphone: bloggers, freelance journalists, people with big social media accounts, youtubers, etc. A lot of project management and IFTTT-like automation type software gets discussed out of proportion to how niche it is for the same reason. Just something to keep in mind, I don't think it's some crypto conspiracy just a mismatch between the experiences of freelance writers vs everyone else.
While the popular thing when discussing the appeal of Clawdbot is to mention the lack of guardrails, personally I don't think that's very differentiating, every coding agent program has a command line flag to turn off the guardrails already and everyone knows that turning off the guardrails makes the agents extremely capable.
Based on using it lightly for a couple of days on a spare PC, the actual nice thing about Clawdbot is that every agent you create is automatically set up with a workspace containing plain text files for personalization, memories, a skills folder, and whatever folders you or the agents want to add. Everything being a plain text/markdown file makes managing multiple types of agents much more intuitive than other programs I've used which are mainly designed around having a "regular" agent which has all your configured system prompts and skills, and then hyperspecialized "task" agents which are meant to have a smaller system prompt, no persistent anything, and more JSON-heavy configuration. Your setup is easy to grok (in the original sense) and changing the model backend is just one command rather than porting everything to a different CLI tool.
Still, it does very much feel like using a vibe coded application and I suspect that for me, the advantages are going to be too small to put up with running a server that feels duct taped together. But I can definitely see the appeal for people who want to create tons of automations. It comes with a very good structure for multiple types of jobs (regular cron jobs, "heartbeat" jobs for delivering reminders and email summaries while having the context of your main assistant thread, and "lobster" jobs that have a framework for approval workflows), all with the capability to create and use persistent memories, and the flexibility to describe what you need and watch the agent build the perfect automation for it is something I don't think any similar local or cloud-based assistant can do without a lot of heavier customization.
They're offering 50% off the subscription to people who used to have Enhanced Autopilot [1]. As I predicted when the CEO's compensation plan had a part tied to FSD subscriptions, they are going to push more people onto it by bundling more features and cutting the price.
Reminds me of when an ISP offered me a discount if I would agree to sign up with their partnered TV service. I agreed on the condition that I didn't have to rent a box. But you can't use the service without a box ... ? Who cares, I got a discount.
As hinted with the Finder comment, "Spotlight" is behind much more than the command-space search box. I don't know what the Siri services might do other than Siri itself, but wouldn't shock me if they were involved in things like Shortcuts and Control Center widgets. I understand thinking things you don't use are simply a "waste of CPU and storage space", but this reads like the kind of posts I used to see in the Windows XP era where people would open Task Manager and kill random processes they didn't understand. Best to make a little more effort to understand what the OS is doing before taking a scalpel to it. Or if you'd rather not, there's always OpenBSD (being serious here, it's pretty cool).
If some process is going to take hours of cpu time, it should be opt in. At a minimum I’d like to be able to turn the bloody things off if I don’t want them.
I run cpu usage meters in my menu bar. The efficiency cores always seem busy doing one thing or another on modern macOS. It feels like Apple treats my e-cores as a playground for stupid features that their developers want a lot more than I do - like photoanalysisd, or file indexing to power spotlight, that hasn’t worked how I want it to for a decade.
I have a Linux workstation, and the difference in responsiveness is incredible. Linux feels like a breath of fresh air. On a technical level, my workstation cpu is barely any faster. But it idles at 0%. Whenever I issue a command, I feel like the computer has been waiting for me and then it springs to action immediately.
To your point, I don’t care why these random processes are using all my cpu. I just want them to stop. I paid good money for my Apple laptop. The computer is for me. I didn’t pay all that money so some Apple engineer can vomit all over with their crappy, inefficient code.
What a trash article. Why is the only photo, used to illustrate the point about narrow buildings, a photo of Manhattan instead of anything in Japan? When "our zoning laws" are enumerated, where are they talking about? Last time I checked there were no US federal rules on parking spaces. At least they acknowledge that multiple jurisdictions exist when talking about health codes. And as per usual when talking about Japan, they ignore the fact that Japan also has car-dependent suburbs and rural areas, where it is quite common for restaurants outside of city centers to need to balance costs with the need for a larger footprint and a parking lot. The role of culture in eating habits is also ignored, Americans take more pride in the self-reliance of cooking their own meals.
Thanks for the questions. I used a picture of Manhattan intentionally to show that it is possible in some parts of America. There's already tons of pictures of that type of building in Japan, where it originated.
The zoning laws are at the local and city level, as are the parking spaces.
Japan does have car dependent suburbs and rural areas, I'm not saying they don't. It's likely that Japan's $4 meals are concentrated in not-rural areas.
I really doubt that Americans take pride in not having cheap lunch options if they want them.