More

lacker · 2026-02-12T21:53:43 1770933223

It seems tough culturally.

If you look at it from an outside point of view, right now Tesla is worth $1.6T, Waymo is worth $130B, and GM is worth $72B. If Cruise were actually a third viable competitor in this race, it would probably be worth more than the rest of GM. Self-driving is just a far more valuable business than car-making.

So from that point of view it would make sense to say, don't worry about the rest of GM too much, you should be willing to sacrifice all of that to increase the changes of making Cruise work.

It's hard to change the culture at a place like GM though. Does the GM CEO really want to take a huge amount of risk? Would they be willing to take a 50-50 shot where they either 10x the company's value or lose it all? Or would they prefer to pay a few billion dollars to avoid that risk.

Alive-in-2025 · 2026-02-12T23:05:01 1770937501

Using tesla valuation is not useful. It's a meme stock, has AI bs overvaluation over it. It's value is completely unconnected from reality. The car business is declining steadily. It's a good day when the famous CEO doesn't do something incredibly destructive to the brand name. It's just going down.

At the same time, if Musk went away, the stock would crash back to reality but a non-idiot leader could just do impossible, crazy, hard stuff, like ... working on obvious new models and basic steady improvements.

Tesla PE is 398 today (after a drop). Toyota's PE is 13. Toyota at the least is not hemoraging market share, sales, revenue, profits. Tesla is losing on all thoes things. Tesla would need a 30x price reduction to get down to much much more stable and profitable toyota. It's gets worse because Tesla's sales and profit keep going down each quarter.

There's no doubt value in self driving but the overall value is questionable. If there are many companies providing it, and at least waymo is doing great, plus there are many many other companies in China in good shape, the value multiple won't be there.

What's the market value of all taxi compannies combined in the us? It was about $230 billion in 2024 (https://www.skyquestt.com/report/taxi-market). Will tesla get 100% of the us self driving business in the future? No, waymo at least will be a serious market competitor, tesla's service doesn't really work.

Because there are going to be muiltiple competitors with working products (we'll see if/when tesla ever gets there), Tesla's huge valuation will never make sense. Robots are much farther behind than robotaxis (there's no brain, no prototype of a learning system, maybe one day).

This got way too long, I think GM just saw it as a money sink. I think that was a big mistake, though.

lacker · 2026-02-12T23:43:43 1770939823

It's funny to use "the market value of all taxi companies combined" as a proxy for how valuable the self-driving market will be, because that's exactly the reasoning that led people to underestimate Uber. The market value of all taxi companies combined was pretty small when Uber started.

That said, you could be right! Maybe self-driving will never be worth more than that. It's really hard to tell what business models will be like in the future. But this is the cultural mismatch, it seemed like GM leadership did not want to be in a risky business where they were betting billions of dollars on the success of self-driving. Clearly, to some people, that seemed like a really good bet to make. Time will tell.

Alive-in-2025 · 2026-02-14T06:18:08 1771049888

Of course a new genre defining company can do what existing companies did and vastly increase the market if they make it better, more efficient, easier whatever. All taxi companies are not all transportation needs, not even on the road. Busses, planes, etc. There will be really new market niches, how about if your RV lets you sleep comfortably and drives all night, you never need to pay or stop (if you can sleep in a moving vehicle), gas or charge itself up.

I was thinking it won't be just one company with this tech, so they'll compete and reduce the value of driverless cars down, by attacking the profit of each other. That would be healthier than having pseudo monopoly power because there are only 2 of them - like say ios and android are basically the world of cell phones, with a few very tiny other companies.

gruez · 2026-02-13T01:02:15 1770944535

>The market value of all taxi companies combined was pretty small when Uber started.

Were there even reliable metrics for this? They only seem small like car dealerships seem small - not of consolidation.

refulgentis · 2026-02-13T06:25:04 1770963904

Cosign, there's a reason it took forever and a day for Waymo to actually scale. It's great stuff, changed the way I live, but they gotta wince at the economics.

FarmerPotato · 2026-02-13T08:18:28 1770970708

Anecdote: living in an area where Waymo is becoming mainstream, but we've been seeing their mapping cars drive around streets for five years prior.

Eridrus · 2026-02-13T17:46:07 1771004767

Waymo has been attracting outside capital just fine.

I think the bigger issue is that Cruise was not succeeding at building the driver.

Cruise was shutdown after a safety incident, same as Uber.

mkozlows · 2026-02-13T19:26:00 1771010760

I feel like the bigger issue is that Cruise evidently had an unsafe company culture (like Uber): It wasn't just that they had an incident, it's that they lied about the incident and tried to cover things up.

This has been a pretty consistent pattern -- Cruise was always less transparent about its safety data than Waymo, and its claims tended to be opaque and non-measurable, whereas Waymo was partnering with insurance companies to get hard data.

Waymo is going to have incidents, too, but I think they have made the (correct) decision that being open and transparent about safety stuff is the way they move past those; Cruise made a decision in the opposite direction, and it killed them.

lacker · 2026-02-11T22:59:08 1770850748

In general electronics aren't recycled because people don't care about recycling them.

The easiest piece of electronics equipment to recycle is probably an iPhone. You can give an old iPhone to Apple and they will recycle it for free. But still most end-of-life iPhones are not recycled.

lacker · 2026-02-11T22:44:09 1770849849

It doesn't surprise me that Siri continues to be bad - Apple's current plan is to use a low-quality LLM to build a top-quality product, which turned out to be impossible.

What does surprise me is that Google Home is still so bad. They rolled out the new Gemini-based version, but if anything it's even worse than the old one. Same capabilities but more long-winded talking about them. It is still unable to answer basic questions like "what timer did you just cancel".

eitally · 2026-02-12T11:08:06 1770894486

From several engineer answers over a few years from inside Google, the consistent answer was that they created a highly fragmented ecosystem of devices over the years, almost none of which were capable of running the same software stack/versions, which led to an enormous mountain of technical debt and spaghetti code. There was a big effort a couple years ago to resolve this by creating a single new software version that would work on all modern devices and be supportable across future generations, but it also required (hah!) they essentially abandon (not brick, but just not really actively maintain or support) a plethora of older devices. So you have lots of consumers with either a mixed device environment where there's no consistency between their devices, or consumers who only have older devices that won't run the newer software and will be complaining about performance and reliability until they eventually give up and either abandon Google Home or buy a new device.

MrBuddyCasino · 2026-02-12T13:53:32 1770904412

Whats the problem? The LLM runs on a big fat server, the devices only have to call APIs, no?

paul7986 · 2026-02-12T05:11:58 1770873118

Indeed especially compared to chatGPT running so much better on my same iPhone where siri shits the bed. Voice transcription sucks in every aspects on my iPhone except surprise chatGPT gets what I am saying 90% of the time.

zmmmmm · 2026-02-12T07:47:40 1770882460

I can't even get gemini on my phone, configured as my assistant, to schedule a timer. It just googles the answer now or tells me "Gemini can't do that". 16 years ago it was doing that perfectly.

tonyedgecombe · 2026-02-12T08:40:22 1770885622

That's all I use Siri for. I'll be really cheesed off if this gets lost in the new version.

lacker · 2026-02-07T18:48:54 1770490134

It would be great if home assistants actually started to understand me personally. Not even in the sense of "what am I like", more like in the sense of, "when I ask what the weather is today, do I want a long lecture, or do I just want you to say high of 60 degrees, wear long sleeves".

lacker · 2026-01-31T02:12:07 1769825527

The problem with AI art is that it mostly sucks right now. Well, for "high art" - it can't write a novel, it doesn't create interesting artistic images. It's great for mocking up product UIs. And there are exceptions when an individual human puts a lot of work into it, for graphic art at least. Novels, it doesn't seem that close.

Yet.

I don't know if it will always stay this way, though. If one day I read a novel and I think, this is a great novel. I appreciated it, I felt myself growing from it. And then later I learn it was written by an AI. That's it, that will prove that great AI novels are possible. I will know it when I see it. I haven't seen it yet, but if it happens, I'll know.

So it's really just a technical question. Not a philosophical one.

jplusequalt · 2026-01-31T16:16:12 1769876172

>I don't know if it will always stay this way, though. If one day I read a novel and I think, this is a great novel. I appreciated it, I felt myself growing from it. And then later I learn it was written by an AI. That's it, that will prove that great AI novels are possible. I will know it when I see it. I haven't seen it yet, but if it happens, I'll know.

That's not what the essay is about. Sanderson spends the first half of the essay examining reasons for his strong feelings against AI. He also touches on the fact that he already struggles to discern generative AI from human art.

Eventually, he concludes that his real objection to generative AI has nothing to do with the quality, and everything to do with the process by which it was created. He believes (as do I) that focusing solely on the end product of generating a painting or a novel, robs would be artists of the valuable learning experience of failing repeatedly to create art, and then eventually rising past that failure to finish something. In this way, he thinks one of the real hallmarks of art is that it's transformative for the human who creates it, going so far to state that __humans are the art__ itself.

lubujackson · 2026-01-31T16:16:16 1769876176

As a writer and engineer, I don't see it.

Can AI kludge together a ripping story? Sure. But there is a reason people still write new books and buy new books - we crave the human connection and reflection of our current times and mores.

This isn't just a high art thing. My kids read completely different YA novels than I did, with just a few older canon titles persisting. I can hand them a book I loved as a kid and it just doesn't connect with them anymore.

How I think AI CAN produce art that people want is through careful human curation and guided generation. This is structurally the same as "human-in-the-loop" programming. We can connect to the artistry of the construction, in other words the human behind the LLM that influenced how the plot was structured, the characters developed and all the rest.

This is akin to a bad writer with a really good editor, or maybe the reverse. Either way, I think we will see a bunch of this and wring our hands because AI art is here, but I don't think we can ever take the human out of that equation. There needs to be a seed of "new" for us to give a shit.

jplusequalt · 2026-01-31T16:31:10 1769877070

Again, this article is not discussing the quality of generative AI. Sanderson clearly believes himself that AI is already able to produce things that are indiscernible to art from his eyes.

What this article is trying to get across is that art is a transformative process for the human who creates it, and by using LLMs to quickly generate results, robs the would be artist of the ability for that transformation to take place. Here's a quote from Sanderson:

"Why did I write White Sand Prime? It wasn’t to produce a book to sell. I knew at the time that I couldn’t write a book that was going to sell. It was for the satisfaction of having written a novel, feeling the accomplishment, and learning how to do it. I tell you right now, if you’ve never finished a project on this level, it’s one of the most sweet, beautiful, and transcendent moments. I was holding that manuscript, thinking to myself, “I did it. I did it."

lacker · 2026-01-31T02:02:40 1769824960

It's pretty common in California for cities to abuse the permitting process to extract money from homeowners. But on the other hand, these homeowners are getting subsidized by Prop 13. For a typical house in the Palisades bought 34 years ago ChatGPT estimates the subsidy is about $15,000/year. So, I have a little bit of sympathy but they're really on the benefitting end of California's various forms of tax craziness.

lacker · 2026-01-29T06:57:30 1769669850

If they actually worked right now, the demand would be high. Demand is certainly high for Waymos. Even if they worked worse than a Waymo I think the demand would still be very high. But it's hard to tell if (or when) it will work well enough to actually be a real product.

Deklomalo · 2026-01-29T10:49:53 1769683793

The question is what 'high' means in context of revenue.

Uber, the globally available taxi company, is valued 8 times less than tesla. If you are now able to kill all the costs for the taxi driving and reduce the cost for the car also, how much revenue is left?

Robotaxi has to be cheaper than a normal taxi to kill taxis. The margin of that company can't be that much more than a company like uber.

And uber itself will also invest in this, as every other car company. XPeng and co everyone who is building or working on this, will not just idly looking and waiting for tesla to just take 'whatever this cake' will look like.

For me it becomes a complet game changer if it becomes so reliable so extrem reliable, that i can order a car at night, a fresh bed / couch is then in the car and i can lie down while it drives me a few hundred kilometers away.

mustyoshi · 2026-01-29T14:24:48 1769696688

>Robotaxi has to be cheaper than a normal taxi to kill taxis. The margin of that company can't be that much more than a company like uber.

This just isn't true. If you're a woman, choosing a slightly more expensive robotaxi over a ride share where you might meet your end is a valid choice.

Deklomalo · 2026-01-29T14:30:30 1769697030

Robotaxis as a business has to be lower than a taxi as long as robotaxis are not known for being the better driver.

And woman have used taxis plenty of times especially because or for security. So I don't think your argument is very strong.

iamleppert · 2026-01-29T15:03:42 1769699022

At the end of the day, you're still trusting a misogynistic man to get you from point A to point B. One drives the car and works as a gig worker and wears a flannel shirt, and the other sits in an office at Waymo HQ, wears a patagonia vest. Both are still part of the patriarchy and have very little interest in making sure you're safe, unless there's money to be made.

542354234235 · 2026-01-29T16:40:04 1769704804

As much as I want to assume this is a trolling response, I'll pretend it is in good faith. The person you replied to is not speaking about nebulous dangers of "the patriarchy". They are talking about the risk of being verbally harassed, or physically/sexually assaulted by the driver during or directly after the ride.

https://www.wctv.tv/2026/01/14/rideshare-driver-arrested-aft...

https://www.wkrn.com/news/local-news/nashville/woman-shares-...

mdavid626 · 2026-01-30T08:24:35 1769761475

What the hell are you talking about?

apublicfrog · 2026-01-29T19:30:53 1769715053

> Robotaxi has to be cheaper than a normal taxi to kill taxis.

I'm not sure that's true. Self serve checkouts are killing the checkout. Washing machines killed the washing board. Something can be the same price or dearer if it's more convenient.

BadBadJellyBean · 2026-01-29T20:03:43 1769717023

That comparison has the problem that it is not comparable. A robo taxi is not much different from human taxi. I can not see much of an improvement for the rider. Whereas washing machines are an incredible time saver and self checkouts can be faster (especially if you use these little hand scanners).

1718627440 · 2026-02-01T21:31:54 1769981514

> self checkouts can be faster

Are you sure? They are great for reducing personnel cost for the shop operator, but a cashier scans so much faster than you do. If you want to optimize for speed, the human cashier would still be better.

vdm · 2026-01-29T15:00:26 1769698826

> has to be cheaper than a normal taxi

... plus 24/7 shifts of human drivers

LandoCalrissian · 2026-01-29T16:30:33 1769704233

Probably not a great strategy to piss off every blue voter in the country and then try to setup a business in cities.

riffraff · 2026-01-29T07:19:31 1769671171

that's why I said "Tesla's robotaxis".

They have not proven they are waymo level or near it, or that they will ever be there given the lack of lidar.

MetaWhirledPeas · 2026-01-29T15:45:53 1769701553

> Even if they worked worse than a Waymo I think the demand would still be very high.

They may already work better than a Waymo. It's hard to tell. It's certainly there using the public version of FSD. There's awkwardness, but the same can be said of Waymo. What I don't know is how many mandatory edge cases remain to be handled before they can set it free.

lacker · 2026-01-14T00:41:53 1768351313

In my experience, whenever you mandate open source software, you get software so unusable that it might as well be closed-source. Like, it doesn't compile, and they ignore all bug reports.

burnt-resistor · 2026-01-14T06:24:41 1768371881

That's a chicken-little, FUD argument because if it were required by law, then everyone would have to do it and be held to the same standard.

fithisux · 2026-01-14T07:16:15 1768374975

lacker · 2025-12-13T00:44:12 1765586652

I'm not sure if I have the right mental model for a "skill". It's basically a context-management tool? Like a skill is a brief description of something, and if the model decides it wants the skill based on that description, then it pulls in the rest of whatever amorphous stuff the skill has, scripts, documents, what have you. Is this the right way to think about it?

simonw · 2025-12-13T00:48:15 1765586895

It's a folder with a markdown file in it plus optional additional reference files and executable scripts.

The clever part is that the markdown file has a section in it like this: https://github.com/datasette/skill/blob/a63d8a2ddac9db8225ee...

  ---
  name: datasette-plugins
  description: "Writing Datasette plugins using Python and the pluggy plugin system. Use when Claude needs to: (1) Create a new Datasette plugin, (2) Implement plugin hooks like prepare_connection, register_routes, render_cell, etc., (3) Add custom SQL functions, (4) Create custom output renderers, (5) Add authentication or permissions logic, (6) Extend Datasette's UI with menus, actions, or templates, (7) Package a plugin for distribution on PyPI"
  ---

On startup Claude Code / Codex CLI etc scan all available skills folders and extract just those descriptions into the context. Then, if you ask them to do something that's covered by a skill, they read the rest of that markdown file on demand before going ahead with the task.

spike021 · 2025-12-13T05:27:33 1765603653

Apologies for not reading all of your blogs on this, but a follow-up question. Are models still prone to reading these and disregarding them even if they should be used for a task?

Reason I ask is because a while back I had similar sections in my CLAUDE.md and it would either acknowledge and not use or just ignore them sometimes. I'm assuming that's more of an issue of too much context and now skill-level files like this will reduce that effect?

jrecyclebin · 2025-12-13T05:48:37 1765604917

Skill descriptions get dumped in your system prompt - just like MCP tool definitions and agent descriptions before them. The more you have, the more the LLM will be unable to focus on any one piece of it. You don't want a bunch of irrelevant junk in there every time you prompt it.

Skills are nice because they offload all the detailed prompts to files that the LLM can ask for. It's getting even better with Anthropic's recent switchboard operator (tool search tool) that doesn't clutter the system prompt but tries to cut the tool list down to those the LLM will need.

ithkuil · 2025-12-13T10:32:47 1765621967

Can I organize skills hierarchically? If when many skills are defined, Claude Code loads all definitions into the prompt, potentially diluting its ability to identify relevant skills, I'd like a system where only broad skill group summaries load initially, with detailed descriptions loaded on-demand when Claude detects a matching skill group might be useful.

simonw · 2025-12-13T13:06:45 1765631205

There's a mechanism for that built into skills already: a skill folder can also include additional reference markdown files, and the skill can tell the coding agent to selectively read those extra files only when that information is needed on top of the skill.

There's an instruction about that in the Codex CLI skills prompt: https://simonwillison.net/2025/Dec/13/openai-codex-cli/

  If SKILL.md points to extra folders such as references/, load only the specific files needed for the request; don't bulk-load everything.

ithkuil · 2025-12-15T11:23:34 1765797814

yes but those are not quite new skills right?

can those markdown in the references also in turn tell the model to lazily load more references only if the model deems they are useful?

simonw · 2025-12-15T13:31:45 1765805505

Yes, using regular English prompting:

  If you need to write tests that mock
  an HTTP endpoint, also go ahead and
  read the pytest-mock-httpx.md file

greymalik · 2025-12-13T13:43:59 1765633439

> Anthropic's recent switchboard operator

I don’t know what this is and Google isn’t finding anything. Can you clarify?

Maxious · 2025-12-13T13:55:59 1765634159

https://platform.claude.com/docs/en/agents-and-tools/tool-us...

https://www.anthropic.com/engineering/advanced-tool-use talks more about the why

behnamoh · 2025-12-13T00:55:43 1765587343

why did this simple idea take so long to become available? I remember even in llama 2 days I was doing this stuff, and that model didn't even function call.

simonw · 2025-12-13T01:01:24 1765587684

Skills only work if you have a full blown code execution environment with a model that can run ls and cat and execute scripts and suchlike.

The models are really good at driving those environments now which makes skills the right idea at the right time.

jstummbillig · 2025-12-13T08:23:26 1765614206

Why do you need code execution envs? Could the skill not just be a function over a business process, do a then b then c?

steilpass · 2025-12-13T10:40:49 1765622449

Turns out that basic shell commands are a really powerful for context management. And you get tools which run in shells for free.

But yes. Other agent platforms will adopt this pattern.

true2octave · 2025-12-13T12:30:00 1765629000

I prefer to provide CLIs to my agent

I find it powerful how it can leverage and self-discover the best way to use a CLI and its parameters to achieve its goals

It feels more powerful than providing pre-defined set functions as MCP that will have less flexibility as a CLI

NiloCK · 2025-12-13T07:19:44 1765610384

I still don't really understand `skills` as ... anything? You said yourself that you've been doing this since llama 2 days - what do you mean by "become available"?

It is useful in a user-education sense to communicate that it's good to actively document useful procedures like this, and it is likely a performance / utilization boost that the models are tuned or prompt-steered toward discovering this stuff in a conventional location.

But honestly reading about skills mostly feels like reading:

> # LLM provider has adopted a new paradigm: prompts

> What's a prompt?

> You tell the LLM what you'd like to do, and it tries to do it. OR, you could ask the LLM a question and it will answer to the best of its ability.

Obviously I'm missing something.

baq · 2025-12-13T09:05:35 1765616735

It’s so simple there isn’t really more to understand. There’s a markdown doc with a summary/abstract section and a full manual section. Summary is always added to the context so the model is aware that there’s something potentially useful stored here and can look up details when it decides the moment is right. IOW it’s a context length management tool which every advanced LLM user had a version of (mine was prompt pieces for special occasions in Apple notes.)

kswzzl · 2025-12-13T02:30:32 1765593032

> On startup Claude Code / Codex CLI etc scan all available skills folders and extract just those descriptions into the context. Then, if you ask them to do something that's covered by a skill, they read the rest of that markdown file on demand before going ahead with the task.

Maybe I still don't understand the mechanics - this happens "on startup", every time a new conversation starts? Models go through the trouble of doing ls/cat/extraction of descriptions to bring into context? If so it's happening lightning fast and I somehow don't notice.

Why not just include those descriptions within some level of system prompt?

simonw · 2025-12-13T02:36:53 1765593413

Yes, it happens on startup of a fresh Claude Code / Codex CLI session. They effectively get pasted into the system prompt.

Reading a few dozen files takes on the order of a few ms. They add enough tokens per skill to fit the metadata description, so probably less than 100 for each skill.

raybb · 2025-12-13T04:21:40 1765599700

So when it says:

> The body can contain any Markdown; it is not injected into context.

It just means it's not injected into the context until the skill is used or it's never injected into the context?

https://github.com/openai/codex/blob/main/docs/skills.md

simonw · 2025-12-13T04:24:52 1765599892

Yeah, that means that the body of that file will not be injected into the context on startup.

I had thought that once the skill is selected the whole file would be read, but it looks like that's not the case: https://github.com/openai/codex/blob/ad7b9d63c326d5c92049abd...

  1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.

So you could have a skill file that's thousands of lines long but if the first part of the file provides an outline Codex may stop reading at that point. Maybe you could have a skill that says "see migrations section further down if you need to alter the database table schema" or similar.

wahnfrieden · 2025-12-13T05:20:20 1765603220

Knowing Codex, I wonder if it might just search for text in the skill file and read around matches, instead of always reading a bit from the top first.

debugnik · 2025-12-13T09:13:59 1765617239

Can models actually stream the file in as they see fit, or is "read only enough" just an attention trick? I suspect the latter.

true2octave · 2025-12-13T12:31:01 1765629061

Depends the agent, they can read in chunks (i.e.: 500 lines at a time)

kridsdale1 · 2025-12-13T05:01:36 1765602096

So it’s a header file. In English.

throwaway314155 · 2025-12-13T01:15:54 1765588554

Do skills get access to the current context or are they a blank slate?

simonw · 2025-12-13T01:18:14 1765588694

They execute within the current context - it's more that the content of the skill gets added to that context when it is needed.

leetrout · 2025-12-13T00:56:53 1765587413

Have you used AWS bedrock? I assume these get pretty affordable with prompt caching...

prescriptivist · 2025-12-13T01:53:37 1765590817

Skills have a lot of uses, but one in particular I like is replacing one off MCP server usage. You can use (or write) an MCP server for you CI system and then add the instructions to your AGENTS.md to query the CI MCP for build results for the current branch. Then you need to find a way to distribute the MCP server so the rest of the team can use it or cook it into your dev environment setup. But all you really care about is one tool in the MCP server, the build result. Or...

You can hack together a shell, python, whatever script that fetches build results from your CI server, dumps them to stdout in a semi structured format like markdown, then add a 10-15 line SKILL.md and you have the same functionality -- the skill just executes the one-off script and reads the output. You package the skill with the script, usually in a directory in the project you are working on, but you can also distribute them as plugins (bundles) that claud code can install from a "repository", which can just be a private git repo.

It's a little UNIX-y in a way, little tools that pipe output to another tool and they are useful in a standalone context or in a chain of tools. Whereas MCP is a full blown RPC environment (that has it's uses, where appropriate).

wiether · 2025-12-13T06:09:58 1765606198

How do you manage the credentials to requests your CI server in this case? They are hardcoded in the script associated to your SKILL?

true2octave · 2025-12-13T12:33:13 1765629193

Credentials are tied to the service principal of the user

It’s straightforward for cloud services

delaminator · 2025-12-13T08:47:00 1765615620

Claude Code is not very good at “remembering” its skills.

Maybe they get compacted out of the context.

But you can call upon them manually. I often do something like “using your Image Manipulation skill, make the icons from image.png”

Or “use your web design skill to create a design for the front end”

Tbh i do like that.

I also get Claude to write its own skills. “Using what we learned about from this task, write a skill document called /whatever/using your writing skills skill”

I have a GitHub template including my skills and commands, if you want to see them.

https://github.com/lawless-m/claude-skills

jorl17 · 2025-12-13T14:06:45 1765634805

I'm so excited for the future, because _clearly_ our technology has loads to improve. Even if new models don't come out, the tooling we build upon them, and the way we use them, is sure to improve.

One particular way I can imagine this is with some sort of "multipass makeshift attention system" built on top of the mechanisms we have today. I think for sure we can store the available skills in one place and look only at the last part of the query, asking the model the question: "Given this small, self-contained bit of the conversation, do you think any of these skills is a prime candidate to be used?" or "Do you need a little bit more context to make that decision?". We then pass along that model's final answer as a suggestion to the actual model creating the answer. There is a delicate balance between "leading the model on" with imperfect information (because we cut the context), and actually "focusing it" on the task at hand, and the skill selection". Well, and, of course, there's the issue of time and cost.

I actually believe we will see several solutions make use of techniques such as this, where some model determines what the "big context" model should be focusing on as part of its larger context (in which it may get lost).

In many ways, this is similar to what modern agents already do. cursor doesn't keep files in the context: it constantly re-reads only the parts it believes are important. But I think it might be useful to keep the files in the context (so we don't make an egregious mistake) at the same time that we also find what parts of the context are more important and re-feed them to the model or highlight them somehow.

Sammi · 2025-12-13T10:07:47 1765620467

I'm kinda confused about why this even is something that we need an extra feature for when it's basically already built in to the agentic development feature. I just keep a folder of md files and I add whatever one is relevant when it's relevant. It's kinda straight forward to do...

Just like you I don't edit much in these files on my own. Mostly just ask the model to update an md file whenever I think we've figured out something new, so the learning sticks. I have files for test writing, backend route writing, db migration writing, frontend component writing etc. Whenever a section gets too big to live in agents.md it gets it's own file.

jorl17 · 2025-12-13T14:10:55 1765635055

Because the concept of skills is not tied to code development :) Of course if that's what you're talking about, you are already very close to the "interface" that skills are presented in, and they are obvious (and perhaps not so useful)

But think of your dad or grandma using a generic agent, and simply selecting that they want to have certain skills available to it. Don't even think of it as a chat interface. This is just some option that they set in their phone assistant app. Or, rather, it may be that they actually selected "Determine the best skills based on context", and the assistant has "skill packs" which it periodically determines it needs to enable based on key moments in the conversation or latest interactions.

These are all workarounds for the problems of learning, memory...and, ultimately, limited context. But they for sure will be extremely useful.

delaminator · 2025-12-13T18:36:05 1765650965

It’s a formalisation of the method, and it’s in your global ~/.claude and also per project.

I have mine in a GitHub template so I can even use them in Claude Code for the web. And synchronise them across my various machine (which is about 6 machines atm).

marwamc · 2025-12-13T04:13:06 1765599186

My understanding is this: A skill is made up of SKILL.md which is what tells claude how and when to use this skill. I'm a bit of a control freak so I'll usually explicitly direct claude to "load the wireframe-skill" and then do X.

Now SKILL.md can have references to more finegrained behaviors or capabilities of our skill. My skills generally tend to have a reference/{workflows,tools,standards,testing-guide,routing,api-integration}.md. These references are what then gets "progressively loaded" into the context.

Say I asked claude to use the wireframe-skill to create profileView mockup. While creating the wireframe, claude will need to figure out what API endpoints are available/relevant for the profileView and the response types etc. It's at this point that claude reads the references/api-integration.md file from the wireframe skill.

After a while I found I didn't like the progressive loading so I usually direct claude to load all references in the skill before proceeding - this usually takes up maybe 20k to 30k tokens, but the accuracy and precision (imagined or otherwise ha!) is worth it for my use cases.

kxrm · 2025-12-13T05:22:47 1765603367

> I'm a bit of a control freak so I'll usually explicitly direct claude to "load the wireframe-skill" and then do X.

You shouldn't do this, it's generally considered bad practice.

You should be optimizing your skill description. Often times if I am working with Claude Code and it doesn't load I skill, I ask it why it missed the skill. It will guide me to improving the skill description so that it is picked up properly next time.

This iteration on skill description has allowed skills to stay out of context until they are needed rather predictably for me so far.

adastra22 · 2025-12-13T08:01:03 1765612863

There are different ways to use the tool. If you chat with the model, you want it to naturally pick the right tool to use based on vibes and context so you don’t have to repeat yourself. If you are plugging a call it Claude code within a larger, structured workflow, you want the tool selection to be deterministic.

rane · 2025-12-13T12:30:04 1765629004

It's not enough. Sometimes skills just randomly won't be invoked.

chrisweekly · 2025-12-13T13:19:07 1765631947

My understanding is that use of "description" frontmatter is essential, bc Claude Code can read just the description without loading the entire file into context.

taytus · 2025-12-13T09:42:24 1765618944

Easy, let me try to explain: You want to achieve X, so you ask your AI companion, "How do I do X?" Your companion thinks and tries a couple of things, and they eventually work. So you say, "You know what, next time, instead of figuring it out, just do this"... that is a skill. A recipe for how to do things.

jmalicki · 2025-12-13T00:53:32 1765587212

Yes. I find these very useful for enforcing e.g. skills like debugging, committing code, make prs, responding to pr feedback from ai review agents, etc. without constantly polluting the context window.

So when it's time to commit, make sure you run these checks, write a good commit message, etc.

Debugging is especially useful since AI agents can often go off the rails and go into loops rewriting code - so it's in a skill I can push for "read the log messages. Inserting some more useful debug assertions to isolate the failure. Write some more unit tests that are more specific." Etc.

canadiantim · 2025-12-13T00:47:16 1765586836

I think it’s also important to think of skills in the context of tasks, so when you want an agent to perform a specialized task, then this is the context, the resources and scripts it needs to perform the task.

hadlock · 2025-12-13T07:42:25 1765611745

I'm excited to use this with the Ghidra cli mode to rapidly decompile physics engines from various games. Do I want my flight simulator to behave like the Cessna like in flight simulator 3.0 in the air? Codex can already do that. Do I want the plane to handle like Yoshi from Mario Kart 64 when taxiing? It hasn't been done yet but Claude code is apparently pretty good at pulling apart n64 roms so that seems within the realm of possibility.

lacker · 2025-12-12T23:39:49 1765582789

At first this sounds cool but I feel like it falls apart with a basic example.

Let's say you're running a simple e-commerce site. You have some microservices, like, a payments microservice, a push notifications microservice, and a logging microservice.

So what are the dependencies. You might want to send a push notification to a seller when they get a new payment, or if there's a dispute or something. You might want to log that too. And you might want to log whenever any chargeback occurs.

Okay, but now it is no longer a "polytree". You have a "triangle" of dependencies. Payment -> Push, Push -> Logs, Payment -> Logs.

These all just seem really basic, natural examples though. I don't even like microservices, but they make sense when you're essentially just wrapping an external API like push notifications or payments, or a single-purpose datastore like you often have for logging. Is it really a problem if a whole bunch of things depend on your logging microservice? That seems fine to me.

Alupis · 2025-12-12T23:53:28 1765583608

Is your example really a "triangle" though? If you have a broker/queue, and your services just push messages into the ether, there's no actual dependency going on between these services.

Nothing should really depend on your logging service. They should push messages onto a bus and forget about them... ie. aren't even aware of the logging service's existence.

seanhunter · 2025-12-13T11:15:48 1765624548

That example is still an undirected cycle so not a polytree and so, by the reasoning of the author of tfa not kosher for reasons they don’t really explain.

Honestly I think the author learned a bit of graph theory, thought polytrees are interesting and then here we are debating the resulting shower thought that has been turned into a blog post.

whstl · 2025-12-15T02:04:34 1765764274

That was my impression as well. There's pretty much no argument for why a DAG is worse than a polytree.

Scubabear68 · 2025-12-12T23:44:41 1765583081

I don’t understand why you would have a logging microservice vs just having a library that provides logging that is used wherever you need logging.

ericmcer · 2025-12-12T23:49:11 1765583351

Only good reason would be for bulk log searching, but a lot of cloud providers will already capture and aggregate and let you query logs, or there are good third party services that do this.

Pretty handy to search a debug_request_id or something and be able to see every log across all services related to a request.

yunwal · 2025-12-13T03:12:34 1765595554

> but a lot of cloud providers will already capture and aggregate and let you query logs

This is just the cloud provider taking the dependency on their logging service for you. It doesn’t change the shape of the graph.

gpm · 2025-12-13T01:36:39 1765589799

Logs need to go somewhere to be collected, viewed, etc. You might outsource that, but if you don't it's a service of it's own (probably actually a collection of microservices, ingestion, a web server to view them, etc)

Scubabear68 · 2025-12-13T15:48:01 1765640881

In my experience this is best done as an out of band flow in the background eg one of the zillion services that collect and aggregate logs.

sebastianconcpt · 2025-12-13T15:35:19 1765640119

The issue is that one of the services is the events hub for the rest to remain in loose coupling (observer pattern).

The criticality of Kafka or any event queue/streams is that all depend on it like fish on having the ocean there. But between fishes, they can stay acyclicly dependent.