More

feastingonslop · 2026-02-12T06:28:56 1770877736

Nor does it stop self-respecting LLMs.

feastingonslop · 2026-02-11T05:24:40 1770787480

LLMs write superior code. Maybe they learned from humans, but they have seen further while standing on those shoulders.

lelanthran · 2026-02-11T05:40:49 1770788449

> LLMs write superior code.

How would they know what superior code is? They're trained on all code. My expectation and experience has been that they write median code in the best-case scenario (small greenfields projects, deeply specified, etc).

yoyohello13 · 2026-02-11T06:07:32 1770790052

The AI maximalist are really out in force in this thread.

pinkgolem · 2026-02-11T05:27:21 1770787641

Then you have access to better models then I do (4.6/5.3)

The code is mostly not bad, but most programmers i have worked with write far better code.

feastingonslop · 2026-02-11T05:14:58 1770786898

If the LLM writes better code (and it will) then its code should be favored over a library.

mikert89 · 2026-02-11T05:25:42 1770787542

People aren’t mentally prepared for the LLMs spitting out flawless code

yoyohello13 · 2026-02-11T06:11:11 1770790271

Because it’s not a forgone conclusion that it’s going to happen in our lifetime.

feastingonslop · 2026-02-11T04:47:46 1770785266

Why would you not?

thesmtsolver2 · 2026-02-11T05:04:13 1770786253

Libraries are more well tested. Unless you "regurgitate code in a library" as "not importing any libraries".

feastingonslop · 2026-02-11T05:27:51 1770787671

What value is there in tested code when you have the option of superlative, flawless code?

kristjansson · 2026-02-11T05:54:53 1770789293

How do you determine flawlessness? How you even approximate a guarantee of it? To what specification is flawlessness judged, and can you precisely and completely relay that spec to your friendly local robot more efficiently than it can vendor / import existing libraries?

feastingonslop · 2026-02-11T04:12:47 1770783167

It doesn’t mess up. Not any more.

paulryanrogers · 2026-02-13T23:33:42 1771025622

Claude, ChatGPT, and Gemini still make mistakes IME. What are you using?

feastingonslop · 2026-02-09T07:50:47 1770623447

> Everyone can treat the project as a black box, and focus on observable changes to the project's behavior.

This is a good thing. I don’t need to focus on oil refineries when I fill my car with gas. I don’t know how to run a refinery, and don’t need to know.

locknitpicker · 2026-02-09T12:09:54 1770638994

> This is a good thing. I don’t need to focus on oil refineries when I fill my car with gas. I don’t know how to run a refinery, and don’t need to know.

No, it isn't. If you do not know what the project does, you are unable to even review the changes proposed by a coding agent.

Consequently, we already have a long track record of LLMs introducing vulnerabilities.

LLMs are great at speeding up small tasks but the trade-off is that developers become progressively clueless.

abenga · 2026-02-09T09:44:16 1770630256

Someone somewhere knows though.

feastingonslop · 2026-02-07T21:28:06 1770499686

The code itself does not matter. If the tests pass, and the tests are good, then who cares? AI will be maintaining the code.

nine_k · 2026-02-07T22:02:29 1770501749

Next iterations of models will have to deal with that code, and it would be harder and harder to fix bugs and introduce features without triggering or introducing more defects.

Biological evolution overcomes this by running thousands and millions of variations in parallel, and letting the more defective ones to crash and die. In software ecosystems, we can't afford such a luxury.

varispeed · 2026-02-07T22:20:23 1770502823

An example: it had a complete interface to a hash map. The task was to delete elements. Instead of using the hash map API, it iterated through the entire underlying array to remove a single entry. The expected solution was O(1), but it implemented O(n). These decisions compound. The software may technically work, but the user experience suffers.

feastingonslop · 2026-02-07T22:30:45 1770503445

If you have particular performance requirements like that, then include them. Test for them. You still don’t have to actually look at the code. Either the software meets expectations or it doesn’t, and keep having AI work at it until you’re satisfied.

varispeed · 2026-02-07T23:19:29 1770506369

How deep do you want to go? Because reasonable person wouldn't have expected to hand hold AI(ntelligence) to that level. Of course after pointing it out, it has corrected itself. But that involved looking at the code and knowing the code is poor. If you don't look at the code how would you know to state this requirement? Somehow you have to assess the level of intelligence you are dealing with.

feastingonslop · 2026-02-07T23:39:19 1770507559

Since the code does not matter, you wouldn’t need or want to phrase it in terms of algorithmic complexity. You surely would have a more real world requirement, like, if the data set has X elements then it should be processed within Y milliseconds. The AI is free to implement that however it likes.

sarchertech · 2026-02-08T01:03:09 1770512589

Even if you specify performance ranges for every individual operation, you can’t specify all possible interactions between operations.

If you don’t care about the code you’re not checking in the code, and every time you regenerate the code you’re going to get radically different system performance.

Say you have 2 operations that access some data and you specify that each can’t take more than 1ms. Independently they work fine, but when a user runs B then A immediately, there’s some cache thrashing that happens that causes them to both time out. But this only happens in some builds because sometimes your LLM uses a different algorithm.

This kind of thing can happen with normal human software development of course, but constantly shifting implementations that “no one cares about” are going to make stuff like this happen much more often.

There’s already plenty of non determinism and chaos in software, adding an extra layer of it is going to be a nightmare.

The same thing is true for every single implementation detail that isn’t in the spec. In a complex system even implementation details you don’t think you care about become important when they are constantly shifting.

flyinglizard · 2026-02-07T21:35:12 1770500112

That's assuming no human would ever go near the code, and that over time it's not getting out of hand (inference time, token limits are all a thing), and that anti-patterns don't get to where the code is a logical mess which produces bugs through a webbing of specific behaviors instead of proper architecture.

However I guess that at least some of that can be mitigated by distilling out a system description and then running agents again to refactor the entire thing.

sarchertech · 2026-02-07T21:48:40 1770500920

> However I guess that at least some of that can be mitigated by distilling out a system description and then running agents again to refactor the entire thing.

The problem with this is that the code is the spec. There are 1000 times more decisions made in the implementation details than are ever going to be recorded in a test suite or a spec.

The only way for that to work differently is if the spec is as complex as the code and at that level what’s the point.

With what you’re describing, every time you regenerate the whole thing you’re going to get different behavior, which is just madness.

flyinglizard · 2026-02-08T02:45:43 1770518743

You could argue that all the way down to machine code, but clearly at some point and in many cases, the abstraction in a language like Python and a heap of libraries is descriptive enough for you not to care what’s underneath.

sarchertech · 2026-02-08T03:55:31 1770522931

The difference is that what those languages compile to is much much more stable than what is produced by running a spec through an LLM.

Python or a library might change the implementation of a sorting algorithm once in a few years. An LLM is likely to do it every time you regenerate the code.

It’s not just a matter of non-determinism either, but about how chaotic LLMs are. Compilers can produce different machine code with slightly different inputs, but it’s nothing compared to how wildly different LLM output is with very small differences in input. Adding a single word to your spec file can cause the final code to be unrecognizably different.

feastingonslop · 2026-02-07T21:45:59 1770500759

And that is the right assumption. Why would any humans need (or even want) to look at code any more? That’s like saying you want to go manually inspect the oil refinery every time you fill your car up with gas. Absurd.

flyinglizard · 2026-02-08T02:48:23 1770518903

Cars may be built by robots but they are maintained by human technicians. They need a reasonable layout and a service manual. I can’t fathom (yet) having an important codebase - a significant piece of a company’s IP - that is shut off to engineers for auditing and maintenance.

vb-8448 · 2026-02-07T23:54:53 1770508493

Tests don't cover everything. Performance? Edge cases? Optimization of resource usage are not tipically covered by tests.

AstroBen · 2026-02-08T00:12:38 1770509558

Humans not caring about performance is so common we have Wirth's law

But now the clankers are coming for our jobs suddenly we're optimization specialists

sarchertech · 2026-02-08T00:40:52 1770511252

It’s not about optimizing for performance, it’s about non-deterministic performance between “compiler” runs.

The ideal that spec driven developers are pushing towards is that you’d check in the spec not the code. Anytime you need the code you’d just regenerate it. The problem is different models, different runs of the same model, and slightly different specs will produce radically different code.

It’s one thing when your program is slow, it’s something completely different when your program performance varies wildly between deployments.

This problem isn’t limited to performance, it’s every implicit implementation detail not captured in the spec. And it’s impossible to capture every implementation detail in the spec without the spec being as complex as the code.

AstroBen · 2026-02-08T00:59:37 1770512377

I made a very similar comment to this just today: https://news.ycombinator.com/item?id=46925036

I agree, and I didn't even fully consider "recompiling" would change important implementation details. Oh god

This seems like an impossible problem to solve? Either we specify every little detail, or AI reads our minds

sarchertech · 2026-02-08T01:28:03 1770514083

I don’t think it is possible to solve without AGI. I think LLMs can augment a lot of software development tasks, but we’ll still need to understand code until they can completely take over software engineering. Which I think requires an AI that can essentially take over any job.

feastingonslop · 2026-02-07T18:29:40 1770488980

Nobody is saying to skip testing the software. Testing is still important. What the code itself looks like, isn’t.

JackSlateur · 2026-02-07T23:42:32 1770507752

Knowing the inner working of a complex system is a hard requirement of its testing.

harrisi · 2026-02-08T10:00:32 1770544832

Testing is not a proof a software system is correct. Also, if tests are generated as well, there's no trust in how anything works or if the tests are covering important aspects.

feastingonslop · 2026-02-07T16:12:50 1770480770

And there was a time when using libraries and frameworks was the right thing to do, for that very reason. But LLMs have the equivalent of way more experience than any single programmer, and can generate just the bit of code that you actually need, without having to include the whole framework.

trescenzi · 2026-02-07T16:28:51 1770481731

As someone who’s built a lot of frontend frameworks this isn’t what I’ve found. Instead I’ve found that you end up with the middle ground choice which while effective is no better than the externally maintained library of choice. The reason to build your own framework is so it’s tailor suited to your use cases. The architecting required to do that LLMs can help with but you have to guide them and to guide them you need expertise.

plagiarist · 2026-02-07T16:33:48 1770482028

I would like a more reliable way to activate this "way more experience."

What I see in my own domain I often recognize as superficially working but flawed in various ways. I have to assume the domains I am less familiar are the same.

koverstreet · 2026-02-07T17:42:51 1770486171

Claude's a smart junior engineer who's read a lot of books but is lacking in real word experience.

It definitely eliminates a lot of tedium, but needs a lot of guidance if you want good results.

leecommamichael · 2026-02-07T16:46:01 1770482761

> can generate just the bit of code that you actually need

Design is the key. Codebases (libraries and frameworks not exempt,) have a designed uniformity to them. How does a beginner learn to do this sort of design? Can it be acquired completely by the programmer who uses LLMs to generate their code? Can it be beneficial to recognize opinionated design in the output of an LLM? How do you come to recognize opinion?

In my personal history, I've worked alongside many programmers who only ever used frameworks. They did not have coding design sensibilities deeper than a social populist definition of "best practice." They looked to someone else to define what they can or cannot do. What is right to do.

mnicky · 2026-02-07T16:19:42 1770481182

Critically, they will also enable faster future migration to a framework in case it proves useful.

feastingonslop · 2026-02-07T13:54:06 1770472446

I don’t understand the interest in “quality code.” I never need to look at the code itself. I just make sure it runs right.

acbart · 2026-02-07T13:55:16 1770472516

It makes it easier to make sure it runs right. Code that is easier to make sure is quality code. Code that is hard to make sure is not quality code.