> I've had Claude Code write an entire unit/integration test suite in a few hours (300+ tests) for a fairly complex internal tool. This would take me, or many developers I know and respect, days to write by hand.
I have no problem believing that Claude generated 300 passing tests. I have a very hard time believing those tests were all well thought out, concise, and actually testing the desired behavior while communicating to the next person or agent how the system under test is supposed to work. I'd give very good odds that at least some of those tests are subtly testing themselves (e.g. mocking a function, calling said function, then asserting the mock was called). Many of them are probably also testing implementation details that were never intended to be part of the contract.
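To make that failure mode concrete, here's a hypothetical Python sketch of a "test that tests itself" (the `email_service` object and test name are invented for illustration):

```python
from unittest.mock import MagicMock

def test_welcome_email_is_sent():
    # The real email-sending code is replaced wholesale by a mock...
    email_service = MagicMock()
    # ...the test itself calls the mock...
    email_service.send("user@example.com", "Welcome!")
    # ...and then asserts the mock was called. Nothing from the
    # system under test ever runs, so this passes unconditionally.
    email_service.send.assert_called_once_with("user@example.com", "Welcome!")

test_welcome_email_is_sent()
```

A test like this stays green no matter how broken the production code is, because the only thing it exercises is its own wiring.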
I'm not anti-AI, I use it regularly, but all of these articles about how crazy productive it is skip over the crazy amount of supervision it needs. Yes, it can spit out code fast, but unless you're prepared to spend a significant chunk of that "saved" time CAREFULLY (more carefully than with a human) reviewing code, you've accepted a big drop in quality.
The benefit of having a team of QA engineers create tests is their differing perspectives, so with LLMs being trained to act like affirmation engines you have to wonder how that impacts the test cases they create. It's the problem of LLMs being miserable at critiques manifesting itself in a different way.
However, in saying that, I am by no means an AI hater, but rather I just want models to be better than they currently are. I am tired of the tech demos and benchmark stats that don't really mean much aside from impressing someone who's not in a critical thinking mindset.
Very similar experience here. I have not once managed to get an LLM to generate good tests, even for very simple code. It generally writes tautologies that will pass with high confidence.
Anecdotes etc. etc., but the AI tests I've been sent to review have been absolute shit. Stuff like tests that just check that calling a function doesn't crash the program, with no assertions other than "end of test method reached".
Yes, sometimes those tests are necessary, but it seemed to do this everywhere because it made the code coverage percentage go up, even though the tests were useless.
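For illustration, a hypothetical sketch of that pattern (`parse_config` is an invented stand-in for the code under test):

```python
def parse_config(text):
    # Invented stand-in: parses "key=value" lines into a dict.
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

def test_parse_config_does_not_crash():
    parse_config("host=localhost\nport=8080")
    # No assertions: "end of test method reached" is the entire check.
    # Coverage goes up, but a parser that returned garbage would still pass.

test_parse_config_does_not_crash()
```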
I have also had great experiences with AI cranking out straightforward boilerplate or asking C++ template metaprogramming questions. It's not all negative. But net-net it feels like it takes more work in total to use AI as you have to learn to recognize when it just won't handle the task, which can happen a lot. And you need to keep up with what it did enough to be able to take over. And reading code is harder than writing it.
I’ve seen agents produce plenty of those tests, but recently I’ve seen them generate some actually decent unit tests that I wouldn’t have thought of myself. It’s a bit of a crapshoot
I had CC write a bunch of tests to make sure some refactoring didn't break anything, and then I ran the app and it crashed out of the gate. Why? Because despite the verbosity of the tests, it turns out it had mocked the most important parts under test, so the _actual_ connections weren't being tested, and while CC was happy to claim victory with all tests green, the app was broken.
So you're openly saying you're fine with quantity over quality... in software engineering? That's fine for an MVP, maybe, but nothing beyond that IMHO, unless they're throwaway scripts.
There is exactly one "best" programmer in the world, and at this moment he/she is working on at most one project. Every other project in the world is accepting less than the "best" possible quality. Yes... in software engineering.
As soon as you sat down at the keyboard this morning, your employer accepted a sacrifice in quality for the sake of quantity. So did mine. Because neither one of us is the best. They could have hired someone better but they hired you and they're fine with that. They'd rather have the code you produce today than not have it.
It's the same for an AI. It could produce some code for you, right now, for nearly free. Would you rather have that code or not have it? It depends on the situation; not always, but sometimes it's worth having.
I didn't intend to imply "best" even in the scope of a team, let alone every software engineer in the world. But, I understand your point and it's fair.
Here is the thing, most software engineers are not designing rockets, they are making basic CRUD apps. If there is a minor defect it can be caught and corrected without much issue. Our jobs are a lot less "critical infrastructure" than a lot of software engineers will allow their egos to accept.
Sure, if you are making some medical surgery robot, do it right, but if you are making a website that recommends wine pairings, who cares if one of the buttons has a weird animation bug that doesn't even get noticed for a couple of years.
I think I'm one of "most" engineers, and I haven't ever worked on something that was "just" a CRUD app. Having a DB behind your web app doesn't make it "just" a CRUD app.
It's really overestimated how many simple apps exist.
I've worked on regular SaaS products of different kinds, cloud software, hosting software, etc., really representative of most of the web-enabled software out there.
For every one of them there has been an almost negligible amount of CRUD code; the meat of every one of those apps was very specific business logic. Some were also heavy on the frontend with an equal amount of complexity on the backend. As a senior/staff level engineer you also have to dive into other things like platform enablement, internal tooling, background jobs and data wrangling, distributed architectures, etc., which are even farther from CRUD.
Not to call you out but this is exactly what I meant when I said software engineers have egos that will not let them accept that they are not designing critical stuff.
Comparing your cloud-based CRUD app to a missile is a perfect illustration. There is no dishonor in admitting that our stuff isn't going to kill anyone if there is a bug. Don't write bad code, but also sometimes just getting something out the door is much better than perfect quality (bird in the hand and all that).
Banking software is critical, but guess what, most software engineers are not writing banking software. I never said no software engineers write critical code. Heck, I'd argue most will, at some point in their careers, write something that needs to be as bug-free as possible.
My point is that for most software engineering, getting a product out is more important than a super high quality bar that slows everything down.
If you are writing banking software or flight control systems please do it with care, if you are making some React based recipe website or something I don't really care (99% of software engineering falls into this latter category in my opinion).
Software engineers need to get over themselves a bit, AI really exposed how many were just getting by making repetitive junk and thinking they were special.
> most software engineers are not writing banking software
Many software engineers write software for people who won't like the idea that their request/case can be ignored/failed/lost, when expressed openly on the front page of your business offering. Are bookings important enough? Are gifts for significant events important? Maybe you're okay with losing my code commits every once in a while, I don't know. And I'm not sure why you think it's okay to spread this bad management idea of "not valuable or critical enough" among engineers who should know better and who should keep sources of bad ideas at bay when it comes to software quality in general.
Not to call you out either, but it seems you have really no idea what a basic CRUD app is. Which is fine, I guess not everyone likes to read the base definitions of these things. It's clear I replied to the wrong person, as we don't have a shared understanding of complexity.
I don't think zero unit tests is the right answer either. And if you actually take the time to read all 300 and cull the useless or overlapping ones, you've invested much more than 10% of the time it would have taken you.
Having a zillion unit tests (of questionable quality) is a huge pita when you try to refactor.
When I am writing unit tests (or other tests), I'm thinking about all the time I'll save by catching bugs early -- either as I write the test or in the future as regressions crop up. So to place too much importance on the amount of time invested now is missing the point, and makes me think that person is just going through the motions. Of course if I'm writing throwaway code or a POC, I'll probably skip writing tests at all.
In order to add coverage for scenarios that I haven't even thought of, I prefer fuzz testing. Then I get a lot more than 200-300 tests, and I don't even pretend to spend time reviewing the tests until they fail.
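As a minimal sketch of what that buys you: random inputs plus one invariant, instead of hand-enumerated cases. The run-length codec below is an invented example, and real fuzzers (Hypothesis, AFL, libFuzzer) are far more sophisticated about input generation:

```python
import random

def run_length_encode(s):
    # Invented example codec: "aaab" -> [("a", 3), ("b", 1)]
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append((s[i], j - i))
        i = j
    return out

def run_length_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

random.seed(0)
for _ in range(1000):  # 1000 generated cases, none written by hand
    s = "".join(random.choice("ab") for _ in range(random.randrange(20)))
    assert run_length_decode(run_length_encode(s)) == s  # round-trip invariant
```

The point is that you state one property the code must satisfy and let generated inputs probe the edge cases (empty string, long runs, alternations) you'd never bother writing individually.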
If you want to use an LLM to help expedite the typing of tests you have thought of, fine. If you just tell it to write the suite for itself, that's equivalent to hiring a (mediocre to bad) new grad and forcing them to write tests for you. If that's as good of an outcome as doing it yourself, I can only assume you are brand new to software engineering.
The main benefit of writing tests is that it forces the developer to think about what they just wrote and what it is supposed to do. I often will find bugs while writing tests.
I've worked on projects with 2,000+ unit tests that are essentially useless, often fail when nothing is wrong, and rarely detect actual bugs. It is absolutely worse than having 0 tests. This is common when developers write tests to satisfy code coverage metrics, instead of in an effort to make sure their code works properly.
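A hypothetical example of the kind of test that fails when nothing is wrong: it pins an internal call pattern rather than the observable result (`total_price` and the `tax` mock are invented for illustration):

```python
from unittest.mock import MagicMock

def total_price(items, tax_calc):
    # Invented code under test: subtotal plus tax on the subtotal.
    subtotal = sum(items)
    return subtotal + tax_calc.compute(subtotal)

def test_total_price():
    tax = MagicMock()
    tax.compute.return_value = 2
    assert total_price([10, 10], tax) == 22  # observable behavior: worth asserting
    tax.compute.assert_called_once_with(20)  # internal detail: breaks the moment tax
                                             # is computed per item or cached, even
                                             # though the total is still correct

test_total_price()
```

The first assertion survives any correct refactor; the second turns every harmless internal change into a red build, which is how suites end up "often failing when nothing is wrong".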
Hundreds of tests that were written basically for free in a few minutes even though a lot of them are kind of dumb?
Or hundreds of tests that were written for a five figure sum that took weeks or months, and only some of them are kind of dumb?
If you’re just thinking of code as the end in and of itself, then of course, the handcrafted artisanal product is better. If you think of code like an owner, an incidental expense towards solving a problem that has value, then cheap and disposable wins every time. We can throw our hands up about “quality” and all that, but that baby was thrown out with the bathwater a very, very long time ago. The modern Web is slower than the older web. Desktop applications are just web browsers. Enterprise software barely works. Windows 11 happened. I don’t think anybody even bothers to scrutinize their dependency chains except for, I don’t know, like maybe missile guidance or something. And I just want to say Claude is not responsible for any of this. You humans are.
Neither. Tests should be written by developers only when it saves them time. The cost of writing them should be negative.
Instead of writing hundreds of useless tests so that the code coverage report shows high numbers, it is better to write a couple dozen tests based on business needs and code complexity.
Having used Bentley software products I can tell you with complete certainty that professional software developers have extremely bad judgment when it comes to the need to test software and verify its functionality. Developers just think they know what they’re doing because there’s typically not a strong feedback mechanism that inflicts serious career damage when they do things that are extremely lazy or stupid or unethical. How many people lost their job or had to change their name and live out the rest of their days in Juarez Mexico over AWS’ incomprehensible configuration causing an internet brown out? Anyone? A teenager serves cold onion rings at a burger joint and he’s on the street. Some lazy dweeb at Amazon blows up the internet and - come on, isn’t it about the friends we made along the way? It’s obscene and the lack of professionalism and accountability is a total disgrace.