Hacker News | james_marks's comments

This is 100% the new bottleneck. We’re going to see a lot of agentic QA, E2E testing, etc. soon for this reason.

Because frameworks don’t have bugs? Or unpredictable dependency interactions?

This is generous, to say the least.


Well-maintained, popular frameworks have GitHub issues that frequently get resolved with newly patched versions of the framework. Sometimes bugs get fixed that you haven't even run into yet, so everybody benefits.

Will your bespoke LLM code have that? Every issue will actually be an issue in production experienced by your customers, that will have to be identified (better have good logging and instrumentation), and fixed in your codebase.


Frameworks that are (relatively) buggy and slow to address bugs lose popularity, to the point that people will spontaneously create alternatives. This has happened many times.

Basically, I want to set boundaries in a healthy frame of mind, and have that default respected when my self-control is lower because I’m tired, depressed, bored, etc.

“The algorithm” of social media is the opposite.


I think your reply has me convinced. You really can’t expect to have such self-control all of the time. Damn.

Yep. First thing I do for this kind of thing is make a preview=true flag so I don’t accidentally run destructive actions.

Now I like that idea as an environment variable that takes precedence over the command parameters.
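For what it's worth, here is a rough sketch of how the two ideas above might combine: a preview flag that defaults to true, plus an environment variable that wins over whatever was passed on the command line. Everything named here (the `PREVIEW` variable, the `--preview` option, `resolve_preview`, `delete_records`) is made up for illustration, not taken from any particular tool.

```python
import argparse
import os

# Hypothetical environment variable name; pick whatever fits your tooling.
ENV_VAR = "PREVIEW"


def parse_bool(text):
    """Turn common true/false spellings into a bool, rejecting anything else."""
    value = text.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {text!r}")


def resolve_preview(argv, environ):
    """Decide whether to run in preview mode.

    Order of precedence (an assumption of this sketch):
    1. the environment variable, if set
    2. the --preview command parameter
    3. the default, which is preview=True (the safe choice)
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--preview", type=parse_bool, default=True)
    args = parser.parse_args(argv)

    override = environ.get(ENV_VAR)
    if override is not None:
        return parse_bool(override)
    return args.preview


def delete_records(ids, preview):
    """Stand-in for a destructive action; in preview it only reports."""
    if preview:
        return [f"would delete {i}" for i in ids]
    return [f"deleted {i}" for i in ids]


if __name__ == "__main__":
    preview = resolve_preview(None, os.environ)
    for line in delete_records([1, 2, 3], preview):
        print(line)
```

With this shape, exporting `PREVIEW=true` in a shell session keeps every run non-destructive, even if a command copied from history includes `--preview=false`.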

The point is, if many 4-way stops don’t have traffic at them, a stop/start becomes a perfunctory, dangerous habit.

This is why the quality of my code has improved since using AI.

I can iterate on entire approaches in the same amount of time it would have taken to explore a single concept before.

But AI is an amplifier of human intent: I want a codebase that’s maintainable, scalable, etc., and that’s different from YOLO vibe coding. Vibe engineering, maybe.


My core uses are 100% racing the model in yolo mode to find a bug. I win most of the time but occasionally it surprises me.

Then also switching arch approaches quickly when I find some code strategies that aren't ergonomic. Splitting behaviors and other refactors are much lower cost now.


Even as a technical person, when I wanted to play with running models locally, LM Studio turned it into a couple of button clicks.

Without much background, you’re finding models, chatting with them, and have an OpenAI-compatible API with logging. Haven’t seen the new version, but LM Studio was already pretty great.


When this happens to me it makes me question my design.

If the AI doesn’t understand it, chances are it’s counter-intuitive. Of course not all LLMs are equal, etc., etc.


Then again, I have a rough idea of how I could implement this check with some (language-dependent) accuracy in a linter. With LLMs I... just hope and pray?

I'd agree with that, but in the JS world there are a lot of questionable library designs that are outside of my control.

I like this, but would note that each of these is effectively nagging you to do something.

I wonder if the real unlock is moving the task forward in some way. “I know you were interested in X, and the research approach petered out, here are some new approaches we could try:”

“You’ve got two kids’ birthdays next week, shall I order some legos?”


I've started using Claude Code to review my Linear tasks, add or propose new tags/labels, and flag whether something is a programming task (and if so, flesh out requirements so I can toss it to an agent). It really helps me to just toss everything into it and see what I've got.

I'm actually going to take it further and use clawd to check Jira, Linear, Slack, and Apple Reminders and help me unify and aggregate them, since I'll often remember something and record a reminder with Siri. It would ping me about these and adjust dates when they're overdue, so nothing slips too far past due.


You now own that project with all of the undiscovered bugs, missing features, and long-term maintenance.

Perhaps that’s a worthwhile trade, but you’re still bearing the cost in a different form.

