Hacker News: lelanthran's comments

That sounds like a belief, but not necessarily a religious belief.

>> Over a thousand pull requests merged each week at Stripe are completely minion-produced, and while they’re human-reviewed, they contain no human-written code

> I pity the senior engineer, demoted from a helmsman into a human breakwater, tasked to stand steady against an ever-swelling sea of AI slop.

I'm skeptical that the human-in-the-loop, whose only task is to read code, is going to be able to review at the rate that the AI can produce.

It's Undefined Behaviour, now in every language.


"Lisp free", "Emacs-like".

Pick one. You can't claim to have both of those in the same editor.


To be fair, the original EMACS did not have Lisp scripting, and neither do implementations like MicroEmacs. Are those not Emacs-like?

Iteration only matters when the feedback is used to improve.

Your model doesn't improve. It can't.


The magic of test-time inference is that the harness can improve even if the model is static. Every task outcome informs the harness.
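For what it's worth, the claim above can be reduced to a toy sketch. Everything here is hypothetical and purely illustrative: a frozen "model" whose behaviour never changes, wrapped in a harness that folds each task's outcome back into the next prompt. The improvement lives entirely in the harness.

```python
# Hypothetical sketch: a static model plus a harness that learns from outcomes.

def frozen_model(prompt: str) -> str:
    # Stand-in for a static LLM: deterministic, never retrained.
    return "PASS" if "hint" in prompt else "FAIL"

class Harness:
    def __init__(self):
        self.hints = []  # feedback accumulated across task runs

    def run(self, task: str) -> str:
        # The harness rewrites the prompt using everything it has learned so far.
        prompt = task + " " + " ".join(self.hints)
        result = frozen_model(prompt)
        if result == "FAIL":
            # Record feedback; the model stays frozen, only the harness changes.
            self.hints.append("hint: avoid the last failure mode")
        return result

h = Harness()
first = h.run("do the task")   # no accumulated feedback yet
second = h.run("do the task")  # same model, harness now carries a hint
print(first, second)
```

The point of contention in the thread is whether this kind of harness-side accumulation counts as the system "improving" when the model weights themselves never move.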

> The magic

Hilarious that you start with that, as TAO requires:

- Continuous adaptation makes it challenging to track performance changes and troubleshoot issues effectively.

- Advanced monitoring tools and sophisticated logging systems become essential to identify and address issues promptly.

- Adaptive models could inadvertently reinforce biases present in their initial training data or in ongoing feedback.

- Ethical oversight and regular audits are crucial to ensure fairness, transparency, and accountability.

Not much magic in there if it requires good old human oversight every step of the way, is there?


Goalposts wooshing by at maglev speed.

Of course it needs human supervision; see IBM, 1979. Oversight, however, doesn't mean the robots wait for approvals while doing R&D, and that's where the magic is: robots overseeing their own training and the improvement of their harnesses.

IOW, only the ethics and deployment decisions need to be gated by human decisions. The rest just chugs along: 1% a month, 1% a week, 1% a day…


> Your model can absolutely improve

How would that work, barring complete retraining or human-in-the-loop evals?

> Good news. Now we need Chinese manufacturers of DDR4 chipsets and motherboards.

Search aliexpress for X99 dual socket motherboards.


> I think most big tech companies are like this and it's just going to get worse as AI adoption increases internally.

Welcome to UB, at scale, in every language.

Everyone loves to complain about C (and C++) UB; well, now you have that in every language.

We're at the point now where my manually written (non-trivial) C projects contain fewer instances of undefined behaviour than even trivial projects constructed with an LLM and human "review".

(I even wrote a blog post about it!)


What does UB refer to here?

Undefined Behavior

The arch-nemesis of software engineering. The exceptionally exceptional exception. It doesn't throw; it glides. It festers. It waits until production day. It rears its head from the dead. The demon with 1000 names…

The guy who orders minus 1.5 beers lol

Kind of ironic, using a UT in a comment about undefined behavior.

> If a new situation arises, you won't be able to make the same choice as they would.

They won't be able to, but they won't need to either: they can just continue cribbing off the original person, and if that's no longer possible, they'll find someone else to crib off.

The point is, for all these people outsourcing their thinking, they will always have someone to crib off.


> But will he learn to read?

Of course he will, just not well. The point of the GP is that he doesn't need to learn anything because the AI can understand his verbal instructions.


> my 7 year old is now able to nerd out and create games using claude even though he's just barely learned to read

Humans learn mastery by doing, not by watching.

I suppose it comes down to whether the most important skill for your kid is to give instructions, or whether it is to actually read and write.

For reference, my kid only just turned 6, and is at the level of reading books without pictures. I'm kinda proud that he reads better, faster, and with more retention than kids aged 9, and it didn't come with the ease[1] with which "nerding out" on Claude came to your kid.

The question you gotta ask yourself is this: is a skill that takes a 7 year old a day to master really going to make him more valuable than a skill that took a 6 year old 2.5 years to master?

The 6yo who can read can easily do what your kid did, but your kid can't easily do what the 6yo can.

From another PoV: how valuable of a skill do you think "prompting" is when a 7yo who hasn't mastered reading can master it?

--------------------

[1] I started a daily routine when he was 3.5 with the DISTAR alphabet. We did the routine every day, whether it was Christmas or his birthday, even on vacation. Same time, every day.


> Sure, but you can read articles that predate LLMs which have the same so called tells.

Not with such a high frequency, though. We're looking at 1 tell per sentence!

