> that browsers do try their best to render objectively broken markup
And it's a cancerous engineering principle. People call NullPointerException the billion-dollar mistake, but (the misuse of) Postel's Law on the web frontend is a multi-billion-dollar one. Once one mainstream browser decided to "tolerate" an error, websites would start relying on that behavior, making it a permanent feature of the web.
If browsers were less tolerant, frontend development as a whole would be much smoother. The past decade of JavaScript framework farce would probably never have happened.
The proper way to deal with syntax errors is to build better tools: linters, interpreters, and compilers that emit clear error messages. Not trying to 'tolerate' errors.
Postel’s law allows a degree of forward compatibility. This was important before continuous software updates were practical. User-facing code is the best place to apply it: I want my text editor to highlight invalid source code on a best effort basis, whereas the compiler should promptly bail out.
The tolerance is now precisely specified in the HTML5 parsing algorithm, far from "try their best". This is good, because browsers fail in mostly the same ways as each other, humans do not need a CS degree to handwrite Web content, and your tools can still write perfectly valid HTML5.
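To see what "precisely specified" means in practice, here is a minimal sketch (assuming the html5lib package, a Python implementation of the HTML5 parsing algorithm, is installed): mis-nested markup is recovered into the same tree by every conforming parser.

    # Sketch: the HTML5 spec defines exactly how broken markup is recovered;
    # html5lib implements that algorithm (assumes `pip install html5lib`).
    from xml.etree import ElementTree
    import html5lib

    broken = "<p><b>bold text<p>next paragraph"
    tree = html5lib.parse(broken, namespaceHTMLElements=False)

    # The unclosed <b> and <p> are closed at spec-defined points, so any
    # conforming parser (browser or library) yields this same structure.
    print(ElementTree.tostring(tree, encoding="unicode"))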
Obviously things are better now than in the IE5/6 era, but I can't help but think that people without a CS degree have to hand-tweak HTML because the people with one failed to design proper tools and abstractions for them.
Thank you. You took the words right out of my mouth. I'd have made a similar comment if I weren't worried the strong words would get me flagged.
The whole article is weird af. How are tolerating XHTML syntax errors and tolerating different sexualities remotely comparable? The metaphor is stretched so thin that you can see the fallacies beneath.
I feel you, but people nowadays go to extreme lengths to present AI-generated artwork as hand-drawn.
It's not even funny. You can google "asamiarts tracing over AI" and read the whole drama. They have not only a timelapse but real-world footage as 'evidence.' And they are not the only case.
It's not a fight you can win. Either ignore the comments calling you AI or just use AI.
>You can google "asamiarts tracing over AI" and read the whole drama.
My life has come full circle now that gooner AI art tracer drama gets mentioned on HN. It's crazy the lengths they went to to try and cover it all up, including faked Procreate shots with botched UI elements.
Weird view of how capitalism works. They raise prices because they (believe they) can, and that's all. Prices are not tied to business costs. Even if all datacenters were subsidized by the government, these price hikes would still happen.
Microsoft is basically B2B now. Their customers are the ones who use Teams and Exchange. Those customers are locked in, with no real alternative to migrate to.
I meet people who seem to believe that a platonic fair price exists for each transaction, that it is knowable and even obvious to the seller, and that anyone who asks more is guilty of avarice.
What example do you need? In every single benchmark AI is getting better and better.
Before someone says "but benchmarks don't reflect the real world..." please name what metric you think is meaningful, if not benchmarks. Token consumption? OpenAI/Anthropic revenue?
Whenever I try and use a "state of the art" LLM to generate code it takes longer to get a worse result than if I just wrote the code myself from the start. That's the experience of every good dev I know. So that's my benchmark. AI benchmarks are BS marketing gimmicks designed to give the appearance of progress - there are tremendous perverse financial incentives.
This will never change because you can only use an LLM to generate code (or any other type of output) you already know how to produce and are expert at - because you can never trust the output.
W.r.t. code changes, especially small ones (say 50 lines spread across 5 files): if you can't get an agent to make nearly exactly the code changes you want, just faster than you would, that's a you problem at this point. If it would maybe take you 15 minutes, grok-code-fast-1 can do it in 2.
Right. With careful use of AI, I can use it to gather information that helps me make better designs (like summaries of the current best available frameworks or libraries for a given project), but as far as just generating an architecture and then generating the code and devops and so on for that? It's just not there, unless you're creating an app that effectively already exists, like some basic CRUD app.
If you're creating basic CRUDs, what on earth are you doing? That kind of thing should have been automated a long time ago.
CRUD apps are ridiculously simple and have been in existence my entire life. Yet it is surprisingly difficult to make a basic CRUD and host it somewhere. The bulk of useful but simple business apps are just a CRUD with a tiny bit of customisation and integration around them.
It is true that LLMs make it easier to build these kinds of things without having to become a competent programmer first.
AI is getting better at every benchmark. Please ignore that we're not allowed to see these benchmarks and also ignore that the companies in question are creating the benchmarks that are being exceeded.
What metrics that aren't controlled by industry show AI getting better? Genuinely curious, because those "ranking sites" seem to me to be infested with venture capital, so hardly fair or unbiased. The only reports I hear from academia are the ones that are overly negative on AI.
AI is very satisfied doing the job, just ask it.
AI is able to speed up progress, to give more resources, to give the most important thing people have: time. The fact that these incredible gifts are misused (or used inefficiently) is not a problem with AI. That would be like complaining that the objective positive of increased food production is actually a negative because people are getting fatter.
You misunderstood. This is how the conversation went:
1. Is there steady progress in AI?
2. What example do you need? In every single benchmark AI is getting better and better.
3. Job satisfaction and human flourishing.
Hence my answer "AI is very satisfied doing the job, just ask it". It came about because of the stupid comment 3, which tried to link together and assign blame to unrelated things (akin to referring to obesity when asked what metrics make them say that agriculture/transportation have made no progress in the last 100 years) and at the same time anthropomorphized AI. I only accepted the premise and continued answering on the same level in order to demonstrate the stupidity of their answer.
Wow. Icons in menus are so useful that I absolutely didn't expect this article to be complaining about them. They help me locate the item I'm trying to click tremendously.
Come on, could we get back to hating Cloudflare or something?
This very repo is just to "fix probability with more probability."
> The next time the agent runs, that rule is injected into its context. It essentially allows me to “Patch” the model’s behavior without rewriting my prompt templates or redeploying code.
What a brainrot idea... the whole post being written by an LLM is the icing on the cake.
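For context, the quoted "patch" mechanism presumably boils down to something like this sketch; the file name and function names below are my own assumptions for illustration, not the repo's actual API:

    # Hypothetical sketch of "rule injection": previously saved rules get
    # prepended to the agent's context on its next run.
    import json

    RULES_FILE = "rules.json"  # assumed store of learned rules, not the repo's real path

    def load_rules(path: str = RULES_FILE) -> list[str]:
        try:
            with open(path) as f:
                return json.load(f)
        except FileNotFoundError:
            return []

    def build_prompt(task: str) -> str:
        # Behavior is "patched" by editing rules.json, with no change to
        # prompt templates or deployed code.
        rules = "\n".join(f"- {r}" for r in load_rules())
        return f"Follow these rules:\n{rules}\n\nTask: {task}"

    print(build_prompt("Summarize the latest support tickets."))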
> We are trying to fix probability with more probability. That is a losing game.
> The next time the agent runs, that rule is injected into its context. It essentially allows me to “Patch” the model’s behavior without rewriting my prompt templates or redeploying code.
The first thing I do on Hacker News when there's an AI post is run to the comments for a good time. Then later I go back and read the actual article, and in this case, hoo boy, what a doozy. An AI-written summary of a seemingly not-vibe-coded Python library, written by a human being who apparently genuinely believes that you can fix LLM hallucinations with enough regular expressions.
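For anyone wondering what "fixing hallucinations with regular expressions" could even look like, it presumably amounts to something like the sketch below (my own guess, not the repo's actual code): a regex can reject malformed output, but it has no way to tell whether the content is true.

    import re

    # Hypothetical shape check: require the answer to contain a four-digit year.
    YEAR = re.compile(r"\b(18|19|20)\d{2}\b")

    def looks_plausible(llm_output: str) -> bool:
        # Verifies only the *shape* of the answer; a confidently
        # hallucinated year passes just as easily as the correct one.
        return bool(YEAR.search(llm_output))

    print(looks_plausible("Giraffatitan was erected in 2009."))  # True
    print(looks_plausible("Giraffatitan was erected in 1937."))  # True, but wrong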
It would be magnificent if this were satire. Wonderful.
> “Bag of words” is a also a useful heuristic for predicting where an AI will do well and where it will fail. “Give me a list of the ten worst transportation disasters in North America” is an easy task for a bag of words, because disasters are well-documented. On the other hand, “Who reassigned the species Brachiosaurus brancai to its own genus, and when?” is a hard task for a bag of words, because the bag just doesn’t contain that many words on the topic
It's... such a retrospective narrative. It's so obvious that the author learned about this example first and came up with the reasoning later, just to fit it into his view of LLMs.
Imagine if ChatGPT had answered this question correctly. Would that change the author's view? Of course not! They'd just say:
> “Bag of words” is a also a useful heuristic for predicting where an AI will do well and where it will fail. “Who reassigned the species Brachiosaurus brancai to its own genus, and when?” is an easy task for a bag of words, because the information has appeared in the words it memorizes.
I highly doubt this author predicted that a "bag of words" could do image editing before OpenAI released it.
I tested this with ChatGPT-5.1 and Gemini 3.0. Both correctly (according to Wikipedia at least) stated that George Olshevsky assigned it to its own genus in 1991.
This is because there are many words about how to do web searches.
Gemini 3.0 might do well even without web searches. The lesson from GPT-4.5 and Gemini 3 seems to be that scaling model size (even with a sparse MoE) allows you to capture more long-tail knowledge. Some of Humanity's Last Exam also seems to be explicitly designed to test this kind of long-tail, obscure knowledge extraction, and models have been steadily chipping away at it.
When sensitivity analysis of ordinary least-squares regression became a thing it was also a "retrospective narrative". That seems reasonable for detecting fundamental issues with statistical models of the world. This point generalizes even if the concrete example falls down.
Does it generalize, though? What can a bag-of-words metaphor say about a question like "How many reinforcement learning training examples does an LLM need to significantly improve performance on mathematical questions?"
Your conclusion seems super unfair to the author, particularly your assumption, without reason as far as I can tell, that they would obstinately continue to advocate for their conclusion in the face of new, contrary evidence.
I literally pasted the sentence as a prompt to the free version of ChatGPT: "Who reassigned the species Brachiosaurus brancai to its own genus, and when?"
and got this correct reply from the "bag of words":
The species Brachiosaurus brancai was reassigned to its own genus by Michael P. Taylor in 2009 — he transferred it to the new genus Giraffatitan.
How that happened:
Earlier, in 1988, Gregory S. Paul had proposed putting B. brancai into a subgenus as Brachiosaurus (Giraffatitan) brancai, based on anatomical differences.
Then in 1991, George Olshevsky used the name Giraffatitan brancai — but his usage was in a self-published list and not widely adopted.
Finally, in 2009 Taylor published a detailed re-evaluation showing at least 26 osteological differences between the African material (brancai) and the North American type species Brachiosaurus altithorax — justifying full generic separation.
If you like — I can show a short timeline of all taxonomic changes of B. brancai.
--
As an author, you should write things that are tested or at least true. But they did a pretty bad job of testing this and are making assumptions that are not true. They're then basing their argument/reasoning (retrospectively) on assumptions not grounded in reality.
I could not tell you who reassigned the species Brachiosaurus brancai to its own genus, and when, because of all the words I've ever heard, the combination of words that contains the information has not appeared.
GIGO has an obvious Nothing-In-Nothing-Out trivial case.
Isn't it pretty clear just from the first paragraph that the author has graphomania? Such people don't really care about the thesis; they care about the topic and how many literary devices they can fit into the article.
I don't know enough about graphomania, but I do find that this article, while I'm sure it was written by a human, has qualities akin to LLM writing: lengthy, forced comparisons and analogies. Of course it's far less organized than typical ChatGPT output, though.
The more human works I've read, the more I feel meat intelligences are not that different from tensor intelligences.
I didn't claim or think it was written with the help of an LLM; it was just written by someone who enjoys the feeling of being a writer, or even better, a Journalist!
This always contrasts with articles written by tech people and for tech people. They usually try to convey some information and maybe give some arguments for their position on some topic, but they are always concise and don't wallow in literary devices.
Do you only use /new of HN...?