idonotknowwhy's comments

Last year's models were bolder. E.g., Sonnet-3.7 (thinking) got it right 10 times, without hedging:

>You should drive your car to the car wash. Even though it's only 50 meters away (which is very close), you'll need your car physically present at the car wash to get it washed. If you walk there, you'll arrive without your car, which wouldn't accomplish your goal of getting it washed.

>You'll need to drive your car to the car wash. While 50 meters is a very short distance (just a minute's walk), you need your car to actually be at the car wash to get it washed. Walking there without your car wouldn't accomplish your goal!

etc. The reasoning never second-guesses it either.

A shame they're turning it off in 2 days.


Yeah, it re-sends all the agent system prompts.


Yes, it does exactly that. It also sends other prompts like generating 3 options to choose from, prefilling a reply like 'compile the code', etc. (I can confirm this because I connect CC to llama.cpp and use it with GLM-4.7. I see all these requests/prompts in the llama-server verbose log.)

You can stop most of this with

export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1

And you might as well disable telemetry, etc.:

export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
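If you'd rather not export these globally, a minimal sketch is to wrap them in a shell function so they only apply to that one launch. The name `cc_quiet` is made up for illustration, and this assumes the CLI is on your PATH as `claude`; the two variable names are the ones mentioned above.

```shell
# Hypothetical wrapper (cc_quiet is an illustrative name, not part of Claude Code).
# The prefix assignments put both variables in the environment of this one
# `claude` invocation only, leaving the rest of the shell session untouched.
cc_quiet() {
  DISABLE_NON_ESSENTIAL_MODEL_CALLS=1 \
  CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
  claude "$@"
}
```

Put it in your shell profile and run `cc_quiet` instead of `claude`; any arguments are passed through unchanged.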

I also noticed every time you start CC, it sends off > 10k tokens preparing the different agents. So try not to close / re-open it too often.

source: https://code.claude.com/docs/en/settings


I would always close claude to start a new chat... Guess I should stop doing that. Thanks for bringing my attention to those two env vars.


Agreed. Not a single "this isn't just X, it's Y" in the entire article. Actually quite refreshing to read something written by a human for once.


Rehydrated version of what? And what does that mean?


100% agreed, and I've been explaining this to people for the past year.

I have an iPhone now and miss Firefox for Android (with uBlock, SponsorBlock, etc). But this painful restriction is the only thing stopping Chrome from becoming the new IE6.

At a few startups I've worked for, the devs all use Chrome exclusively, and only test in Chrome during development.

The only reason they consider other browsers is Safari on iOS. Sometimes it's driven by support calls / complaints from iOS users after a release. If Chrome's engine is allowed on iOS, that means support can just tell the users to install Chrome (like they do now if anyone has issues on Windows in other browsers). This means Firefox will usually work as well.

Many years ago, I was able to swap banks when my bank's website stopped working in Opera 12. If all the major banks / websites target Chrome-only, we'll have no choice but to use it. And then we'll have no control as Google push new restrictions into Chrome.


And the scripts of most recent YouTube videos, and the dialogue in Stranger Things Season 5 (the last 3 episodes specifically).


Just means we'll have to run another model in front of it, to filter out the ads


>This is great. Sonnet 4.5 has degraded terribly.

>I can get some useful stuff from a clean context in the web ui but the cli is just useless.

>I swear it was not that awful a couple of months ago.

I agree on all 3 counts. And it still degrades after a few long turns in Open WebUI. You can test this by regenerating the last reply in chats from shortly after the model was released.


I love logical posts like this. There are other factors like MXFP4 in gpt-oss, MLA in DeepSeek, etc.

>Amazon Bedrock serves Claude Opus 4.5 at 57.37

I checked the other Opus-4 models on bedrock:

Opus 4 - 18.56 tps

Opus 4.1 - 19.34 tps

So they changed the active parameter count with Opus 4.5


Good observation!

57.37 tps / 19.34 tps ≈ 2.97

This explains why Opus 4.1 is 3 times the price of Opus 4.5.
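For anyone who wants to redo the arithmetic, here's a rough sketch using the Bedrock figures quoted in this thread (57.37 tps for Opus 4.5, 19.34 tps for Opus 4.1). The variable names are just for illustration, and it only compares throughput; it doesn't prove anything about pricing or parameter counts.

```shell
# Throughput figures (tokens/second) as quoted in the thread.
opus_4_1_tps=19.34
opus_4_5_tps=57.37

# Shell arithmetic is integer-only, so let awk do the floating-point division.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
ratio=$(LC_ALL=C awk -v new="$opus_4_5_tps" -v old="$opus_4_1_tps" \
  'BEGIN { printf "%.2f", new / old }')

echo "Opus 4.5 throughput is about ${ratio}x that of Opus 4.1"
```

Swap in 18.56 for the Opus 4 figure to get the same comparison against Opus 4.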

