Last year's models were bolder. E.g. Sonnet-3.7 (thinking) got it right 10 times, without hedging:
>You should drive your car to the car wash. Even though it's only 50 meters away (which is very close), you'll need your car physically present at the car wash to get it washed. If you walk there, you'll arrive without your car, which wouldn't accomplish your goal of getting it washed.
>You'll need to drive your car to the car wash. While 50 meters is a very short distance (just a minute's walk), you need your car to actually be at the car wash to get it washed. Walking there without your car wouldn't accomplish your goal!
etc. The reasoning never second-guesses it either.
Yes, it does exactly that. It also sends other prompts like generating 3 options to choose from, prefilling a reply like 'compile the code', etc. (I can confirm this because I connect CC to llama.cpp and use it with GLM-4.7. I see all these requests/prompts in the llama-server verbose log.)
You can stop most of this with:
export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1
And might as well disable telemetry, etc:
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
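If you'd rather not export these for the whole shell session, a small wrapper that sets them per invocation is one option. This is only a minimal sketch: it assumes the CC binary is called `claude` and is on your PATH, and `cc_quiet` and `CC_BIN` are names I made up for illustration.

```shell
# Hypothetical wrapper: sets both variables only for the launched process.
# CC_BIN is an illustrative override; it defaults to "claude" (assumed binary name).
cc_quiet() {
  DISABLE_NON_ESSENTIAL_MODEL_CALLS=1 \
  CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
  "${CC_BIN:-claude}" "$@"
}
```

Running `CC_BIN=env; cc_quiet` instead of the real binary prints the environment the child process would see, which is a quick way to check that both variables are set.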
I also noticed that every time you start CC, it sends off more than 10k tokens preparing the different agents. So try not to close and re-open it too often.
100% agreed, and I've been explaining this to people for the past year.
I have an iPhone now and miss Firefox for Android (with uBlock, SponsorBlock, etc). But this painful restriction (no other browser engines allowed on iOS) is the only thing stopping Chrome from becoming the new IE6.
At a few startups I've worked for, the devs all use Chrome exclusively, and only test in Chrome during development.
The only reason they consider other browsers is Safari on iOS. Sometimes it's driven by support calls / complaints from iOS users after a release. And because they have to support Safari, Firefox will usually work as well. If Chrome's engine is allowed on iOS, support can just tell the users to install Chrome (like they do now if anyone has issues on Windows in other browsers).
Many years ago, I was able to swap banks when my bank's website stopped working in Opera 12. If all the major banks / websites target Chrome-only, we'll have no choice but to use it. And then we'll have no control as Google push new restrictions into Chrome.
>This is great. Sonnet 4.5 has degraded terribly.
>I can get some useful stuff from a clean context in the web ui but the cli is just useless.
>I swear it was not that awful a couple of months ago.
I agree on all 3 counts. And it still degrades after a few long turns in Open WebUI. You can test this by regenerating the last reply in chats from shortly after the model was released.
>You should drive your car to the car wash. Even though it's only 50 meters away (which is very close), you'll need your car physically present at the car wash to get it washed. If you walk there, you'll arrive without your car, which wouldn't accomplish your goal of getting it washed.
>You'll need to drive your car to the car wash. While 50 meters is a very short distance (just a minute's walk), you need your car to actually be at the car wash to get it washed. Walking there without your car wouldn't accomplish your goal!
etc. The reasoning never second-guesses it either.
A shame they're turning it off in 2 days.