JanSt · 2025-12-11T18:42:06 1765478526

I don't feel the S-curve at all yet. Still an exponential for me

exe34 · 2025-12-11T23:10:24 1765494624

With a very long doubling time?

JanSt · 2025-12-11T18:36:24 1765478184

The benchmarks are very impressive. Codex and Opus 4.5 are really good coders already and they keep getting better.

No wall yet and I think we might have crossed the threshold of models being as good as or better than most engineers already.

GDPval will be an interesting benchmark and I'll happily use the new model to test spreadsheet (and other office work) capabilities. If they can keep going like this just a little bit further, many office workers will stop being useful.... I don't know yet how to feel about this.

Great for humanity probably, but for the individuals?

llmslave · 2025-12-11T18:42:16 1765478536

Yeah there's no wall on this. It will be able to mimic all of human behavior given proper data.

sheeshe · 2025-12-11T19:02:26 1765479746

Ok so why aren't there mass layoffs ensuing right now?

ghosty141 · 2025-12-11T19:37:54 1765481874

Because from my experience using codex in a decently complex c++ environment at work, it works REALLY well when it has things to copy. Refactorings, documentation, code review etc. all work great. But those things only help actual humans and they also take time. I estimate that in a good case I save ~50% of time, in a bad case it's negative and costs time.

But what I generally found is that it's not that great at writing new code. Obviously an LLM can't think and you notice that quite quickly: it doesn't create abstractions, use abstractions or try to find general solutions to problems.

People who get replaced by Codex are those who do repetitive tasks in a well understood field. For example, making basic websites, very simple CRUD applications etc.

I think it's also not layoffs but rather companies will hire fewer freelancers or people to manage small IT projects.

ionwake · 2025-12-11T18:46:50 1765478810

It was only about 2-3 weeks ago that several HNers told me "nah you better re-check your code" when I explained that I have over 2 decades xp of coding, yet have not manually edited code (in memory) for the last 6 or so months, whilst performing daily 12-hour vibe code seshes.

ipsum2 · 2025-12-11T19:03:12 1765479792

It really depends on the complexity of code. I've found models (codex-5.1-max, opus 4.5) to be absolutely useless writing shaders or ML training code, but really good at basic web development.

nineteen999 · 2025-12-11T21:41:50 1765489310

Interesting, I've been using Claude Max with UE5 and while it isn't _brilliant_ with shaders I can usually get it to where I want. Also had a bit of success with converting HLSL shaders to GLSL with it.

ipsum2 · 2025-12-12T06:31:07 1765521067

I've asked it to write some non-trivial three.js code and have not gotten it to succeed.

ionwake · 2025-12-22T11:02:38 1766401358

I got it to write some shaders in JS and some three.js, and it fixed something I had previously never been able to fix.

sheeshe · 2025-12-11T19:04:58 1765479898

Which is no surprise as the data for web development stuff exists in large amounts on the web that the models feed off.

osn9363739 · 2025-12-11T22:51:09 1765493469

Do you have any examples, or is your project OSS or anything like that? Because I want to believe, but I have people I work with who say and try the same thing (no manual coding), and their work is now terrible.

ionwake · 2025-12-14T15:56:49 1765727809

I've finally fixed some massive issues in projects that were taking me literally years. I'll be super happy to share once they are ready (I can't really show my trading app, but the game should be fine as soon as I do).

JanSt · 2025-11-05T08:44:52 1762332292

I have canceled my Claude Max subscription because Sonnet 4.5 is just too unreliable. For the rest of the month I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released. When I hit 4.1 Opus limits I'm using Codex. I will probably go through with the Codex pro subscription.

virtualritz · 2025-11-05T14:41:01 1762353661

> [...] I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released [...]

Yes, it's down from 40h/week to 3-5h/week on Max plan, effectively. A real bummer. See my comment here [1] regarding [2].

[1] https://news.ycombinator.com/item?id=45604301

[2] https://github.com/anthropics/claude-code/issues/8449

JanSt · 2025-11-07T09:16:58 1762507018

Thanks, didn't know that but aligns with my experience

thot_experiment · 2025-11-05T20:01:25 1762372885

Glad I'm not imagining it, I'll be cancelling my sub. Paying for things only for them to get worse and the provider hoping I don't notice is such a fucking vile tactic.

In my experience sonnet 4.5 is basically pointless, it often gets non-trivial tasks wrong, and for trivial tasks I can use a local model or one of the myriad of providers that give free inference.

EDIT: Holy shit I read the github issue, fuck these people.

> We highly recommend Sonnet 4.5 -- Opus uses rate limits faster, and is not as capable for coding tasks.

They're just straight gaslighting us now lmao.

CuriouslyC · 2025-11-05T09:17:22 1762334242

Definitely do it. You get a lot of deep research, access to GPT5 Pro and Sora, and the Codex limits are MUCH higher.

lukan · 2025-11-05T11:19:57 1762341597

Curious why this is downvoted? Wrong information?

CuriouslyC · 2025-11-05T11:23:21 1762341801

Don't try to comprehend the hive mind brother, there are a lot of shills and fanboys in addition to a lot of great people on this forum, sometimes the variance looks pretty bad.

I hope the people downvoting get some minor joy out of it, I know you need it.

mccoyb · 2025-11-05T13:49:28 1762350568

Sonnet 4.5 is way worse than Opus 4.1 -- it's incredible that they claim it's their best coding model.

It's obvious if you've used the two models for any sort of complicated work.

Codex with GPT-5 codex (high thinking) is better than both by a long shot, but takes longer to work. I've fully switched to Codex, and I used Claude Code for the past ~4 months as a daily driver for various things.

I only reach for Sonnet now if Codex gets cagey about writing code -- then I let Sonnet rush ahead, and have Codex align the code with my overall plan.

prophesi · 2025-11-05T14:55:49 1762354549

Opus 4.1 is better, but imo not 5 to 6 times the price better.

JanSt · 2025-10-30T13:17:16 1761830236

I think Opus 4.1 is still much better than Sonnet 4.5

jasonjmcghee · 2025-10-31T19:08:25 1761937705

If cost is not considered, absolutely. That being said, Sonnet 4.5 with thinking where it makes sense feels like way more bang for your buck and is usually good enough. I really don't use Opus anymore.

JanSt · 2025-10-25T19:00:55 1761418855

I'm only hosting a newly started side project there, but it does have paying users so I'm really unhappy at the moment.

999900000999 · 2025-10-25T21:14:59 1761426899

Do you have any backups aside from what's on Supabase?

JanSt · 2025-10-25T18:56:54 1761418614

This is a worst case scenario. Even worse is the lack of communication and not turning off the restore function. This is having a serious economic impact.

JanSt · 2025-10-23T22:33:24 1761258804

My current top 3 Apple software flaws:

1) battery warning above tabs in browser with no x to close it

2) WebKit bugs that make inputs and their visuals diverge, so you have to click under the input to hit it

3) flickering email app when it’s opened

JanSt · 2025-10-10T21:49:20 1760132960

They also managed to introduce regressions into WebKit so that the visual and touch positions of fixed input elements diverge. Really makes you question what’s going on at Apple.

JanSt · 2025-07-09T18:23:51 1752085431

Pushing out an exact way to extract that data without giving the creator time to fix it may even be worse than using such code in production. The data may then be in the hands of malicious people who wouldn’t have found it otherwise.

bravetraveler · 2025-07-09T18:28:04 1752085684

Go talk to the abuse contact, I won't stop you

JanSt · 2025-07-09T18:15:58 1752084958

Doesn’t Supabase provide security warnings on its dashboard?

tomashubelbauer · 2025-07-09T18:41:01 1752086461

There are security advisories, but the feature isn't particularly good. Non-actionable stuff is mixed in with actionable stuff and actionable stuff is IMO presented too generically.

coal320 · 2025-07-09T18:26:07 1752085567

I guess not? I've never used it before.