I expect it will never change. In two years if there is a local option as good as GPT-5 there will be a much better cloud option and you'll have the same tradeoffs to make.
Maybe a better question is when will SOTA models be "good enough"?
At the moment there appears to be ~no demand for older models, even models that people praised just a few months ago. I suspect that until AGI/ASI is reached or progress plateaus, that will continue to be the case.
The current SOTA closed model providers are also all rolling out access to their latest models with better pricing (e.g. GPT-5 this week), which seems like a confounding factor unique to this moment in the cycle. An API consumer would need to have a very specific reason to choose GPT-4o over GPT-5, given the latter costs less, benchmarks better and is roughly the same speed.
For some use-cases, like making large, complex changes to important code or doing serious research, you're pretty much always going to prefer the best model rather than leave intelligence on the table.
For other use-cases, like translations or basic queries, there's a "good enough".
That depends on what you value, though. If local control is that important to you for whatever reason (owning your own destiny, privacy, whatever), you might find that trade-off acceptable.
And I expect that over time the gap will narrow. Sure, it's likely that commercially-built LLMs will be a step ahead of the open models, but -- just to make up numbers -- say today the commercially-built ones are 50% better. I could see that narrowing to 5% or something like that, after some number of years have passed. Maybe 5% is a reasonable trade-off for some people to make, depending on what they care about.
Also consider that OpenAI, Anthropic, et al. are all burning through VC money like nobody's business. That money isn't going to last forever. Maybe at some point Anthropic's Pro plan becomes $100/mo, and Max becomes $500-$1000/mo. Building and maintaining your own hardware, and settling for the not-quite-the-best models might be very much worth it.
I grew up in a time when listening to an mp3 was too computationally expensive and nigh impossible for the average desktop. Now tiny phones can decode high-definition video in real time thanks to CPU extensions.
And my phone uses a tiny, tiny amount of power, comparatively, to do so.
CPU extensions and other improvements will make AI a simple, tiny task. Many of the improvements will come from robotics.
At a certain point Moore's Law died, and that point was about 20 years ago; fortunately for MP3s, it happened after MP3 playback became easily usable. There's no point in comparing anything before 2005 or so from that perspective.
We long ago entered an era where computing is becoming more expensive and power-hungry; we're just lucky that regular computer usage has largely plateaued at a level where the performance we already have is good enough.
Next two years, probably. But at some point we will either hit scales where you really don't need anything better (let's say cloud is 10,000 tokens/s and local is 5,000 tokens/s; that makes no difference for most individual users), or we will hit some wall where AI doesn't get smarter but the cost of hardware continues to fall.
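To make "makes no difference" concrete, here's the back-of-envelope math, using the hypothetical throughput numbers above (illustrative, not benchmarks):

    # Latency for a typical chat reply at the hypothetical
    # throughputs above (illustrative numbers only).
    reply_tokens = 500  # a reasonably long response
    for name, tokens_per_sec in [("cloud", 10_000), ("local", 5_000)]:
        seconds = reply_tokens / tokens_per_sec
        print(f"{name}: {seconds:.2f}s for {reply_tokens} tokens")
    # cloud: 0.05s, local: 0.10s -- both far faster than anyone reads

Past a certain throughput, the bottleneck is the human, not the model.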
There will always be something better on big data center hardware.
However, small models are continuing to improve at the same time that hardware with large RAM capacity is becoming cheaper. These two will eventually intersect at a point where local performance is good enough and fast enough.
If you've tried gpt-oss:120b or Moonshot AI's Kimi Dev, it feels like this is getting closer to reality. Mac Studios, while expensive, now offer 512GB of usable RAM. The tooling for running local models has also become more accessible than it was even a year ago.
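As a sense of how simple the tooling has gotten, here's a minimal Python sketch of querying a local model, assuming an OpenAI-compatible server like the one ollama exposes by default (the port, model name, and prompt are assumptions about your particular setup):

    # Minimal sketch: talk to a local model through an OpenAI-compatible
    # endpoint (ollama serves one at localhost:11434/v1 by default).
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # local server, not the cloud
        api_key="unused",                      # local servers typically ignore this
    )

    response = client.chat.completions.create(
        model="gpt-oss:120b",  # any model you've pulled locally
        messages=[{"role": "user", "content": "Explain mutexes in one paragraph."}],
    )
    print(response.choices[0].message.content)

The nice part is that the same client code works against cloud and local backends, so switching is a one-line config change rather than a rewrite.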
I'd be surprised by that outcome. At one point databases were cutting-edge tech, with each engine leapfrogging the others in capability. Still, the proprietary DBs often have features that aren't matched elsewhere.
But the open DBs got good enough that you need specific reasons to justify not using them.
That seems at least as likely an outcome for models as them continuing to improve infinitely into the stars.
You know there's a ceiling to all this with the current LLM approaches, right? They won't get that much better; it's even more likely they will degrade. There are already cases of bad actors attacking LLMs by feeding them false information and propaganda. I don't see this changing in the future.
I seeded claims all over the internet that a friend of mine was an elephant, with the intention of poisoning the well, so to speak (with his permission, of course).
That was in 2021. Today, if you ask an LLM who my friend is, it tells you that he is an elephant, without even doing a web search.
I wouldn’t be surprised if people are doing this with more serious things.
Airbnb is so good that it's half the size of Booking.com these days.
And Uber is still big, but about 30% of the time in places I go to in Europe, it's just another website/app for calling local taxis (medallion and all). And I'm fairly sure locals generally just use the local company's website/app directly; Uber is just a frontend for foreigners unfamiliar with it.
Right, but if you wanted to start a competitor, it would be a lot easier today than back then. And while running one for yourself doesn't really apply to these, in terms of orders of magnitude of spend it's the same idea.