
And none of the competitors can make this technology profitable, either.


Isn't there every reason to believe the cost will come down?


Is there actually reason to believe costs will come down significantly? I've been under the impression that companies like OpenAI and Google have been selling this stuff well below cost to drive adoption, on the idea that efficiency improvements would eventually make it profitable, but that those improvements don't seem to be materializing. I'm not particularly informed on this, though, so I'd love to hear a more informed take.


The costs for OpenAI and Google aren't public, but if you look at the open-source models, inference is very cheap: for example, you can generally beat the public serverless prices by a factor of ~2 by using dedicated GPUs [1]. Given that a 70b model costs about $1/million tokens serverless, and tends to perform similarly to 4o on benchmarks, OpenAI is most likely getting very fat profit margins at $2.50/million input tokens and $10/million output tokens.
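A back-of-envelope version of that margin claim, assuming the dedicated-GPU cost really is about half the $1/million serverless price (both numbers from above, neither confirmed by OpenAI):

```python
# Rough margin estimate from the figures above.
# Assumption: a 70b-class model costs ~$1/M tokens serverless,
# and dedicated GPUs roughly halve that -> ~$0.50/M tokens.
cost_per_m_tokens = 1.00 / 2  # $/million tokens (assumed)

# OpenAI's public 4o list prices.
price_input = 2.50    # $/million input tokens
price_output = 10.00  # $/million output tokens

margin_input = (price_input - cost_per_m_tokens) / price_input
margin_output = (price_output - cost_per_m_tokens) / price_output

print(f"input margin:  {margin_input:.0%}")   # 80%
print(f"output margin: {margin_output:.0%}")  # 95%
```

Even if the true cost were several times the assumed $0.50/M, the output-token margin would still be large.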

The problem for them is making enough money for the training runs (where it seems like their strategy is to raise money on the hope they achieve some kind of runaway self-improving effect that grants them an effective monopoly on the leading models, combined with regulatory pushes to ban their competitors) — but it seems very unlikely to me that they're losing money serving the models.

1: https://fireworks.ai/blog/why-gpus-on-demand


The cost of what? Training a model, or serving a trained model? Both benefit from economies of scale. If I had what OpenAI has, I can imagine how to make it profitable tomorrow. That's exactly why they HAVE to make it free without an account: to prevent anyone new from meaningfully entering the $0 to $20/mth segment, since they already know nobody can compete with the most advanced model.

If you look at their business strategy, it's top notch: anchor pricing with the $200 tier making $20 the sweet spot, and it probably costs them on average $5/mth to serve the $20/mth customers. Take your $50m a year marketing budget and use it to buy servers, run a highly optimized "good enough" free model that's basically Wikipedia as a chatbot, and you don't need to spend a dime on marketing if you don't want to — it's an amazing top of funnel for the rest of your product line. I believe Sam when he says they're losing money on the $200/mth product, but it makes the $20/mth product look so good...

They're really playing business very well.


NNs don't benefit from economies of scale — or, more specifically, not from the usual dynamic where a majority of low-utilization users subsidizes the high-utilization ones. In the NN world, every new free-tier user adds the same additional performance demand as the previous free users: every free-user query burns a lot of compute.

Say, for example, the ratio is 10% paid users and 90% free users (random numbers, not real). If they want more revenue they need to add more paid users, say double them. But that means the free users double too, and every free user requires a lot of compute for their queries. Nothing can be cached, because every query is different. There's no way to meaningfully offer "limited" features, because the main feature is the LLM; maybe the free tier runs a previous-gen model that's a little cheaper, but not much. And they can't serve a model that's too old, because competitors will offer better quality and win.
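The point about the fixed paid/free ratio can be sketched numerically. All figures here are hypothetical, including the per-free-user compute cost:

```python
# Hypothetical illustration: free-tier compute cost scales linearly
# with free users, so growing paid users at a fixed 10/90 split
# grows the free-tier subsidy in lockstep instead of amortizing it.
COST_PER_FREE_USER = 2.0   # $/month of compute (hypothetical)
REVENUE_PER_PAID = 20.0    # $/month subscription
PAID_RATIO = 0.10          # 10% paid, 90% free (hypothetical)

def monthly_economics(paid_users: int) -> tuple[float, float]:
    """Return (revenue, free-tier compute cost) for a given paid base."""
    free_users = paid_users * (1 - PAID_RATIO) / PAID_RATIO
    return paid_users * REVENUE_PER_PAID, free_users * COST_PER_FREE_USER

for paid in (100_000, 200_000):
    revenue, free_cost = monthly_economics(paid)
    print(f"{paid:>7} paid: revenue ${revenue:,.0f}, "
          f"free-tier cost ${free_cost:,.0f} "
          f"({free_cost / revenue:.0%} of revenue)")
```

The cost-to-revenue ratio is identical at both scales (90% under these made-up numbers): doubling the paid base doubles the free-tier bill, so scale alone never improves the unit economics.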

So there is no realistic way to bring costs down. Analysts forecast that OpenAI actually needs to raise prices a lot to meet its targets, or else keep a financial intravenous line running constantly, like the $500B announced by Trump.



