I’ve got a Max M3 with 64 GB RAM and can run more than just toy models, even if they are obviously less capable than hosted ones. Honestly, I think local LLMs are the future and we are just going to be doing hosted until hardware catches up (and now it has something to catch up to!).
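For anyone wondering what "more than toy models" looks like in practice, here's a minimal sketch of querying a locally served model over HTTP. It assumes Ollama (or any server with a similar API) is already running on its default port with a quantized model pulled; the model tag is just illustrative.

  # Sketch: query a local model server (assumes Ollama on its default port).
  # The model tag is an example; substitute whatever you've pulled locally.
  import requests

  resp = requests.post(
      "http://localhost:11434/api/generate",
      json={
          "model": "qwen2.5:32b",  # illustrative; any locally pulled model
          "prompt": "Summarize the tradeoffs of running LLMs locally.",
          "stream": False,
      },
      timeout=300,
  )
  print(resp.json()["response"])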


> Honestly, I think local LLMs are the future and we are just going to be doing hosted

Same here, otherwise I wouldn't be investing in local hardware :) But I'd be lying if I said I think it's ready for that today. I don't think the hardware has that much to catch up on; it's the software that has a bunch of low-hanging fruit available for performance and resource usage, since every release seems to favor "time to paper" above all else.


There are lots of things you can already do on local hardware, and you don’t have to worry about safeguards or token limits. There are plenty of capable models, especially Chinese ones, that aren’t just there for academic papers.


Again, put those under test with your private benchmarks, then compare the results with hosted models.

I'm not saying it's completely useless, or that it won't get better in the future. What I am saying is that even the top "weights available" models today really don't come close to today's SOTA. This becomes very clear when you have private benchmarks that give hard, concrete numbers uninfluenced by public benchmarking data.
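To make that concrete, a private benchmark doesn't have to be elaborate; something like the sketch below already yields hard numbers. It assumes both the local server and the hosted API speak the OpenAI chat format (Ollama exposes a compatible /v1 endpoint); the model names, endpoint, and check functions are placeholders, not anyone's actual setup.

  # Sketch: run the same private prompts against a local and a hosted backend
  # and tally a pass rate. Names and endpoints are illustrative assumptions.
  from openai import OpenAI

  CASES = [  # your own prompts plus a way to judge each answer; keep these private
      ("What is 17 * 24?", lambda out: "408" in out),
  ]

  BACKENDS = {
      "local": OpenAI(base_url="http://localhost:11434/v1", api_key="unused"),
      "hosted": OpenAI(),  # reads OPENAI_API_KEY from the environment
  }
  MODELS = {"local": "qwen2.5:32b", "hosted": "gpt-4o"}  # example names

  for name, client in BACKENDS.items():
      passed = 0
      for prompt, check in CASES:
          reply = client.chat.completions.create(
              model=MODELS[name],
              messages=[{"role": "user", "content": prompt}],
          )
          passed += check(reply.choices[0].message.content)
      print(f"{name}: {passed}/{len(CASES)} passed")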


> even the top "weights available" models today really don't come close to today's SOTA.

This is the statement that I'm disagreeing with. They do come close; even if they are somewhat behind, it's a fixed distance, and the hosted models aren't an order of magnitude better. Hosted models are still better, just not incredibly so.



