Depending on the definition of "nicely", FWIW I currently run an Ollama server [1] + Qwen Coder models [2] with decent success compared to the big hosted models. Granted, I don't use most "agentic" features and still mostly stick to chat-based interactions.
The server is basically just my Windows gaming PC, and the client is my editor on a macOS laptop.
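For anyone wanting to replicate this split, the linked FAQ [1] covers exposing Ollama on the network via the OLLAMA_HOST environment variable. A rough sketch (the hostname and model tag here are placeholders, adjust for your own machines):

```shell
# On the Windows gaming PC: make Ollama listen on all interfaces.
# Per the FAQ, set this as a user environment variable, then restart Ollama.
set OLLAMA_HOST=0.0.0.0

# On the macOS laptop: point clients at the PC instead of localhost.
# "gaming-pc.local" is a placeholder for your PC's hostname or LAN IP.
export OLLAMA_HOST=http://gaming-pc.local:11434

# Sanity check against Ollama's REST API:
curl "$OLLAMA_HOST/api/generate" \
  -d '{"model": "qwen2.5-coder", "prompt": "hello", "stream": false}'
```

Most editor plugins that speak the Ollama API let you override the base URL the same way.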
Most of this effort is so that I can prepare for the arrival of that mythical second half of 2026!
Thanks for sharing your setup! I'm also very interested in running AI locally. In which contexts are you seeing decent success? E.g. debugging, boilerplate, or some other task?
I'm running Qwen via Ollama on my 14-inch M4 Max with the OpenWebUI interface; it's silly easy to set up.
Not that useful, though. I just like the idea of having so much compressed knowledge on my machine in just 20 GB. In fact, I disabled all Siri features because they're dogshit.
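"Silly easy" is about right: with Ollama already installed, the whole stack is roughly two commands (the model tag is an example; pick whichever Qwen variant fits your RAM):

```shell
# Pull a Qwen coder model into the local Ollama store
ollama pull qwen2.5-coder

# Run OpenWebUI in Docker, pointed at the host's Ollama instance;
# this follows OpenWebUI's documented quick-start invocation.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

Then open http://localhost:3000 in a browser and the pulled model shows up in the model picker.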
[1] https://github.com/ollama/ollama/blob/main/docs/faq.md#how-d...
[2] https://huggingface.co/collections/Qwen/qwen25-coder-66eaa22...