FWIW, Ollama already does most of this: - Cross-platform - Sets up a local API s...

randallsquared · 2025-12-21T22:50:48 1766357448

I assumed from the name that it only ran llama-derived models, rather than whatever is available at huggingface. Is that not the case?

fenykep · 2025-12-21T23:04:53 1766358293

No, they have quite a broad list of models: https://ollama.com/search

[edit] Oh and apparently you can also directly run some models directly from HuggingFace: https://huggingface.co/docs/hub/ollama

ashirviskas · 2025-12-23T02:30:21 1766457021

Just use llama.cpp. Ollama tried to force their custom API (not the openai standard), they obscure the downloaded models making them a pain to use with other implementations, blatantly used llama.cpp as a thin wrapper without communicating it properly and now has to differentiate somehow to start making money.

If you've ever used a terminal, use llama.cpp. You can also directly run models from llama.cpp afaik.

fenykep · 2025-12-23T13:19:35 1766495975

Yes, I wanted to try it already but setting up an environment with an MI50 was a bit tricky so I wanted to try something I knew first. Now that I have ollama running I will give llama.cpp a shot.

ashirviskas · 2025-12-23T14:15:24 1766499324

Ooh, I have experience with it. If you're on linux, just use Vulkan. If you face any other issues, just google my username + "MI50 32GB vbios reddit". It depends on which vBIOS you have, but that post on reddit has most of the info you may need. Good luck!