
There has been a ton of optimization of Whisper on Apple Silicon; whisper.cpp is a good example that takes advantage of this. Also, this article is specifically referencing Apple's new MLX framework, which I'm guessing your tests with llama and Stable Diffusion weren't utilizing.


I assume people are working on bringing an MLX backend to llama.cpp... Any idea what the state of that project is?


https://github.com/ml-explore/mlx-examples

Several people are working on MLX-enabled backends for popular ML workloads, but it seems inference workloads see the most acceleration so far compared to generative/training workloads.



