Maybe I’m not seeing it right, but comparing the source of Apple’s Whisper port to the original Python Whisper, it seems there are only minimal changes, mostly redirecting certain operations to MLX.
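To illustrate what I mean by "redirecting operations": a minimal, hypothetical sketch of the pattern, where a heavy op like matmul is dispatched to MLX (Apple's Metal-backed array framework) when it's importable, and falls back to NumPy otherwise. This isn't Apple's actual code, just the general shape of such a port.

```python
import numpy as np

try:
    import mlx.core as mx  # Apple's array framework; Metal-backed on Apple Silicon
    HAS_MLX = True
except ImportError:
    HAS_MLX = False

def matmul(a, b):
    """Redirect the heavy op to MLX when available, else fall back to NumPy."""
    if HAS_MLX:
        out = mx.matmul(mx.array(a), mx.array(b))
        mx.eval(out)  # MLX is lazy; force the computation
        return np.array(out)
    return a @ b
```

The point is that a port like this changes where the math runs, not what the model does.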
There is also whisper.cpp (https://github.com/ggerganov/whisper.cpp), which has its own kind of optimizations for Apple Silicon - I don’t think that was the version used against Nvidia in the test.
I don't think Whisper was specifically optimized for Apple Silicon. Doesn't it just use MLX? If using a platform's API counts as being specifically optimized, then the Nvidia version is "optimized" too, since it's presumably using CUDA.
ETA: actually it's unclear from the article whether the Whisper optimizations were done by Apple engineers, but it's definitely an optimized version.