Eh, I’d expect it to be slower if anything. Think about it like this: if you write an image renderer, a raw bitmap is the fastest format to draw, because it’s already decompressed. 4-bit quantization is effectively a compression scheme, so the weights have to be unpacked before you can do any math on them.
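It depends on the balance between memory bandwidth and compute, though: if you’re bottlenecked on reading weights from memory, the smaller format can still come out ahead.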
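To make the bandwidth vs compute point concrete, here's a rough back-of-envelope sketch. Everything in it is an assumption for illustration: the function name `token_time_s`, the hardware numbers, and especially the guess of 2 extra ops per weight for dequantization. It just takes the larger of "time to read the weights" and "time to do the math" for one decode step.

```python
def token_time_s(n_params, bits_per_weight, mem_bw_bytes_s, flops_s,
                 dequant_ops_per_weight=0.0):
    """Crude roofline-style estimate for one decode step at batch size 1.

    Hypothetical model: every weight is read once, costs 2 FLOPs
    (multiply + add), plus an assumed per-weight dequantization cost.
    """
    bytes_moved = n_params * bits_per_weight / 8
    ops = n_params * (2 + dequant_ops_per_weight)
    mem_time = bytes_moved / mem_bw_bytes_s
    compute_time = ops / flops_s
    # Whichever resource is the bottleneck sets the step time.
    return max(mem_time, compute_time)


# Made-up numbers, not measurements of any real device.
N_PARAMS = 7e9   # 7B-parameter model
MEM_BW = 1e12    # 1 TB/s memory bandwidth

# Case 1: lots of compute to spare, so reading weights dominates.
fast = 1e14
fp16_fast = token_time_s(N_PARAMS, 16, MEM_BW, fast)
int4_fast = token_time_s(N_PARAMS, 4, MEM_BW, fast, dequant_ops_per_weight=2.0)

# Case 2: weak compute, so the extra unpacking work dominates.
slow = 1e12
fp16_slow = token_time_s(N_PARAMS, 16, MEM_BW, slow)
int4_slow = token_time_s(N_PARAMS, 4, MEM_BW, slow, dequant_ops_per_weight=2.0)

print(f"plenty of compute: fp16 {fp16_fast:.4f}s, int4 {int4_fast:.4f}s")
print(f"weak compute:      fp16 {fp16_slow:.4f}s, int4 {int4_slow:.4f}s")
```

With these invented numbers, 4-bit wins in the first case and loses in the second, which is the whole "it depends" in miniature. Real kernels overlap memory and compute and fuse the dequantization step, so treat this as intuition, not a prediction.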