

vimarsh6739 on Feb 19, 2024 | on: Groq runs Mixtral 8x7B-32k with 500 T/s

Thanks for the quick reply! About hardware support, I was wondering if the LPU has a hardware instruction to compute the attention matrix similar to the MatrixMultiply/Convolve instruction in the TPU ISA. (Maybe a hardware instruction which fuses a softmax on the matmul epilogue?)

tome on Feb 19, 2024

We don't have a hardware instruction, but we do have some patented technology around using a matrix engine to efficiently calculate other linear algebra operations such as convolution.
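
For readers wondering how a matrix engine can compute a convolution at all, here is a minimal sketch of the textbook "im2col" trick: lay out every sliding window of the input as a row of a matrix, then do a single matrix multiply against the kernel. This is only the generic, well-known lowering and is not a description of the patented approach mentioned above; the function names (`im2col_1d`, `conv1d_via_matmul`, and so on) are made up for illustration.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def im2col_1d(signal, kernel_size):
    """Stack every sliding window of `signal` as one row of a matrix."""
    n_out = len(signal) - kernel_size + 1
    return [signal[i:i + kernel_size] for i in range(n_out)]


def conv1d_via_matmul(signal, kernel):
    """'Valid' cross-correlation (the deep-learning 'convolution') as one matmul."""
    windows = im2col_1d(signal, len(kernel))   # shape: (n_out, k)
    kernel_col = [[w] for w in kernel]         # shape: (k, 1)
    return [row[0] for row in matmul(windows, kernel_col)]


def conv1d_direct(signal, kernel):
    """Reference sliding-window loop, used only to check the matmul version."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    kernel = [1, 0, -1]
    print(conv1d_via_matmul(signal, kernel))  # [-2, -2, -2]
    print(conv1d_direct(signal, kernel))      # [-2, -2, -2]
```

The trade-off is that the window matrix duplicates input elements (each sample appears in up to `kernel_size` rows), so real hardware and libraries typically avoid materializing it in full; the sketch just shows why a matmul unit is sufficient in principle.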
