The 13b and 30b run quite well on a 4090 at 4-bit quantization. | Hacker News


0xbadc0de5 on March 10, 2023 | on: Llama.cpp: Port of Facebook's LLaMA model in C/C++...

The 13b and 30b run quite well on a 4090 at 4-bit quantization.

thewataccount on March 10, 2023

Ah dang, I missed that. I was still using the 8-bit mode. I'll look into that, thanks!
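
For anyone wondering why the jump from 8-bit to 4-bit matters here, a rough back-of-the-envelope sketch follows. It assumes nominal parameter counts of 13 and 30 billion, treats the 4090's 24 GB of VRAM as roughly 24 GiB, and counts the weights only (no activations, KV cache, or per-block quantization overhead), so real usage will be somewhat higher. The function name `weight_gib` is made up for illustration.

```python
# Rough estimate of weight memory at different quantization widths.
# Assumptions (not from the thread): nominal parameter counts, weights only,
# no activations, KV cache, or quantization scale/zero-point overhead.

GIB = 1024 ** 3
VRAM_GIB = 24  # RTX 4090 memory, treated as ~24 GiB for this estimate


def weight_gib(params_billion: float, bits: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits / 8
    return total_bytes / GIB


for name, params in [("13b", 13), ("30b", 30)]:
    for bits in (4, 8, 16):
        size = weight_gib(params, bits)
        verdict = "fits" if size < VRAM_GIB else "does not fit"
        print(f"{name} @ {bits}-bit: ~{size:.1f} GiB ({verdict} in {VRAM_GIB} GiB)")
```

Under these assumptions the 30b weights come to about 14 GiB at 4-bit but about 28 GiB at 8-bit, which would be consistent with 4-bit being the mode that fits on a single 4090.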
