
Welp, my data point of one shows you need more than 8 GB of VRAM.

When I run mistral-chat with Nemo-Instruct, it crashes within about 5 seconds with the error: "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU"

This is on Ubuntu 22.04.4 with an NVIDIA GeForce RTX 3060 Ti (8192 MiB). I ran "nvidia-smi -lms 10" to watch memory usage, and the last reading it recorded before the crash was 7966 MiB.
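For what it's worth, the 12B parameters at bf16 are roughly 24 GB of weights alone, so an 8 GB card was never going to hold the full-precision model. If you want to squeeze it onto a 3060 Ti anyway, 4-bit quantization should get the weights down to around 7 GB. Here's a rough sketch using transformers + bitsandbytes instead of mistral-chat's own loader (I'm assuming the mistralai/Mistral-Nemo-Instruct-2407 checkpoint on Hugging Face; I haven't tested this on that exact card):

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

  model_id = "mistralai/Mistral-Nemo-Instruct-2407"

  # NF4 4-bit quantization: ~12B params at ~0.5 bytes each (plus overhead)
  # instead of ~24 GB for bf16 weights.
  bnb = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_quant_type="nf4",
      bnb_4bit_compute_dtype=torch.bfloat16,
  )

  tok = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id, quantization_config=bnb, device_map="auto"
  )

  msgs = [{"role": "user", "content": "Say hello in five words."}]
  inputs = tok.apply_chat_template(
      msgs, add_generation_prompt=True, return_tensors="pt"
  ).to(model.device)
  out = model.generate(inputs, max_new_tokens=64)
  print(tok.decode(out[0], skip_special_tokens=True))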



On Ubuntu 22.04, after clearing some smaller processes off the GPU (like gnome-remote-desktop-daemon), I am able to start Mistral-Nemo 2407 with mistral-chat and get a prompt on an RTX 4090. But after entering the prompt it still fails with OOM, so, as someone noted, it only narrowly fits on a 4090.
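In case it helps anyone else do the same cleanup: here's a quick sketch with pynvml (pip install nvidia-ml-py) that lists who is holding GPU memory, including graphics clients like gnome-remote-desktop-daemon. It reads the same counters nvidia-smi does:

  import pynvml

  pynvml.nvmlInit()
  handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0

  mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
  print(f"used {mem.used / 1024**2:.0f} MiB / {mem.total / 1024**2:.0f} MiB")

  # Graphics clients (desktop daemons, compositors) plus compute clients.
  procs = (pynvml.nvmlDeviceGetGraphicsRunningProcesses(handle)
           + pynvml.nvmlDeviceGetComputeRunningProcesses(handle))
  for proc in procs:
      # usedGpuMemory can be None on some drivers
      used = (proc.usedGpuMemory or 0) / 1024**2
      print(f"pid {proc.pid}: {used:.0f} MiB")

  pynvml.nvmlShutdown()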


Agreed, it narrowly fits on an RTX 4090. Yesterday I rented an RTX 4090 on vast.ai and set up Mistral-Nemo-2407. I got it to work, but just barely: I can run mistral-chat, get the prompt, and it starts generating a response after 10 to 15 seconds. The second prompt always crashes it immediately with an OOM error. I almost bought an RTX 4090 from Best Buy, but it would have cost $2,000 after tax, so I'm glad I only spent 40 cents instead.
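My guess, and it is only a guess, is that the second prompt dies because the KV cache grows with the conversation history, so a model that narrowly fits on turn one runs out on turn two. If you drive the model from PyTorch instead of mistral-chat, you can watch it happen with a couple of built-in counters:

  import torch

  def report_vram(tag: str) -> None:
      # memory_allocated = tensors actually in use; memory_reserved = what
      # PyTorch's caching allocator holds (closer to what nvidia-smi shows).
      alloc = torch.cuda.memory_allocated() / 1024**2
      reserved = torch.cuda.memory_reserved() / 1024**2
      print(f"{tag}: allocated {alloc:.0f} MiB, reserved {reserved:.0f} MiB")

  report_vram("after turn 1")
  torch.cuda.empty_cache()  # hand cached-but-free blocks back to the driver
  report_vram("after empty_cache")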



