Hacker News

They seem to have another FAQ here that gives a real answer (273GB/s): https://www.asus.com/us/support/faq/1056142/


Now we can see why they avoided giving a straight answer.

File this one in the blue folder like the DGX


Noob here. Why is that number bad?


LLM performance depends on doing a lot of math over a lot of different numbers. During generation, every parameter has to be streamed from memory for each token, so memory bandwidth caps throughput. For example, if your model has 8 billion parameters and each parameter is one byte, then at 256 GB/s you can't do better than 32 tokens per second. And if you try to load a model that's 80 GB, you only get 3.2 tokens per second, which is pretty bad for something that costs $3-4k.
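The arithmetic above is just bandwidth divided by the bytes the model has to stream per token. A quick sketch (numbers from this thread; the function name is mine, and it ignores compute, KV cache, and other overheads):

```python
def max_tokens_per_sec(bandwidth_gb_per_s: float, model_size_gb: float) -> float:
    """Upper bound on decode throughput: every parameter byte must be
    read from memory once per generated token, so bandwidth divided by
    model size caps tokens per second."""
    return bandwidth_gb_per_s / model_size_gb

# 8B params at 1 byte each, over 256 GB/s:
print(max_tokens_per_sec(256, 8))    # 32.0 tokens/s
# An 80 GB model over the same memory bus:
print(max_tokens_per_sec(256, 80))   # 3.2 tokens/s
```

This is an upper bound, not a prediction; real throughput is lower once attention, KV-cache reads, and batching effects are counted.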

There are newer models called "Mixture of Experts" (MoE) that have, say, 120B parameters total but only use about 5B parameters per token (the specific parameters are chosen by a much smaller routing network). That is the kind of model that excels on this machine. Unfortunately, those models also work really well with hybrid inference, where the GPU handles the small-but-computationally-complex fully connected layers while the CPU handles the large-but-computationally-easy expert layers.
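To see why MoE helps under a bandwidth cap: only the active parameters are streamed per token, so the bound from the previous comment applies to the ~5B active parameters rather than the full 120B. A rough sketch, assuming 1 byte per parameter as above:

```python
def moe_tokens_per_sec(bandwidth_gb_per_s: float,
                       active_params_billions: float,
                       bytes_per_param: float = 1.0) -> float:
    """Bandwidth-bound decode rate for an MoE model: only the parameters
    activated per token (router + selected experts) must be read, not
    the full parameter count."""
    active_gb = active_params_billions * bytes_per_param
    return bandwidth_gb_per_s / active_gb

# 120B-total / 5B-active model at 256 GB/s:
print(moe_tokens_per_sec(256, 5))     # ~51 tokens/s
# Compare a dense 120B model on the same hardware:
print(moe_tokens_per_sec(256, 120))   # ~2 tokens/s
```

The catch is that you still need enough memory to hold all 120B parameters, which is exactly what these big-unified-memory boxes provide.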

This product doesn't really have a niche for inference. Training and prototyping are another story, but I'm a noob on those topics.


My Mac laptop has 400 GB/s of memory bandwidth. LLMs are bandwidth bound.


Running LLMs will be slow and training them is basically out of the question. You can get a Framework Desktop with similar bandwidth for less than a third of the price of this thing (though that isn't NVIDIA).


> Running LLMs will be slow and training them is basically out of the question

I think it's the reverse: the use case for these boxes is basically training and fine-tuning, not inference.


The use case for these boxes is a local NVIDIA development platform before you do your actual training run on your A100 cluster.


Refurbished M1 MacBooks at $1,500 have more bandwidth with less latency.



