
For a bit, waiting for LLMs was like waiting for code to compile: https://xkcd.com/303/

> more than 1000 tokens per second

Perhaps no more?

(Not to mention, if you're waiting for one LLM, sometimes it makes sense to multi-table. I think Boris from Anthropic says he runs 5 CC instances in his terminal and another 5-10 in his browser on CC web.)
