Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> And if for some ungodly reason you had to do it in Python

I literally invoke sglang and vllm in Python. You are supposed to (if not using them over-the-network) use the two fastest inference engines there is via Python.





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: