Hacker News | raymond513's submissions
1. FireAttention – Serving Mixtral and open-source MoE models at 4x speed vs. vLLM (fireworks.ai)
3 points by raymond513 on Jan 9, 2024 | past
