
The "coding benchmarks" like "SWE-bench Verified" are actually of very low quality, and the answers are riddled with problems.

Good Explainer: "The Disturbing Reality of AI Coding" https://www.youtube.com/watch?v=QnOc_kKKuac


