
We completely agree: mechanistic interpretability might help keep these language models in check, but it's going to be very difficult to run it on closed-source frontier models. I'm excited to see how that field progresses.

