
I think tuning the sampler temperature and using top-k over top-p sound ad hoc, and neither should be necessary for a solid model. Do you have any reason for suggesting those changes in particular? I ask especially because top-p, or nucleus sampling, is meant to be an improvement over top-k.
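For readers following the top-k versus top-p distinction, here is a minimal sketch in plain Python, not tied to any particular model or library. The function names (`softmax`, `top_k_filter`, `top_p_filter`) and the toy logits are illustrative assumptions. It shows the usual argument for nucleus sampling: top-k keeps a fixed number of tokens however confident the model is, while top-p keeps as many tokens as it takes to cover a probability mass of p.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable token indices and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = order[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Toy logits (assumed for illustration): one confident case, one uncertain case.
peaked = softmax([10.0, 0.0, 0.0, 0.0, 0.0])
flat = softmax([0.0, 0.0, 0.0, 0.0, 0.0])

# top-k always keeps k candidates; top-p adapts to the shape of the distribution.
print(len(top_k_filter(peaked, 3)), len(top_p_filter(peaked, 0.9)))
print(len(top_k_filter(flat, 3)), len(top_p_filter(flat, 0.9)))
```

In the confident case top-k still admits two very unlikely tokens, and in the uncertain case it cuts off plausible ones, whereas the top-p cutoff moves with the distribution. Whether that makes top-p better in practice for a given model is exactly the open question here.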

