Hacker News

A better solution might be to train/fine-tune the smaller model on the responses of the larger model, and only push inference to the edge once the smaller model is performant and the hardware specs can handle the workload?


Yeah, that'd be nice: some kind of self-bootstrapping system where you start with a strong cloud model, then fine-tune a smaller local one over time until it's good enough to take over. The tricky part is managing quality drift and deciding when it's "good enough" without tanking UX. Edge hardware is catching up, though, so it feels more feasible by the day.
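The "deciding when it's good enough" part could be as simple as running the local model in shadow mode and tracking how often it agrees with the cloud model over a rolling window. A minimal sketch of that gating logic (the `EdgeRouter` class, window size, and threshold are all illustrative assumptions, not anyone's actual implementation):

```python
from collections import deque

class EdgeRouter:
    """Decide when to hand inference off from the cloud model to the
    local (edge) model, based on a rolling agreement rate."""

    def __init__(self, window=100, agreement_threshold=0.9):
        # Rolling record of whether the local answer matched the cloud answer.
        self.history = deque(maxlen=window)
        self.threshold = agreement_threshold

    def record(self, local_answer, cloud_answer):
        # Shadow mode: the local model answers alongside the cloud model,
        # and we log whether the two answers agree.
        self.history.append(local_answer == cloud_answer)

    def agreement(self):
        # Fraction of recent requests where local matched cloud.
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)

    def use_local(self):
        # Hand off to the edge only once a *full* window of shadow
        # traffic clears the agreement threshold.
        return (len(self.history) == self.history.maxlen
                and self.agreement() >= self.threshold)
```

In practice "agreement" would be some task-level quality metric rather than exact string equality, and you'd want hysteresis so the router doesn't flap back and forth around the threshold, but the shape is the same.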




