
I agree the claim is (perhaps purposefully) confusing.

What they achieved is to create tiny student models, each trained on a specific set of inputs using the teacher model's output as the training target.

There is clearly novelty in the method and in what it achieves. Whether what it achieves would cover many cases is another question.
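
To make the teacher/student idea concrete, here is a toy sketch in Python. It is not the repository's code and not the authors' method. Everything in it is an assumption for illustration: `teacher` is a stand-in function, the student is a two-parameter linear model, and the input grid is arbitrary. It only shows the general shape: query the teacher on a narrow set of inputs, fit a tiny student to those outputs, then run the student without the teacher.

```python
import math


def teacher(x):
    # Hypothetical stand-in for the big model; any expensive function would do.
    return math.sin(x)


def train_student(inputs, epochs=3000, lr=0.1):
    """Fit a tiny linear student y = w*x + b to the teacher's outputs."""
    # Query the teacher once, up front. These outputs are the training targets.
    targets = [teacher(x) for x in inputs]
    n = len(inputs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in zip(inputs, targets):
            err = (w * x + b) - t
            # Gradient of the mean squared error with respect to w and b.
            grad_w += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# A specific, narrow set of inputs: an evenly spaced grid on [-0.5, 0.5].
inputs = [i / 100.0 - 0.5 for i in range(101)]
w, b = train_student(inputs)


def student(x):
    # After training, the student never calls the teacher.
    return w * x + b


# Error inside the training range versus far outside it.
in_range_error = abs(student(0.2) - teacher(0.2))
out_of_range_error = abs(student(3.0) - teacher(3.0))
```

In this toy setup the student tracks the teacher closely inside the range it was trained on and drifts far off outside it, which is the coverage question in miniature.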



Can you please share the relevant code that trains such a tiny student model, one that can operate independently of the big teacher model after training? The repository has no such code.



