take millions of games of human players of certain rating only as your learning ...

Someone · on Oct 18, 2024

In the context of this thread (“non-GM level computer chess”, which I read as also excluding International, FIDE Master, and Candidate Master (https://en.wikipedia.org/wiki/Grandmaster_(chess))), I think it’s more important to not have a good learning algorithm.

Even ten thousand such games may already contain far more tactics than a player at the targeted level can detect and apply. If so, a learning algorithm that detects and remembers all of them will already be better than the target level.
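A back-of-the-envelope sketch of that argument, under assumptions that are mine and purely illustrative: suppose any given tactic shows up in a single game with some small probability, independently from game to game. The function name and the probability below are hypothetical, not measured from real games.

```python
def fraction_of_tactics_seen(p_per_game: float, n_games: int) -> float:
    """Chance that a given tactic appears at least once in n_games.

    Assumes (hypothetically) that the tactic appears in each game
    independently with probability p_per_game.
    """
    return 1.0 - (1.0 - p_per_game) ** n_games


# Hypothetical numbers: a tactic that appears in only 1 game out of 1000.
rare_tactic = fraction_of_tactics_seen(p_per_game=0.001, n_games=10_000)
print(f"chance the corpus contains it: {rare_tactic:.6f}")
```

Under these assumptions even a rare tactic is almost certain to appear somewhere in the corpus, so a learner that memorizes everything it sees ends up knowing more than any single player who produced the games.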

netdevnet · on Oct 18, 2024

Exactly. "Level X" (whatever scalar the user meant by that) doesn't quite work out, for the reason you outlined. Level-X players each have different tactics, and someone who can use all of them will likely be better than most, if not all, of those players. I got downvoted for saying that. Maybe I didn't phrase it as well as you did.

wavemode · on Oct 18, 2024

Yeah, but won't it also be learning from the mistakes and missed tactics? (Assuming its reward function tells it to predict the human's move, rather than to actually try to win.)

WithinReason · on Oct 18, 2024

Condition the move on Elo while training.
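A minimal sketch of what conditioning on rating could look like, using a toy count-based predictor rather than a neural network. Everything here is an illustrative assumption: the class name, the 200-point rating buckets, the position labels, and the sample data are all made up, and a real system would feed the rating into a learned model instead of a lookup table.

```python
from collections import Counter, defaultdict


def rating_bucket(elo: int, width: int = 200) -> int:
    """Round a rating down to the start of its bucket (hypothetical 200-point bins)."""
    return (elo // width) * width


class EloConditionedPolicy:
    """Toy move predictor: most frequent move per (position, rating bucket)."""

    def __init__(self, width: int = 200) -> None:
        self.width = width
        self.counts = defaultdict(Counter)

    def train(self, samples) -> None:
        # Each sample is (position, player_elo, move_played).
        for position, elo, move in samples:
            key = (position, rating_bucket(elo, self.width))
            self.counts[key][move] += 1

    def predict(self, position, elo):
        # Return the most common move for this position at this rating,
        # or None if that (position, bucket) pair was never seen.
        counter = self.counts.get((position, rating_bucket(elo, self.width)))
        if not counter:
            return None
        return counter.most_common(1)[0][0]


# Made-up data: in "pos_a" the tactic is Nd5; weaker players mostly play h3.
samples = [
    ("pos_a", 1200, "h3"), ("pos_a", 1250, "h3"), ("pos_a", 1300, "Nd5"),
    ("pos_a", 2000, "Nd5"), ("pos_a", 2100, "Nd5"), ("pos_a", 2150, "h3"),
]
policy = EloConditionedPolicy()
policy.train(samples)
print(policy.predict("pos_a", 1250), policy.predict("pos_a", 2050))
```

The point of the sketch is only that the same position can map to different predicted moves depending on the rating passed in, so the missed tactic is reproduced at the lower rating instead of being averaged away. Whether rating is a reliable enough signal for that is a separate question.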

netdevnet · on Oct 18, 2024

You are assuming that's going to be a reliable proxy. What makes you think that?