Yeah, it doesn't win in every category. I will say, watching it in the Discord, I saw its performance vary widely, so the context and system prompt play a huge role. Initially it did great and solved some pretty heavy logic questions, but after the context got loaded with trolling it degraded quite a bit and couldn't solve problems it had previously been able to.