Kimi K2 Thinking, MiniMax M2 with interleaved thinking: open models are reaching, or have already reached, frontier territory. Because they are open-weight, we can now run models with GPT- and Claude-Sonnet-level capability at home. Around this time last year we had the DeepSeek moment; now feels like the time for another one.
Benchmarks suggest open models are on par with SOTA closed ones, but my own experience and real-world use show the opposite. I really wish they were closer; I run GPT-OSS 120b as a daily driver.
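For anyone curious what "daily driver" looks like in practice, here is a minimal sketch of talking to a locally served model through an OpenAI-compatible chat-completions endpoint (the kind llama.cpp, Ollama, or vLLM expose). The URL, port, and model name below are assumptions; adjust them to whatever your local server actually serves.

```python
# Minimal sketch: build a chat-completion request for a locally hosted
# open-weight model behind an OpenAI-compatible API. Only the standard
# library is used; the request is constructed but not sent here.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint
MODEL = "gpt-oss-120b"  # assumed model identifier on the server

def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion POST request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Summarize the DeepSeek moment in one sentence.")
# To actually query the server: urllib.request.urlopen(req)
print(req.get_full_url())
```

Sending the request with `urllib.request.urlopen(req)` returns the usual OpenAI-style JSON with a `choices` list, assuming your server implements that schema.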
It could be that remote inference providers have issues, so the model either cannot show its full potential or gets rate-limited. I also think Moonshot could take more time and follow up with a K2.1 or something similar, the way DeepSeek did.