Apparently it's designed for mobile inference too; I've heard the weights on the nano model were quantized down to uint4.
Will be exciting to see how all of that plays out in terms of 'LLMs on phones', going forward.
People who know me know that I can be pretty curmudgeonly about a lot of technological things, but I really think that this could be a hardcore paradigm shift in terms of mobile capabilities, lol.
Like, the real story here is the next step in the evolution of the role of mobile devices in people's lives. This is one of the biggest/clearest/most official 'shots across the bow' that one could make for something like this, I think, lol.