The arithmetic issues are well documented and understood: it's a problem of sub-token manipulation, which has nothing to do with reasoning. (It's similar to calling blind people unintelligent because they can't read the IQ test.)
And the better LLMs can easily write code to do the attention that they suck at...
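As a rough illustration of the kind of code that sidesteps the arithmetic problem mentioned above (a minimal sketch, not output from any particular model; the helper name `exact_sum` and the sample numbers are made up for the example), the interpreter does the digit-level work exactly, so nothing depends on how the numbers were tokenized:

```python
from fractions import Fraction


def exact_sum(numbers):
    """Add numbers given as decimal strings without floating-point rounding.

    Hypothetical helper for illustration: Fraction parses strings such as
    "0.1" exactly, so the sum carries no binary rounding error.
    """
    return sum(Fraction(n) for n in numbers)


# Python integers are arbitrary precision, so large products are exact.
a = 123456789123456789
b = 987654321987654321
product = a * b

# Decimal strings are summed exactly, unlike 0.1 + 0.2 with floats.
total = exact_sum(["0.1", "0.2"])
```

The point is only that once the numbers are handed to a program, sub-token issues stop mattering.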
Excellent analogy. LLMs are capable of many extraordinary things, and it’s a shame people dismiss them because they fail to live up to some specific test they invented.