If you look at compute scaling and model improvement on that compute, we’re going to get there pretty fast.
Compute, architecture, and cost all matter, and all three have been improving like crazy.
I don’t know about you, but I didn’t expect GPT-4o to come out at half the price one year later, with real-time voice, image and text.
There is zero sign of slowing: Nvidia keeps building beefier GPUs specialized for LLMs, the world keeps adding data centers, and humanity keeps creating exponentially more data worldwide.
You can plot this exponential growth out over time and calculate when these models will have the complexity of the brain. Then you can assume some penalty for shitty architecture (that gets better over time), and you’ll have a ballpark estimate.
Somewhere in 2027-2030 we’ll have models that most of us consider AGI today. Societal impact, though, is hard to predict, since that’ll depend on regulators as well.
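For what it’s worth, here’s a toy version of that extrapolation in Python, where every number (current frontier training compute, the “brain-equivalent” target, the architecture penalty, the doubling time) is an assumption picked purely for illustration, not a real estimate:

```python
import math

# Toy extrapolation, not a forecast: every number below is an assumption.
current_flops = 1e25        # assumed training compute of a recent frontier run
target_flops = 1e26         # assumed "brain-complexity" compute, a pure guess
architecture_penalty = 10   # assumed factor lost to inefficient architectures
doubling_time_years = 0.75  # assumed doubling time for frontier training compute

gap = (target_flops * architecture_penalty) / current_flops
years = math.log2(gap) * doubling_time_years
print(f"~{years:.1f} years until the extrapolated crossover")  # ~5.0 with these guesses
```

Change any of the guesses by an order of magnitude and the date shifts by a couple of years, which is roughly how seriously the ballpark should be taken.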
LLMs don't reason. Without reasoning, you don't have AGI.
I can't wait until we're past the peak on the hype cycle because people's understanding of what these models can and can't do or how close we are to AGI isn't grounded in reality.
It seems unclear to me what it is that LLMs can’t or don’t do. I agree that current models have substantial limitations, but I have little idea which of those limitations are fundamental to the architecture, which come from limits of scale, which come from limits of the training data (under current training strategies), and which come from the overall training methods.
Do you have some insights you could share on this question?
I mean, “they can’t reason” might be true, but I’m not exactly sure what that claim says they can’t do.
Not OP, and not a computer scientist. But amateur neuroscientist.
From what I understand, current models seem to have issues with planning and reasoning because they “blurt” everything out zero-shot. They can’t recurrently process information.
Part of it is how we train these models, which is rapidly being experimented with and improved.
Another part is that our brain can both consciously and unconsciously process information recurrently until we solve a problem. Especially helpful when we haven’t solved a problem before.
With LLMs, we can approximate that rudimentarily with recurrent prompting: the model can see the steps it has tried, re-evaluate, and come up with a new attempt.
But it’s not innate yet, where models can “think” about a complex problem until they’ve solved it.
Also: according to some theories of consciousness (integrated information theory, for example), this is when consciousness would really emerge.
I’m pretty sure we’ll be able to solve this recurrent processing issue in the next few years.
> From what I understand, current models seem to have issues with planning and reasoning because they “blurt” everything out zero-shot. They can’t recurrently process information.
IMO, this is not necessarily a model issue, but an interface issue - recurrently processing information is enabled by just feeding the model's output back to it, perhaps many times, before surfacing it to the user. Right now, with the interfaces we're exposed to - direct evaluation, perhaps wrapped in a chatbot UI - the user must supply the recursion themselves.
> But it’s not innate yet, where models can “think” about a complex problem until they’ve solved it.
That's indeed the part I see us struggling with: we can make the model recursive, but we don't know how to do it unsupervised, in a way that would let us box it behind an abstraction layer, let it work stuff out on its own, and break the recursion at the right moment.
I think the mental model I developed a good year ago still stands: an LLM isn't best compared to a human mind, but to a human's inner voice: the bit that "blurts everything out zero-shot" into your consciousness. It has the same issues of overconfidence, hallucinations, and needing to be recursively fed back to itself.
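To make the "feed it back to itself" idea concrete, here's a minimal sketch of such a loop, assuming a hypothetical `generate(prompt)` function standing in for whatever model API you use; the "DONE" marker is an assumed, naive stop condition, which is exactly the part nobody has solved well:

```python
# Naive self-refinement loop: feed the model's own output back to it until it
# declares itself done or we run out of budget. `generate` is a placeholder for
# any LLM call; the "DONE" marker is an assumed, crude stop condition.
def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def solve(problem: str, max_rounds: int = 5) -> str:
    draft = generate(f"Problem: {problem}\nGive a first attempt.")
    for _ in range(max_rounds):
        revised = generate(
            f"Problem: {problem}\nPrevious attempt:\n{draft}\n"
            "Critique the attempt, then write an improved one. "
            "End with the line DONE if no further improvement is needed."
        )
        draft = revised
        if revised.strip().endswith("DONE"):
            break  # crude stand-in for "breaking recursion at the right moment"
    return draft
```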
Think about the difference between what an LLM does and what a human does in response to being told they are wrong.
If you tell a human, “You are wrong on this, and here is why,” they may reject it, but their mental model updates with the new information that you, the speaker, think X for YZQ reasons. Their response is based on a judgment of how trustworthy you are and how credible the evidence is.
For an LLM, the response is not based on these logical connections, but simply on the additional prompt context of the YZQ tokens being close to each other.
This is not “logic” in any traditional sense or in the sense of how a human incorporates and responds to this new information.
The LLM’s method of responding is also inherent to the architecture of the model. It’s predicting tokens based on input. It’s not reasoning.
Critically, this flaw is inherent in all LLM output. Giving an LLM’s output the power to affect real world activities means trusting that the decision can be made by sophisticated word association rather than more complex reasoning.
There may be lots of decisions where word association is all you need, but I doubt that is the case for all decisions humans make.
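For concreteness, this is roughly what “predicting tokens based on input” looks like at the decoding level; a minimal greedy-decoding sketch using the Hugging Face transformers library with GPT-2 (chosen only because it’s small and public, not because production chatbots work exactly like this):

```python
# Minimal greedy next-token loop: the model only ever scores "which token comes
# next given everything so far"; anything that looks like reasoning has to come
# out of repeating this single step.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("You are wrong on this, and here is why:",
                      return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits          # scores for every vocabulary token
        next_id = torch.argmax(logits[:, -1, :])  # take the single most likely next token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```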
And still, Yann LeCun (Chief AI Scientist at Meta, renowned AI/ML researcher) is convinced that we are far from reaching AGI. He makes convincing arguments, especially around the fact that we are not able to expose models to the amount of redundant information that even a young kid is exposed to.
I guess we'll see some shifting goals around "what is even AGI". You say that we'll soon have "models that most of us consider AGI today", but what does that even mean?
The point from Yann LeCun I find interesting is the negative-space argument: as training data and parameters increase, the "best" next tokens represent a smaller and smaller slice of the model. His contention is that this means more hallucinations and more places to get stuck on less-than-best next tokens, which is interesting to think about as the opposite of how scaling laws are typically presented. A lot of smart people are stabbing around in the dark right now, and only time (and gazillions in GPUs) will tell.
They are trained on a ridiculously small amount of content compared to what the brain of a 3-year-old child is exposed to. Current models still have a very narrow application field compared to what a human can achieve.
AGI seems like a stretch when none of the LLMs right now can solve the prompt:
“
Write a sentence that has an odd number of words, with the third word being "keyboard", the last word being "anyway", the next-to-last word being a number, and the first word being a palindrome
“
Do we have to add these examples to the dataset now, so that they can solve this? Then maybe we find another prompt and add that too. As a human, I’ve never seen such a question before, so why do I seem to be able to do it so easily? We are creating some form of intelligence that will have a massive economic impact, but I doubt we are creating human-like intelligence with the ability to be far better than humans along every dimension, which is what superintelligence is.
I would prefer it to give the number as a digit, e.g. 5 instead of five. Also, I don’t like to tell it what exactly it got wrong, just that it got it wrong. Saying things like “think step by step” or “focus on the rules” is all fair game, while saying “keyboard is the second word, not the third” imo is not. I tried quite a bit when I saw this example on Twitter and could never get it to work.
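To make “did it actually solve it?” unambiguous, here’s a small checker for those constraints; it counts the next-to-last word as a number only when it is written in digits, per the preference above, and the sample sentence is just one hand-written solution:

```python
import string

# Checker for the constraints in the prompt above; treats the next-to-last
# word as a number only when it is written in digits.
def check(sentence: str) -> dict:
    words = [w.strip(string.punctuation) for w in sentence.split()]
    return {
        "odd number of words": len(words) % 2 == 1,
        "third word is keyboard": len(words) >= 3 and words[2].lower() == "keyboard",
        "last word is anyway": len(words) >= 1 and words[-1].lower() == "anyway",
        "next-to-last word is a number": len(words) >= 2 and words[-2].isdigit(),
        "first word is a palindrome": len(words) >= 1
            and words[0].lower() == words[0].lower()[::-1],
    }

# One hand-written sentence that satisfies all five constraints.
print(check("Wow, my keyboard just broke again today, costing me 100 anyway."))
```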
Why is this a goalpost for just about anything? This seems completely uncorrelated, and I'm sure many fully grown adults would fail under similar timing conditions too.
> You can plot this exponential growth out over time and calculate when these models will have the complexity of the brain. Then you can assume some penalty for shitty architecture (that gets better over time), and you’ll have a ballpark estimate.
The same thing could’ve been said for self driving cars, or the space program, or a lot of things that seemed to be progressing quickly at the time.
And if you did say it, you would have been a lot more correct than if you'd said they won't amount to anything. Robotaxis with nobody in the driver's seat are available in three major US cities; most people use a network of navigational satellites every time they want to figure out how to get somewhere new.
Not really. You can only make these predictions about things bottlenecked by inputs that improve exponentially, such as compute.
Neither the space program nor self-driving is compute-constrained.
The latter will probably be “solved” as we get closer to AGI, since its edge cases need some sort of human-like reasoning.
Another tech like this is batteries: there’s no miracle jump in production batteries; they just improve by about 10% YoY, in both energy density and cost.
So you can extrapolate when electric cars will be cheaper than gas cars to buy.
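As a rough illustration of that extrapolation (the starting pack price and the crossover threshold below are assumptions for the sake of the arithmetic, not real market figures):

```python
# Rough version of the "10% cheaper per year" extrapolation above; the starting
# price and the crossover threshold are assumptions, not real market figures.
pack_price = 140.0   # assumed battery pack price today, $/kWh
crossover = 80.0     # assumed $/kWh at which EVs undercut gas cars on sticker price
years = 0
while pack_price > crossover:
    pack_price *= 0.90   # ~10% year-over-year decline, per the comment
    years += 1
print(f"Crossover after ~{years} years, at about {pack_price:.0f} $/kWh")  # ~6 years
```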
Even Sam says there are 2 breakthroughs needed before AGI is possible. Who knows how difficult that leap will be. Humans are still highly capable compared to LLMs.
And let’s not forget that Sam Altman is the CEO of OpenAI. He clearly has a vested interest in creating hype around AI, AGI, etc. But when I point this out on Reddit, for some reason I get downvoted.