
It seems unclear to me what it is that LLMs can’t or don’t do. I agree that current models have substantial limitations, but I have little idea which of those limitations are fundamental to the architecture, which result from limits of scale, which from limits of the training data (under the current training strategies), and which from the overall training methods.

Do you have some insights you could share on this question?

I mean, “they can’t reason” might be true, but I’m not exactly sure what that claim says they can’t do.



Not OP, and not a computer scientist, but an amateur neuroscientist.

From what I understand, current models seem to have issues with planning and reasoning because they “blurt” everything out zero-shot. They can’t recurrently process information.

Part of it is how we train these models, which is rapidly being experimented with and improved.

Another part is that our brain can both consciously and unconsciously process information recurrently until we solve a problem, which is especially helpful when we haven’t solved that kind of problem before.

With LLMs, we can do that in a rudimentary way with recurrent prompting: the model can see the steps it has already tried, re-evaluate them, and come up with a new attempt.
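
For illustration, here is a minimal sketch of that kind of manual recurrent-prompting loop. It assumes a hypothetical ask_llm(prompt) -> str function standing in for whatever completion API you actually use; the loop simply keeps showing the model its own previous attempts.

    # Minimal sketch of manual recurrent prompting. ask_llm(prompt) -> str is a
    # hypothetical stand-in for any real completion API.
    def recurrent_prompt(ask_llm, problem: str, rounds: int = 3) -> str:
        attempts = []
        for _ in range(rounds):
            history = "\n\n".join(
                f"Previous attempt {i + 1}:\n{a}" for i, a in enumerate(attempts)
            )
            instruction = (
                "Re-evaluate the previous attempts, point out any mistakes, "
                "and produce an improved answer."
                if attempts
                else "Solve the problem step by step."
            )
            prompt = f"Problem:\n{problem}\n\n{history}\n\n{instruction}"
            attempts.append(ask_llm(prompt))
        return attempts[-1]  # the latest, hopefully most refined, attempt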

But it’s not innate yet: models can’t yet keep “thinking” about a complex problem until they’ve solved it.

Also: according to some theories of consciousness, such as integrated information theory, this is when consciousness will really emerge.

I’m pretty sure we’ll be able to solve this recurrent processing issue in the next few years.


> From what I understand, current models seem to have issues with planning and reasoning because they “blurt” everything out zero-shot. They can’t recurrently process information.

IMO, this is not necessarily a model issue but an interface issue: recurrent processing is enabled by just feeding the model's output back to it, perhaps many times, before surfacing it to the user. Right now, with the interfaces we're exposed to (direct evaluation, perhaps wrapped in a chatbot UI), the user must supply the recursion themselves.

> But it’s not innate yet: models can’t yet keep “thinking” about a complex problem until they’ve solved it.

That's indeed the part I see us struggling with: we can make the model recursive, but we don't know how to do it unsupervised, in a way that would let us box it behind an abstraction layer, letting it work things out on its own and break the recursion at the right moment.
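
As a sketch of what that abstraction layer could look like in principle (again assuming a hypothetical ask_llm(prompt) -> str stand-in; the genuinely hard, unsolved part is the is_done check that breaks the loop at the right moment):

    # Hypothetical sketch of boxing the recursion behind an abstraction layer.
    # ask_llm(prompt) -> str stands in for any completion API; is_done is the
    # unsolved piece: a reliable way to break the loop at the right moment.
    def solve(ask_llm, problem, is_done, max_steps=10):
        thought = ask_llm(f"Think step by step about:\n{problem}")
        for _ in range(max_steps):
            if is_done(thought):  # the hard part: knowing when to stop
                break
            thought = ask_llm(
                f"Problem:\n{problem}\n\n"
                f"Your thinking so far:\n{thought}\n\n"
                "Continue reasoning, correct any mistakes, and refine the answer."
            )
        return thought  # only this final result is surfaced to the user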

I think the mental model I developed a good year ago still stands: an LLM isn't best compared to a human mind, but to a human's inner voice, the bit that "blurts everything out zero-shot" into your consciousness. It has the same issues of overconfidence, hallucinations, and needing to be recursively fed back to itself.


Think about the difference between what an LLM is doing and what a human is doing in response to being told they are wrong.

If you tell a human, “You are wrong on this, and here is why,” they may reject it, but their mental model updates with the new information that you, the speaker, think X for YZQ reasons. Their response is based on a judgment of how trustworthy you are and how credible the evidence is.

For an LLM, the response is not based on these logical connections, but simply on the additional prompt context of the YZQ tokens being close to each other.

This is not “logic” in any traditional sense or in the sense of how a human incorporates and responds to this new information.

The LLM’s method of responding is also inherent to the architecture of the model. It’s predicting tokens based on input. It’s not reasoning.

Critically, this flaw is inherent in all LLM output. Giving an LLM’s output the power to affect real-world activities means trusting that the decision can be made by sophisticated word association rather than more complex reasoning.

There may be lots of decisions where word association is all you need, but I doubt that is the case for all decisions humans make.



