
Here's a purely academic question about AI / Language Models:

What is the smallest, tiniest number of parameters that will result in a language model which understands how to add, multiply, subtract, divide, handle "if...then" type constructs, and perform loops?

In other words, what is the smallest number of parameters for a language model that will result in a Turing-complete (https://en.wikipedia.org/wiki/Turing_completeness) AI LM -- regardless of how few words it would recognize?


I submit that question to all of the (L)LM AI researchers in the world.

Tiniest Turing-complete Language Model, please!



LLMs cannot perform unbounded loops (excepting some loops reducible by static analysis) unless you add a loop around the LLM. In addition, LLMs inherently have a finite input size (or a finite state, if you count the loop around it), and therefore can only compute a limited subset of Turing-computable functions. You’d have to add the “tape” for unlimited storage and allow the LLM to operate it. Given the additional loop and tape, then an LLM (or an SLM) can trivially be Turing-complete, because that only requires a small state machine that such models can easily implement.
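
To make that concrete, here is a minimal sketch (in Python) of that "loop plus tape" arrangement. The llm_step function is hypothetical -- it stands in for one forward pass of the model acting as the transition function of a Turing machine:

    import collections

    def run_llm_as_turing_machine(llm_step, max_steps=None):
        # Unbounded external tape; the model only ever sees the symbol
        # under the head plus a short state string, never the whole tape.
        tape = collections.defaultdict(lambda: "_")  # "_" is the blank symbol
        head, state, steps = 0, "START", 0
        while state != "HALT":
            # llm_step is a hypothetical wrapper around one model call:
            # (state, symbol) -> (new_state, symbol_to_write, "L" or "R")
            state, write, move = llm_step(state, tape[head])
            tape[head] = write
            head += 1 if move == "R" else -1
            steps += 1
            if max_steps is not None and steps >= max_steps:
                break  # the outer harness, not the model, decides to give up
        return dict(tape)

The Turing-machine logic lives almost entirely in the model's transition behavior; the loop and the tape are ordinary scaffolding around it.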

IMO the question doesn’t make sense due to how LLMs are structured, and on the other hand not a lot is needed to make a system Turing-complete. Turing-completeness and the capabilities of LLMs are mostly orthogonal aspects.

Turing-completeness isn’t even necessarily desirable, because that would mean that the system can get stuck in an infinite loop, never producing an answer.


Edit: Regarding the last point, AI chatbots sometimes already do get into a seemingly infinite loop, outputting the same token sequence over and over, and only being stopped by the controlling software, so that wasn’t really a good argument.


There are papers written about this.

https://openreview.net/pdf/0c9a0f445d430a4d2bf9a08078c435091...

Without a loop, an LLM takes a fixed-length sequence of tokens and produces a fixed-length sequence of tokens. It has some internal state.

So without a loop an LLM cannot be a general purpose computer (Turing machine).

Turing machines are simple. They need 3 things: an infinitely long tape, internal state, and the ability to conditionally write to the tape and move it based on what is on the tape.

So if an LLM had access to infinite memory, all it would need is to implement a subleq instruction, which is quite simple, and run itself in a loop.

See https://en.m.wikipedia.org/wiki/One-instruction_set_computer

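For reference, here is roughly how small that single instruction is -- a plain-Python sketch of a subleq interpreter (the memory list stands in for the tape, and a negative jump target is used as the halt convention):

    def subleq(mem, pc=0):
        # One instruction: subtract mem[a] from mem[b];
        # if the result is <= 0, jump to c, otherwise fall through.
        while pc >= 0:
            a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
            mem[b] -= mem[a]
            pc = c if mem[b] <= 0 else pc + 3
        return mem

Everything else (arithmetic, branching, loops) can be built out of that one operation.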

See this paper from DeepMind, which talks about compiling existing code into transformer primitives: https://arxiv.org/pdf/2301.05062.pdf

So the smallest ML model that can do arbitrary math with () * / + - operators and floating/int numbers is not that complex.

Perhaps a few thousand parameters. One can compile existing code into transformer weights.
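
If anyone wants to poke at that, the paper's companion library (tracr) does exactly this compilation. The snippet below is adapted from memory of its README, so treat the exact API as approximate:

    from tracr.rasp import rasp
    from tracr.compiler import compiling

    # A RASP program that outputs the sequence length at every position.
    length = rasp.SelectorWidth(
        rasp.Select(rasp.tokens, rasp.tokens, rasp.Comparison.TRUE))

    # Compile it down to actual transformer weights.
    model = compiling.compile_rasp_to_model(
        length, vocab={1, 2, 3}, max_seq_len=5, compiler_bos="BOS")

    out = model.apply(["BOS", 1, 2, 3])
    print(out.decoded)  # something like ["BOS", 3, 3, 3]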

Now that you mention it, I've been nerd-sniped into building it.


GPT-1 with 100 million parameters is sufficient.


It's interesting to see two replies to the grandfather, one saying this has existed for years, and one saying this will never exist.


Wondering if anyone actually cares about this? Turing-complete LMs are an idea I've had for years, but I'm not sure anyone would ever be interested in it. Does anyone know about the academic publishing process?



People saying "it can't be done" are always interrupted by someone doing it, but not always very well.


Such an LLM will never exist.



