
Here's a purely academic question about AI / Language Models:

What is the smallest, tiniest number of parameters that will result in a language model which understands how to add, multiply, subtract, divide, handle "if...then" type constructs, and perform loops?

In other words, what is the smallest number of parameters for a language model that will result in a Turing-complete (https://en.wikipedia.org/wiki/Turing_completeness) AI LM -- regardless of how few words it would recognize?


I submit that question to all of the (L)LM AI researchers in the world.

Tiniest Turing-complete Language Model, please!



LLMs cannot perform unbounded loops (excepting some loops reducible by static analysis) unless you add a loop around the LLM. In addition, LLMs inherently have a finite input size (or a finite state, if you count the loop around it), and therefore can only compute a limited subset of Turing-computable functions. You’d have to add the “tape” for unlimited storage and allow the LLM to operate it. Given the additional loop and tape, then an LLM (or an SLM) can trivially be Turing-complete, because that only requires a small state machine that such models can easily implement.
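
To make that concrete, here is a minimal sketch (in Python) of that "loop plus tape" arrangement. The llm_step function is hypothetical -- it stands in for one forward pass of the model acting as the transition function of a Turing machine:

    import collections

    def run_llm_as_turing_machine(llm_step, max_steps=None):
        # Unbounded external tape; the model only ever sees the symbol
        # under the head plus a short state string, never the whole tape.
        tape = collections.defaultdict(lambda: "_")  # "_" is the blank symbol
        head, state, steps = 0, "START", 0
        while state != "HALT":
            # llm_step is a hypothetical wrapper around one model call:
            # (state, symbol) -> (new_state, symbol_to_write, "L" or "R")
            state, write, move = llm_step(state, tape[head])
            tape[head] = write
            head += 1 if move == "R" else -1
            steps += 1
            if max_steps is not None and steps >= max_steps:
                break  # the outer harness, not the model, decides to give up
        return dict(tape)

The Turing-machine logic lives almost entirely in the model's transition behavior; the loop and the tape are ordinary scaffolding around it.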

IMO the question doesn’t make sense due to how LLMs are structured, and on the other hand not a lot is needed to make a system Turing-complete. Turing-completeness and the capabilities of LLMs are mostly orthogonal aspects.

Turing-completeness isn’t even necessarily desirable, because that would mean that the system can get stuck in an infinite loop, never producing an answer.


Edit: Regarding the last point, AI chatbots sometimes already do get into a seemingly infinite loop, outputting the same token sequence over and over, and only being stopped by the controlling software, so that wasn’t really a good argument.


There are papers written about this.

https://openreview.net/pdf/0c9a0f445d430a4d2bf9a08078c435091...

Without a loop, an LLM takes a fixed-length sequence of tokens and produces a fixed-length sequence of tokens. It has some internal state.

So without a loop an LLM cannot be a general purpose computer (Turing machine).

Turing machines are simple. They need 3 things: an infinitely long tape, internal state, and the ability to conditionally write to the tape and move it based on what is on the tape.

So if an LLM had access to infinite memory, all it would need is to implement a subleq instruction, which is quite simple, and run itself in a loop.

See https://en.m.wikipedia.org/wiki/One-instruction_set_computer

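For reference, here is roughly how small that single instruction is -- a plain-Python sketch of a subleq interpreter (the memory list stands in for the tape, and a negative jump target is used as the halt convention):

    def subleq(mem, pc=0):
        # One instruction: subtract mem[a] from mem[b];
        # if the result is <= 0, jump to c, otherwise fall through.
        while pc >= 0:
            a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
            mem[b] -= mem[a]
            pc = c if mem[b] <= 0 else pc + 3
        return mem

Everything else (arithmetic, branching, loops) can be built out of that one operation.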

See this paper from DeepMind, which talks about compiling existing code into transformer primitives: https://arxiv.org/pdf/2301.05062.pdf

So the smallest ML model that can do arbitrary math with () * / + - operators and floating/int numbers is not that complex.

Perhaps a few thousand parameters. One can compile existing code into transformer weights.
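
If anyone wants to poke at that, the paper's companion library (tracr) does exactly this compilation. The snippet below is adapted from memory of its README, so treat the exact API as approximate:

    from tracr.rasp import rasp
    from tracr.compiler import compiling

    # A RASP program that outputs the sequence length at every position.
    length = rasp.SelectorWidth(
        rasp.Select(rasp.tokens, rasp.tokens, rasp.Comparison.TRUE))

    # Compile it down to actual transformer weights.
    model = compiling.compile_rasp_to_model(
        length, vocab={1, 2, 3}, max_seq_len=5, compiler_bos="BOS")

    out = model.apply(["BOS", 1, 2, 3])
    print(out.decoded)  # something like ["BOS", 3, 3, 3]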

Now that you mention it, I've been nerd-sniped into building it.


GPT-1 with 100 million parameters is sufficient.


It's interesting to see two replies to the grandfather, one saying this has existed for years, and one saying this will never exist.


Wondering if anyone actually cares about this? Turing-complete LMs are an idea I've had for years, but I'm not sure anyone would ever be interested in it. Does anyone know about the academic publishing process?



People saying "it can't be done" are always interrupted by someone doing it, but not always very well.


Such an LLM will never exist.



