Let's say a particular layperson wants to execute a task.
He gives (INPUT <=> OUTPUT) pairs. ChatGPT creates a "prompt" (== bytecode) which captures the essence of those transformations.
This process is called "Program Fitting", by analogy with line fitting or curve fitting to a list of data points.
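
The curve-fitting analogy can be made concrete with a toy sketch. Everything here is an assumption for illustration: the candidate "programs" are plain Python functions standing in for prompts, and the names CANDIDATES and fit_program are hypothetical. The fit picks the candidate with the fewest mismatches on the given pairs, much as curve fitting picks the parameters with the smallest error.

```python
from typing import Callable

# Hypothetical candidate "programs". In the setting of this note these would
# be prompts proposed by ChatGPT, not hand-written functions.
CANDIDATES: dict[str, Callable[[str], str]] = {
    "uppercase": str.upper,
    "reverse": lambda s: s[::-1],
    "title_case": str.title,
    "strip_spaces": lambda s: s.replace(" ", ""),
}


def fit_program(pairs: list[tuple[str, str]]) -> tuple[str, int]:
    """Return (name, error) of the candidate that mismatches the fewest pairs."""

    def error(name: str) -> int:
        fn = CANDIDATES[name]
        # Count the pairs this candidate fails to reproduce.
        return sum(fn(x) != y for x, y in pairs)

    best = min(CANDIDATES, key=error)  # ties go to the earliest candidate
    return best, error(best)


pairs = [("hello world", "HELLO WORLD"), ("abc", "ABC")]
fitted_name, fitted_error = fit_program(pairs)
result_on_new_data = CANDIDATES[fitted_name]("new data")
```

An error of 0 means the fitted program reproduces every example. A non-zero error plays the role of the residual in ordinary curve fitting.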
This bytecode can then be run efficiently on a smaller, distilled CVM (ChatGPT virtual machine),
carefully chosen by ChatGPT itself, since it knows which CVM will best execute the task.
The chosen CVM then runs the bytecode (= prompt) on new, similar data.
No need to run full ChatGPT. In effect, ChatGPT creates its own MoE (mixture-of-experts) setups.
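
The "fit once, then run on a small CVM" flow can be sketched as follows. This is a toy under stated assumptions, not an actual ChatGPT mechanism: the CVM class, the OPS table, the cost numbers, and the names cvm-tiny and cvm-small are all made up, and each "bytecode" is just a string naming one transformation. The selection rule tries CVMs from cheapest to most expensive and keeps the first one that reproduces every example pair.

```python
from dataclasses import dataclass
from typing import Callable

# Toy instruction set: each "bytecode" string names one transformation.
OPS: dict[str, Callable[[str], str]] = {
    "uppercase": str.upper,
    "reverse": lambda s: s[::-1],
}


@dataclass(frozen=True)
class CVM:
    """Hypothetical small distilled model that can execute some bytecodes."""

    name: str
    cost: int                   # relative cost per call (made-up numbers)
    supported: frozenset[str]   # bytecodes this CVM can execute

    def run(self, bytecode: str, text: str) -> str:
        if bytecode not in self.supported:
            raise ValueError(f"{self.name} cannot execute {bytecode!r}")
        return OPS[bytecode](text)


CVMS = [
    CVM("cvm-small", cost=5, supported=frozenset({"uppercase", "reverse"})),
    CVM("cvm-tiny", cost=1, supported=frozenset({"uppercase"})),
]


def choose_cvm(bytecode: str, pairs: list[tuple[str, str]]) -> CVM:
    """Pick the cheapest CVM that reproduces every example pair."""
    for vm in sorted(CVMS, key=lambda v: v.cost):
        try:
            if all(vm.run(bytecode, x) == y for x, y in pairs):
                return vm
        except ValueError:
            continue  # this CVM cannot run the bytecode; try a larger one
    raise LookupError("no CVM reproduces the examples")


vm = choose_cvm("reverse", [("abc", "cba")])
output = vm.run("reverse", "hello")  # run the same bytecode on new data
```

Validating against the original pairs before committing to a CVM is one possible design choice. It keeps the large model out of the loop at run time while still checking that the small one is adequate for the task.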