I said that the context window improved; I meant that it is larger. GPT-3.5 has a 4k-token window, while GPT-4 has 8k tokens (standard) or 32k tokens (API access only atm). This is the number of tokens that GPT-X can take into account when producing a response.
Specifically, I was using this to support the statement "In the very near future your IDE will send the whole codebase as context to LLMs." I'm not talking about loops or accuracy.
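To make the "whole codebase as context" point concrete: here is a rough sketch of how an IDE could check whether a project fits inside a given context window. The ~4 characters-per-token ratio is a crude assumption (a real tool would use an actual tokenizer), and the function names are made up for illustration.

```python
import os

# Assumption: roughly 4 characters per token for English text and code.
# This is a ballpark heuristic, not an exact tokenizer.
CHARS_PER_TOKEN = 4

def estimate_tokens(root, exts=(".py", ".java")):
    """Estimate the total token count of source files under `root`."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total_chars += len(f.read())
    return total_chars // CHARS_PER_TOKEN

def fits_in_window(root, window=32_000):
    """Would the whole codebase fit in a context window of `window` tokens?"""
    return estimate_tokens(root) <= window
```

Even with 32k tokens, only small projects fit whole; anything larger needs chunking or retrieval on top.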
It's true, but there is no indication that GPT can explain larger concepts for you, and if anything, indications that it won't be able to do so accurately.
It can't even explain small pieces of code to me unless it is something it has been trained on. Often it gets even simple things wrong, either obviously, or worse, subtly wrong.
I agree that this is the part that needs more work, and is most uncertain. Increasing context windows seems like a fairly straightforward computational challenge (albeit potentially expensive). On the other hand, whether we can scale current models towards "true understanding" (or similar) is a total unknown atm.
I still think we will get useful things from scaling up current models though. I've already got a lot of value out of Copilot, for instance, and I'm looking forward to the next version based on GPT-4. Recently, I've been using the GPT-3 Copilot to write a lot of pandas/matplotlib code, which is fairly straightforward and repetitive, but as mainly a Java developer, I just don't have the APIs at my fingertips. Copilot helps a lot with this sort of thing.
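For a sense of the kind of straightforward, repetitive pandas/matplotlib code meant here, this is the sort of boilerplate Copilot autocompletes well. The data and column names are entirely made up for illustration.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Hypothetical monthly figures; columns invented for the example.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [120, 135, 150, 160],
    "costs": [100, 105, 110, 118],
})

# Typical boilerplate: derive a column, then plot with axis labels.
df["profit"] = df["revenue"] - df["costs"]

fig, ax = plt.subplots()
ax.bar(df["month"], df["profit"])
ax.set_xlabel("Month")
ax.set_ylabel("Profit")
ax.set_title("Monthly profit")
fig.savefig("profit.png")
```

None of this is hard, but if you don't use these APIs daily, having the calls filled in for you saves a lot of documentation lookups.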