I wonder if one reason new versions of GPT appear to get better, say at coding tasks, is just that they have newer knowledge.
When ChatGPT 4 comes out, the newest versions of APIs have fewer blog posts / examples / documentation in its training data. So ChatGPT 5 comes out and seems to solve all the problems that ChatGPT 4 had, but then of course fails on even newer libraries. Rinse and repeat.
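A concrete illustration of the pattern: the openai Python SDK changed its interface between v0.x and v1.x, and a model whose training data mostly predates v1 tends to keep emitting the old call. The two snippets below reflect the real v0/v1 interfaces, but the specific library is just an example of the general effect.

```python
# What a model trained mostly on pre-v1 data tends to write
# (openai-python v0.x style, removed in v1):
import openai

openai.api_key = "sk-..."
resp = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp["choices"][0]["message"]["content"])

# What current docs and examples actually use (v1.x style):
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```

A newer model trained after the v1 migration gets this right, which looks like "getting better at coding" even if nothing but the knowledge cutoff moved.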
I’ve heard of this idea of training on synthetic data. I wonder what that data actually is, and whether it increases or decreases hallucinations. Is the goal of training on synthetic data to wear already-worn paths deeper, or to increase the amount of knowledge / the variety of data?
Because the second seems vaguely impossible to do.
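For the "new knowledge" case there is at least one plausible recipe: feed fresh documentation to an existing model and have it generate worked examples to train on. Here is a minimal sketch of that idea; `ask_model` is a placeholder stub, the docs string describes a made-up library, and none of this is any lab's actual pipeline.

```python
# Sketch: turning new library docs into synthetic training examples.
# The prompt/loop structure is the point, not the specific model call.

# Hypothetical changelog for a made-up library.
NEW_API_DOCS = """widgets v2.0: Widget.create() was replaced by widgets.new();
the `color` argument is now required."""

def ask_model(prompt: str) -> str:
    # Placeholder: swap in a real LLM call here.
    return f"<model output for: {prompt[:40]}...>"

def make_synthetic_examples(docs: str, n: int = 100) -> list[dict]:
    examples = []
    for i in range(n):
        task = ask_model(
            f"Here are the docs for a new library version:\n{docs}\n"
            f"Invent a realistic coding task (#{i}) that uses the new API."
        )
        solution = ask_model(
            f"Docs:\n{docs}\n\nSolve this task using only the new API:\n{task}"
        )
        # Each (prompt, completion) pair becomes a fine-tuning record.
        examples.append({"prompt": task, "completion": solution})
    return examples
```

Whether training on that output wears grooves deeper or genuinely adds knowledge is exactly the question: the generating model can only restate what's in the docs plus what it already believed.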