I think there just isn't enough mathematical data. There is orders of magnitude more English-language text than mathematical text, and the documents are longer, so there is a correlation here.

The more data an LLM has, the better it can formulate a realistic model; with less data, its output is more of a statistical guess.

There is an argument to be made that more data just gives ChatGPT more things to copy and regurgitate, but given how vast the solution space is, I think the data at best covers less than 1 percent of it.

Basically, if your data covers, say, 2 percent of the solution space, you can train a better model than if it covered only 1 percent.
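To make that intuition concrete, here is a toy sketch (nothing to do with LLMs specifically; the target function, noise level, and polynomial degree are all made-up illustration choices): fitting a model on noisy samples that cover 2 percent of a space gives measurably lower test error than covering 1 percent, because more coverage averages out more of the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "solution space": 10,000 points of a smooth target function.
x_all = np.linspace(0, 2 * np.pi, 10_000)
y_all = np.sin(x_all)

def fit_and_test(coverage, trials=20):
    """Fit a degree-9 polynomial on a random `coverage` fraction of
    the space (with noisy labels) and return the average test MSE
    over the full space across `trials` runs."""
    errs = []
    for _ in range(trials):
        n = int(coverage * len(x_all))
        idx = rng.choice(len(x_all), size=n, replace=False)
        y_noisy = y_all[idx] + rng.normal(0, 0.3, size=n)
        coeffs = np.polyfit(x_all[idx], y_noisy, deg=9)
        errs.append(np.mean((np.polyval(coeffs, x_all) - y_all) ** 2))
    return np.mean(errs)

for cov in (0.01, 0.02):
    print(f"coverage {cov:.0%}: avg test MSE = {fit_and_test(cov):.2e}")
```

Doubling the coverage roughly halves the test error in this setup. It's a crude analogy, but it's the same basic statistics behind why a data-starved domain like math would produce worse models than a data-rich one like English.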