
>I don't know why you think language models are fundamentally unable to deduce the knowledge of the points you mention.

Because the knowledge is not there in the text, the models cannot represent it, and, as the demonstration above shows, they don't have it.



The demonstration is irrelevant. The issue isn't what GPT-3 specifically can or cannot do, but what this class of models can do in principle.

Knowledge reduces to particular kinds of information. Gradient descent discovers information by finding parameters that satisfy the training criteria. Given a large enough dataset that is sufficiently descriptive of the world, the "shape" of the world described by that data admits better and worse structures for predicting it. The organization and association of information that we call knowledge corresponds to a region of an LLM's parameter space, and there is no reason to think such a learning process cannot find that region.
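
That claim can be made concrete with a toy sketch (my own illustration in plain NumPy, not anything from the thread; names like W_true are made up for the example). The structure that generates the data never appears in the dataset itself, only its consequences do, yet descending the prediction loss recovers it:

  import numpy as np

  rng = np.random.default_rng(0)

  # Hidden structure that generates the data. It never appears in the
  # dataset; only its consequences (the x -> y pairs) do.
  W_true = rng.normal(size=(2, 3))

  X = rng.normal(size=(1000, 3))
  Y = X @ W_true.T + 0.01 * rng.normal(size=(1000, 2))

  # Fit parameters purely by descending the mean squared prediction error.
  W = np.zeros((2, 3))
  lr = 0.1
  for _ in range(500):
      err = X @ W.T - Y          # prediction error on the dataset
      grad = err.T @ X / len(X)  # gradient of the squared-error loss
      W -= lr * grad

  # The parameters that best predict the data approximate the generating
  # structure: the "knowledge" was deduced, not read off the data.
  print(np.max(np.abs(W - W_true)))  # small, on the order of the noise

The model here is linear and tiny, so it proves nothing about GPT-3 specifically; the point is only the learning dynamic: predictive pressure alone pushes the parameters toward the structure behind the data.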



