Training LLMs to generate text with citations via fine-grained rewards

serjester · on Feb 16, 2024

This is fantastic if this starts working on smaller models. We've have solid results with appending the following to our GPT-4 system prompt.

"When possible cite your sources. Use the following custom html format <citation>{document_id}†{node_id}</citation>. Very important!"

chasd00 · on Feb 17, 2024

I done testing here and I have seen hallucinations for all of it. It’s not that reliable.

3abiton · on Feb 16, 2024

The field is racing, at the speed of sound. 3.5 was yesterday.

th0ma5 · on Feb 16, 2024

It is my understanding there is no way to objectively or programmatically tell if any of this stuff is correct or doesn't obfuscate some dire error. These tricks don't give me confidence that we're headed in that direction either.

nerdponx · on Feb 16, 2024

Any LLM/GPT system is fundamentally stochastic, so yes, kind of? But it's hardly a "trick" to come up with a way to make them work the way we want them to work.

An LLM is based on the idea that there is some very complicated function

  y(t) = f(y(t-1), y(t-2), ..., y(t-K))

for any sequence of text tokens t(1), y(2), ... and some huge context size K.

We fit the model by attempting to find an approximation of f() with a low badness score (called "loss"). We also don't usually want the absolute lowest badness score, because there is always some tradeoff between between minimizing loss on the specific test data that we happen to have, and preserving the ability to generalize to content outside of the test data.

The technique here is an improvement to that process of finding a good approximation of f(). The specific technique here is to adjust the loss to mean something more specific than "error rate on predicting the next word in a sequence."

The entire basis of the principle is that the loss function controls what the model learns. If we want it to produce more sentences that look a certain way, we introduce a reward for generating sentences that look that way. If we want to avoid certain kinds of outputs, we introduce a penalty for generating those kinds of outputs. From there, it's just gradient descent. Better outputs produce lower losses, so the model starts producing better outputs, without us having to really know or care about what the model is doing internally.

The technique of RLHF is similar along those lines. We discourage the model to hallucinate by having a human review its output and report when the model hallucinates, so we can impose a penalty for hallucination, and thereby shift the model output in the direction of not-hallucinating after many such rounds of "learning".

Is it a fundamental shift in how LLMs work? No. Could it possibly lead to improvements? Yes.

catlifeonmars · on Feb 17, 2024

Doesn’t that imply that there is some topological structure to the hallucinations? What does that topology look like? How do you know that there are no discontinuous regions or that you are not stuck in a local minima?

nerdponx · on Feb 17, 2024

> Doesn’t that imply that there is some topological structure to the hallucinations? What does that topology look like?

Maybe! I'm not an AI researcher or mathematician, so I don't know if anyone has pursued this idea. The problem might be that any such structure is intractably complicated to describe within the limits of human understanding.

> How do you know that there are no discontinuous regions or that you are not stuck in a local minima?

Are you talking about f() getting stuck in some kind of bad region while generating text, or about the optimization process itself?

The answer is the same in both cases: we don't.

Regarding text generation, we've seen plenty of examples where specific prompts result in pathological output. Although I'm not sure if newer models have that problem.

Regarding optimization, there is absolutely no guarantee that we have found a global minimum. Some interesting research has been done on the "loss landscape" of these giant neural network models, and my understanding is that they are messy and complicated. Keep in mind that the training data is part of the loss function! Finding a local minimum on the training data might just result in overfitting.

catlifeonmars · on Feb 17, 2024

> Some interesting research has been done on the "loss landscape" of these giant neural network models, and my understanding is that they are messy and complicated.

Do you have any recommended reading for this? It sounds like a super interesting area of research.

nerdponx · on Feb 18, 2024

This is the one example I had in mind because of all the pretty pictures: https://arxiv.org/abs/1712.09913

Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, Tom Goldstein. 2018. "Visualizing the Loss Landscape of Neural Nets".

th0ma5 · on Feb 17, 2024

I mean, doesn't that sweep the problem I'm describing into the loss function definition?

nerdponx · on Feb 17, 2024

Sure. It might be impossible to actually set up a combination of model, training data, loss function / reward procedure, etc. that produces a perfect result. But who ever said it would be perfect?

orbital-decay · on Feb 17, 2024

Yes, in the end it just makes it produce better answers most of the time, it doesn't eliminate hallucinations. Not sure how you expect it to objectively tell the correctness, short of formulating the prompt as axioms and performing formal verification. Which is probably not what you want from it.

skeptrune · on Feb 16, 2024

This is soooo much more exciting than the "put 100M tokens in the context window" idea

bugglebeetle · on Feb 16, 2024

I would say the large context sizes with the ability to reliably cite said context is the best possible outcome.

amelius · on Feb 16, 2024

ELIZA was so much more exciting than the "put 100B artificial neurons on a GPU" idea.

;)

anon-sre-srm · on Feb 17, 2024

I'm concerned innovations in LLMs will silently replace human expertise in unverifiable and bitrot ways ultimately leading to widespread decline in mastery in many fields and possibly the loss of ability to teach, mentor, or maintain incomes of actual subject matter experts gradually to a point where no human SMEs can be found for critical technology or research necessary for organized civilization. Brave New World-Idiocracy with such incapable people in it. I'm betting climate change will also get us with hypercanes and famine affecting billions roughly at the same time.

philipswood · on Feb 17, 2024

On the dangers of reading and writing:

> ... And now, since you are the father of writing, your affection for it has made you describe its effects as the opposite of what they really are. In fact, it will introduce forgetfulness into the soul of those who learn it: they will not practice using their memory because they will put their trust in writing, which is external and depends on signs that belong to others, instead of trying to remember from the inside, completely on their own. You have not discovered a potion for remembering, but for reminding; you provide your students with the appearance of wisdom, not with its reality. Your invention will enable them to hear many things without being properly taught, and they will imagine that they have come to know much while for the most part they will know nothing. And they will be difficult to get along with, since they will merely appear to be wise instead of really being so.”

Note that the criticism is mostly spot on!

I spend some time in an internet forum where memory athletes hang out and you'd be surprised at what the human ability to remember can actually be.

This is from Phaedrus by Plato.

There are a few extra observations in the following text that nicely seems applicable to LLMs

https://conversational-leadership.net/myth-of-thamus-and-the...

reasonabl_human · on Feb 19, 2024

What forum are you referring to?

philipswood · on Feb 20, 2024

https://forum.artofmemory.com/

samatman · on Feb 17, 2024

That escalated quickly.

dmezzetti · on Feb 16, 2024

Very interesting approach.

For those interested in an alternate method that doesn't depend on a LLM, check out this article: https://neuml.hashnode.dev/build-rag-pipelines-with-txtai

Disclaimer: I'm the primary author of txtai.

dang · on Feb 16, 2024

Can you please stop posting so promotionally on HN? If you read https://news.ycombinator.com/newsguidelines.html, you'll see it's against the site guidelines.

dmezzetti · on Feb 16, 2024

You got it and I appreciate you asking kindly.

I will say though that hopefully you'll consider applying that policy equally to all. Because many VC-backed and large companies basically post press releases and they trend without issue.

I'm a single person open-source project. But it's your site, I'll respect your request and not post moving forward.

dang · on Feb 17, 2024

I certainly hope we apply things equally! But there are inevitably cases we miss because we don't see everything that gets posted to HN; we largely rely on users to point us to those.

dmezzetti · on Feb 18, 2024

I'm sure it's a tough challenge. And that's just keeping up with the people who are being open and honest about their product/project associations, never mind the others.

I appreciate all you do in keeping the site up and running along with the dedication to ensuring it has high-quality content.

mountainriver · on Feb 16, 2024

I found what he posted relevant and useful, not sure what the issue is

dang · on Feb 17, 2024

Oh yes, the GP comment was fine in isolation. The issue has to do with posting similar things too often. If an account is using HN primarily for promotion, that's not in the intended spirit of the site. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

bugglebeetle · on Feb 16, 2024

The shortcoming of most RAG-based approaches is the assumption that the question resembles the answer in a way that also jives with the embeddings model. Thus far, I’ve not seen strong evidence (or in my testing) that this is true or works well, but at least citation allows for better assessment. The problem seems to be that we don’t have a good feedback loop for ranking RAG retrieval, as we have for LLMs with things like DPO.

whakim · on Feb 16, 2024

100%. This is why RAG and "classical search" will converge for non-trivial use cases. The folks who are doing RAG well still rely on many tried-and-true tricks of the trade: combining semantic search with keyword-based search, using graphs, doing re-ranking, etc. etc. Yet most discussions of RAG on the internet seem to promise consistently awesome query output by just jamming together some embeddings and an LLM, which doesn't pan out in practice.

jerpint · on Feb 16, 2024

Looks like it’s just RAG? The paper is proposing an alternative to RAG due to its shortcomings

dmezzetti · on Feb 16, 2024

It's RAG and a method that identifies citations from the context.

leobg · on Feb 16, 2024

Are you affiliated with txtai?

dmezzetti · on Feb 16, 2024

Yes, updated with a disclaimer.

FinnKuhn · on Feb 16, 2024

profile description says creator of txtai...