
The 7B model specifically is not quite "ChatGPT-level" though, is it?


According to Meta's benchmarking [0] it is comparable on many metrics. I haven't used it myself, so I can't say whether that holds up in actual use.

[0]: https://arxiv.org/pdf/2302.13971.pdf


That's GPT-3, not ChatGPT.


I don't understand this topic well, but given the premise that GPT-3 and ChatGPT differ only in that ChatGPT adds RLHF (Reinforcement Learning from Human Feedback), and that LLaMA 7B is comparable to GPT-3 on a number of metrics, it would follow that fine-tuning LLaMA 7B with RLHF would produce something similar to ChatGPT. Is that correct?


You're likely right that applying RLHF (plus instruction fine-tuning) to LLaMA 7B would produce results similar to ChatGPT, but I think you're implying that would be feasible today.

RLHF requires a large amount of human feedback data and IIRC there's no open data set for that right now.
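
For what it's worth, the tooling side already exists; it's the preference data and reward model that are missing. A minimal sketch of the PPO loop, loosely following the trl quickstart, assuming a local LLaMA checkpoint (the path is a placeholder) and a stubbed-out reward:

    # Hedged sketch: PPO-based RLHF step with the `trl` library.
    # "path/to/llama-7b" and the reward value are placeholders.
    import torch
    from transformers import AutoTokenizer
    from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer
    from trl.core import respond_to_batch

    model_path = "path/to/llama-7b"  # hypothetical local checkpoint
    model = AutoModelForCausalLMWithValueHead.from_pretrained(model_path)
    ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    config = PPOConfig(batch_size=1, mini_batch_size=1)
    ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

    query_tensor = tokenizer.encode("Explain RLHF briefly.", return_tensors="pt")
    response_tensor = respond_to_batch(model, query_tensor)

    # The missing piece: a reward model trained on human preference data.
    reward = [torch.tensor(1.0)]  # placeholder score, not a real reward model
    stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)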


There's open-assistant.io, which is doing RLHF data collection directly in the open.


And they've already collected over 100,000 samples; IIRC ChatGPT was trained on something like 30,000 samples, so the open models should already be well positioned to succeed.


There are open datasets (see the chatllama harness project and its references). You can of course also cross train it using actual ChatGPT.
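
To be concrete about the cross-training idea: you'd harvest prompt/response pairs from the API and fine-tune LLaMA on them as supervised data. A minimal sketch, where the prompt list and output file are just placeholders:

    # Sketch: collect ChatGPT outputs as supervised fine-tuning data.
    # Prompts and filename are illustrative; set OPENAI_API_KEY first.
    import json
    import openai

    prompts = ["Explain quicksort.", "Summarize the French Revolution."]

    with open("distill_data.jsonl", "w") as f:
        for p in prompts:
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": p}],
            )
            f.write(json.dumps({
                "prompt": p,
                "response": resp["choices"][0]["message"]["content"],
            }) + "\n")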


Is there something I'm missing? ChatLlama doesn't reference any human feedback datasets.

> You can of course also cross train it using actual ChatGPT.

You mean train it on ChatGPT's output? That's against OpenAI's terms of service.


> You mean train it on ChatGPT's output? That's against OpenAI's terms of service.

Oh no, someone call the internet police.

I'm sure scraping tons and tons of images and web data to train DALL-E and GPT, then selling access to the result, was also against many licenses and terms of service, but OpenAI did it anyway.


None of these AIs were created ethically. At the very least we can make sure these huge models don't belong solely to monopolistic tech companies, and democratize their power.


You’re missing something. Both the SHP (https://huggingface.co/datasets/stanfordnlp/SHP) and OpenAssistant datasets are referenced.

And while the TOS violation might be the case, the project nevertheless has a mode that uses OpenAI in the fine-tuning steps.
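
If it helps, SHP is a regular Hugging Face dataset of human preference pairs, so it's easy to inspect; a quick look (field names as documented on the dataset card):

    # Peek at the SHP preference data (stanfordnlp/SHP on the Hub).
    from datasets import load_dataset

    shp = load_dataset("stanfordnlp/SHP", split="train")
    ex = shp[0]
    print(ex["history"])      # the original post / question
    print(ex["human_ref_A"])  # candidate reply A
    print(ex["human_ref_B"])  # candidate reply B
    print(ex["labels"])       # 1 if A was preferred, 0 if B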


I’m interested in this as well. Comparatively little attention has been paid to those 7B model results, but they look quite good against 175B GPT-3.

As for ChatGPT, that is GPT-3.5 (same 175B model, but with instruction fine-tuning), plus the RLHF.


GPT-3.5 likely differs from the original GPT-3 by more than instruction fine-tuning. For example, it was probably retrained under Chinchilla scaling laws [1], with a lot more data and maybe a somewhat smaller parameter count.

There are many variants of GPT-3 and GPT-3.5, and based on the performance numbers in Meta’s paper, it looks like they’re comparing against the very first version of GPT-3 from 2020. [2]
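
Back-of-envelope under Chinchilla's ~20-tokens-per-parameter rule of thumb, using the headline figures from [1] and [2]:

    # Rough arithmetic: Chinchilla [1] suggests ~20 training tokens per parameter.
    params = 175e9                 # GPT-3 parameter count [2]
    trained_tokens = 300e9         # tokens the 2020 GPT-3 was trained on [2]
    optimal_tokens = 20 * params   # ~3.5 trillion under the heuristic
    print(optimal_tokens / trained_tokens)  # ~11.7x short of compute-optimal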

[1] https://arxiv.org/abs/2203.15556

[2] https://arxiv.org/abs/2005.14165


There's no overhead introduced for the 'final' model inference, is there?


None of the Meta models are RLHF tuned, as far as I know.



