
The 7B model specifically is not quite "ChatGPT-level" though, is it?


According to Meta's benchmarking [0] it is comparable on many metrics. I haven't used it myself, so I can't say whether that holds up in actual use.

[0]: https://arxiv.org/pdf/2302.13971.pdf


That's GPT-3, not ChatGPT.


I don't understand this topic well, but given the premise that GPT-3 and ChatGPT differ only in that ChatGPT adds RLHF (Reinforcement Learning from Human Feedback), and that LLaMA 7B is comparable to GPT-3 on a number of metrics, it would follow that fine-tuning LLaMA 7B with RLHF would produce something similar to ChatGPT. Is that correct?


You're likely right that applying RLHF (plus instruction fine-tuning) to LLaMA 7B would produce results similar to ChatGPT, but I think you're implying that would be feasible today.

RLHF requires a large amount of human feedback data and IIRC there's no open data set for that right now.
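
For what it's worth, the tooling side already exists; it's the preference data and reward model that are missing. A minimal sketch of the PPO loop, loosely following the trl quickstart, assuming a local LLaMA checkpoint (the path is a placeholder) and a stubbed-out reward:

    # Hedged sketch: PPO-based RLHF step with the `trl` library.
    # "path/to/llama-7b" and the reward value are placeholders.
    import torch
    from transformers import AutoTokenizer
    from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer
    from trl.core import respond_to_batch

    model_path = "path/to/llama-7b"  # hypothetical local checkpoint
    model = AutoModelForCausalLMWithValueHead.from_pretrained(model_path)
    ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    config = PPOConfig(batch_size=1, mini_batch_size=1)
    ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

    query_tensor = tokenizer.encode("Explain RLHF briefly.", return_tensors="pt")
    response_tensor = respond_to_batch(model, query_tensor)

    # The missing piece: a reward model trained on human preference data.
    reward = [torch.tensor(1.0)]  # placeholder score, not a real reward model
    stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)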


There's open-assistant.io, which is doing RLHF data collection directly in the open.


And they've already collected over 100,000 samples; IIRC ChatGPT was trained on something like 30,000 samples, so the open models should already be well positioned to succeed.


There are open datasets (see the chatllama harness project and its references). You can of course also cross train it using actual ChatGPT.
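
To be concrete about the cross-training idea: you'd harvest prompt/response pairs from the API and fine-tune LLaMA on them as supervised data. A minimal sketch, where the prompt list and output file are just placeholders:

    # Sketch: collect ChatGPT outputs as supervised fine-tuning data.
    # Prompts and filename are illustrative; set OPENAI_API_KEY first.
    import json
    import openai

    prompts = ["Explain quicksort.", "Summarize the French Revolution."]

    with open("distill_data.jsonl", "w") as f:
        for p in prompts:
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": p}],
            )
            f.write(json.dumps({
                "prompt": p,
                "response": resp["choices"][0]["message"]["content"],
            }) + "\n")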


Is there something I'm missing? ChatLlama doesn't reference any human feedback datasets.

> You can of course also cross train it using actual ChatGPT.

You mean train it on ChatGPT's output? That's against OpenAI's terms of service.


> You mean train it on ChatGPT's output? That's against OpenAI's terms of service.

Oh no, someone call the internet police.

I'm sure scraping tons and tons of images and web data to train DALL-E and GPT, then selling access to the result, was also against many licenses and terms of service, but OpenAI did it anyway.


None of these AIs were created ethically. At the very least we can make sure these huge models don't belong solely to monopolistic tech companies, and democratize their power.


You’re missing something. Both the SHP (https://huggingface.co/datasets/stanfordnlp/SHP) and OpenAssistant datasets are referenced.

And while the TOS violation might be the case, the project nevertheless has a mode that uses OpenAI in the fine-tuning steps.
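
If it helps, SHP is a regular Hugging Face dataset of human preference pairs, so it's easy to inspect; a quick look (field names as documented on the dataset card):

    # Peek at the SHP preference data (stanfordnlp/SHP on the Hub).
    from datasets import load_dataset

    shp = load_dataset("stanfordnlp/SHP", split="train")
    ex = shp[0]
    print(ex["history"])      # the original post / question
    print(ex["human_ref_A"])  # candidate reply A
    print(ex["human_ref_B"])  # candidate reply B
    print(ex["labels"])       # 1 if A was preferred, 0 if B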


I’m interested in this as well. Comparatively little attention has been paid to those 7B model results, but they look quite good against 175B GPT-3.

As for ChatGPT, that is GPT-3.5 (same 175B model, but with instruction fine-tuning), plus the RLHF.


GPT-3.5 likely differs from the original GPT-3 by more than instruction fine-tuning. For example, it was probably retrained under Chinchilla scaling laws [1], with a lot more data and maybe a somewhat smaller parameter count.

There are many variants of GPT-3 and GPT-3.5, and based on the performance numbers in Meta’s paper, it looks like they’re comparing against the very first version of GPT-3 from 2020. [2]
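
Back-of-envelope under Chinchilla's ~20-tokens-per-parameter rule of thumb, using the headline figures from [1] and [2]:

    # Rough arithmetic: Chinchilla [1] suggests ~20 training tokens per parameter.
    params = 175e9                 # GPT-3 parameter count [2]
    trained_tokens = 300e9         # tokens the 2020 GPT-3 was trained on [2]
    optimal_tokens = 20 * params   # ~3.5 trillion under the heuristic
    print(optimal_tokens / trained_tokens)  # ~11.7x short of compute-optimal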

[1] https://arxiv.org/abs/2203.15556

[2] https://arxiv.org/abs/2005.14165


There's no overhead introduced for the 'final' model inference, is there?


None of the Meta models are RLHF tuned, as far as I know.



