There was a lot of detail and data in here, but it's not very useful to me because all of the comparisons are to things I have no experience with.
There's really only one thing I care about: How does this compare to GPT-4?
I have no use for models that aren't at that level. This almost certainly isn't there yet, but from the data presented it's hard to tell how close or far off it is.
None of the 3B and 7B models are at ChatGPT’s level, let alone GPT-4. The 13B models start doing really interesting things, but you don’t get near ChatGPT results until you move up to the best 30B and 65B models, which require beefier hardware. Nothing out there right now approximates GPT-4.
The big story here for me is that the difference in training set is what makes the difference in quality. There is no secret sauce: the open source architectures do well, provided you give them a large and diverse enough training set. That means it's just a matter of pooling resources to train really capable open source models, which makes what RedPajama is doing, compiling the best open dataset, very important for the future of high quality open source LLMs.
If you want to play around with this yourself you can install oobabooga and figure out what model fits your hardware from the locallama reddit wiki. The llama.cpp 7B and 13B models can be run on CPU if you have enough RAM. I’ve had lots of fun talking to 7B and 13B alpaca and vicuna models running locally.
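If it helps, here's a minimal sketch of what running one of those models on CPU looks like through the llama-cpp-python bindings rather than the oobabooga UI. The model filename is just a placeholder; point it at whatever quantized file fits your hardware:

    # Minimal CPU inference sketch (pip install llama-cpp-python).
    # The model path is a placeholder for whatever quantized model you downloaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/vicuna-13b.q4_0.bin",  # placeholder filename
        n_ctx=2048,     # context window size
        n_threads=8,    # roughly match your physical core count
    )

    out = llm("### Human: Tell me a joke.\n### Assistant:", max_tokens=128)
    print(out["choices"][0]["text"])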
LLaVA 13B is a great multimodal model that has first class support in oobabooga too.
It's really fun to enable both the whisper extension and the TTS extension and have two-way voice chats with your computer while being able to send it pictures as well. Truly mind bending.
Quantized 30B models run at acceptable speeds on decent hardware and are pretty capable. My understanding is that the open source community is iterating extremely fast on the small model sizes, getting the most out of them by pushing data quality higher and higher, and then plans to scale up to at least 30B parameter models.
I really can't wait to see the results of that process. In the end you're going to have a 30B model that's totally uncensored and is a mix of Wizard + Vicuna. It's going to be a veryyyy capable model.
I usually even prefer GPT-3.5, as it's faster and much cheaper. GPT-4 is great for hardcore logical reasoning, but when I want something that knows how to turn my lights on and change the TV channel, it's overkill.
It's not even that bad. Core i7-12700K with DDR5 gives me ~1 word per second on llama-30b - that is fast enough for real-time chat, with some patience. And things are even better on M1/M2 Macs.
The critical factor seems to be the ability to fit the whole model in RAM (the --mlock option in oobabooga). With Apple's RAM prices, most M1/M2 owners probably don't have the 32 GB of RAM required to fit a 4-bit 30B model.
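For what it's worth, here's a rough back-of-envelope sketch of why 32 GB is the comfortable threshold, plus how the mlock setting looks if you drive llama.cpp through the llama-cpp-python bindings instead of oobabooga (the model path is a placeholder):

    # Rough sizing: 4-bit weights are ~0.5 bytes per parameter.
    params = 30e9
    weights_gb = params * 0.5 / 1e9   # ~15 GB of raw weights
    # Quantization scales, the KV cache and OS overhead push real usage past 20 GB,
    # which is why 32 GB machines are comfortable and 16 GB machines are not.

    # use_mlock corresponds to llama.cpp's --mlock: lock the mapped model in RAM
    # so the OS never swaps it out (only useful if the model actually fits).
    from llama_cpp import Llama
    llm = Llama(
        model_path="./models/llama-30b.q4_0.bin",  # placeholder path
        use_mlock=True,
        n_threads=12,
    )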
The bit I liked best was the response examples. Look at those. Clearly not as good as GPT-4, but good enough that I feel this would be a contender in any scenario where you care about privacy or data provenance.
For example a therapist, a search bot for your diary, or a company intranet help bot. Anything where the prompt contains something you don’t want to send to a third party.
Assume that a truly competitive model from the open source world is still a ways off. These teams and their infrastructure are still in their early days, while OpenAI is at the fine-tuning and polishing stage. The fact that these open teams have something in the same universe in terms of functionality this fast is pretty amazing... but it will take time before there's an artifact that's a strong competitor.
The pace of the progress the open source models are making is pretty astonishing. The smaller model sizes are cheap to train so there is a lot of iteration by many different teams. People are also combining proven approaches together. Then they're going to nail it and scale it. Will be very interesting to see where we are in 3 months time.
There's a nice chart in the leaked Google memos that compares some of the open models against ChatGPT and Bard so you can get a sense where these models land by comparing them to these.
Open source LLMs might do that, but I very much doubt that those models will be small enough to run even on high-end consumer hardware (like say RTX 3090 or 4090).
The way they'll do it, if they do it at all, is to find a way to squeeze the capability into smaller models and get much faster at executing them. That's where the market forces are.
That's exactly the core of the email that leaked out of Google: it's proving far better to be able to have lots of people iterating quickly (which necessarily means broad access to the necessary hardware) than to rely on massive models and bespoke hardware.
I'd anticipate something along the lines of a breakthrough in guided model shrinking, or some trick in partial model application that radically reduces the number of calculations needed. Otherwise, whatever breakthrough does happen is less likely to come out of the open source LLM community.
> it's proving far better to be able to have lots of people iterating quickly (which necessarily means broad access to the necessary hardware) than to rely on massive models and bespoke hardware
Very true, but can't Google just wait, take the open-source-LLM community's findings, and then quickly update their models on their huge clusters? It's not like they'll lose the top position; they already have it.
Yes and no. Some of the optimisation techniques that are being researched at the moment use the output of larger models to fine-tune smaller ones, and that sort of improvement can obviously only be one-way. Same with quantising a model beyond the point where the network is trainable. But anything that helps smaller models run faster without appealing to properties of a bigger model that has to already exist? Absolutely yes.
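To make the first point concrete, that kind of one-way improvement looks roughly like the sketch below, where bigger_model_generate is a hypothetical stand-in for calling the large teacher model's API and the smaller model is later fine-tuned on the resulting file:

    # Illustrative distillation-style data generation (names are hypothetical).
    import json

    def bigger_model_generate(prompt: str) -> str:
        # Stand-in for an API call to the large teacher model you want to imitate.
        return "teacher model response for: " + prompt

    prompts = ["Explain quicksort simply.", "Summarise this email: ..."]

    with open("teacher_outputs.jsonl", "w") as f:
        for p in prompts:
            f.write(json.dumps({"instruction": p,
                                "output": bigger_model_generate(p)}) + "\n")

    # The resulting file becomes supervised fine-tuning data for the smaller model,
    # so the dependency only runs one way: the big model has to exist first.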
I’ll admit that this comment struck me as strange too (and the thought that it might be generated also crossed my mind), but I try to keep an open mind, and after thinking it over, I think the point about white supremacy is worth considering. Note that the writer didn’t say “you’re a racist” but rather framed it neutrally: “This can be seen as a form of white supremacy, as it can exclude voices and perspectives from people who may not have access”.
The question this raises is, who has access to state-of-the-art AI tech, and who does not, and might these groups have a racial dimension? Objectively, surely they do? Don’t lightweight models therefore serve an important democratizing purpose, in that they make this tech accessible to (for instance) people in developing nations who are overwhelmingly black and brown?
Oh I definitely agree that there are multiple levels of AI research that are valuable. Huge supporter of open source, and not meaning to talk down to anyone working on AI projects.
It's just that at the moment I'm finding the open source LLM community hard to contextualize from an outside perspective. Maybe it's because things are moving so fast (probably a good thing).
I just know that personally, I'm not going to explore any projects until I know they're near or exceeding GPT-4's performance level. And it's hard to develop an interest in anything other than GPT-4 when comparison is so tough to begin with.
I'd suggest reading the recently leaked Google memo for some context about why open source LLMs are important (and are disruptive from the perspective of a large company). It gives a good insight into why closed source models like GPT-4 might be overtaken by open source even if they can't directly compete at the moment.
Typical reasons are highly specialised models that are cheap and fast to train, lack of censorship, lack of API and usage restrictions, lightweight variants and so on. The reason there's a lot of excitement right now is indeed how fast the space is moving.
This made me chuckle. Anyone who thinks Sun has any interest in anything besides enriching himself is delusional. Pro pump-and-dumper, that's all. He is an artist at the top of his game, I'll give him that!
VLC is surprisingly terrible for audio. It struggles to read tags and almost never auto-sorts properly, it cuts off the end of tracks, there's no ability to play without gaps, etc.
I use VLC as my primary audio player. While I can't speak to the sorting and tagging issues, I can't say I've ever heard it cut off the ends of tracks or insert gaps between them. Maybe those are related to your sound buffer settings.
Gapless playback won't be available until VLC 4. Audio getting cut off at the end has been a known issue for 10 years, and will apparently be fixed in v4, as well. I realize that I could be running the nightly to fix the audio issue, but then I'm just trading for other existing v4 bugs -- and these fixes were only implemented very recently into the nightly, so for me at least, VLC has still been a very subpar audio player.
Also check out the open source work of Helium and FreedomFi, who are working to help individuals easily deploy 5G/LTE offload for major carriers and get paid for data usage.
Please don't do this. Not only is this a violation of the agreement between you and the ISP, but consumer internet pipes were never designed for this type of service. This is only feasible if the internet coming to your house is a business line, which it won't be if you live in a residential area.
Does screen-sharing and letting someone control your screen count as "subletting"? Does inviting someone to join your local Minecraft game (which internally starts a server and exposes it to the internet) count as subletting?
If your ISP sells you a service of X Mbps (and, if they want to be more precise, some number of packets per second and some total data transferred in a month), you should be able to use it for any purpose you want. The purpose or content of a packet doesn't suddenly make it take more network resources.
If ISPs' networks suddenly can't cope because people start using what they've paid for, then that's on them, and they need to price it accordingly and market it more honestly.
It's not, though. What you say may be correct to some degree, but it has little to do with the comment you responded to and isn't much more than an excuse for why ISPs take advantage of their (largely) monopoly powers in the residential internet space.
The reason you don't get a consistent speed and a guarantee as a residential ISP client isn't because of any of the reasons you mentioned. It's because ISPs can force you to pay for their service at whatever price they charge and no matter how bad it is.
The internet could be out for 8 hours a day, you could get sub-dialup speeds consistently, you might simply be unable to connect to some services for some inexplicable reason, or your packet loss could be so bad that you get kicked from services and pages constantly. Guess what, you are still going to pay for it, because what's the alternative? No internet at all, or satellite internet that goes out whenever a cloud is in the sky and allots you 1 GB a month at 56 kbps for $200/month.
No matter what a residential ISP does, they will still get their money, and even if your service is complete trash you'll grin and bear it lest you end up without access at all. That's the reason we don't get consistent service with residential plans, and it won't change until something happens to break us out of the monopolistic, regulatory-captured environment we are in.
Yes, the US residential market would definitely benefit from more competition.
The model we have here is to separate the infrastructure from the service, such that infrastructure providers lay fibre to homes and businesses and then sell wholesale to ISPs, who sell service over the common fibre to consumers. It’s definitely better than the US model, as there is competition between ISPs and they therefore have reason to apply pressure to get problems fixed. Infrastructure upgrades are still painfully slow, as there is little competitive reason to upgrade the fibre (or fibre/copper VDSL in many places): all the ISPs have little choice but to use the infra provider for a given area.
Ideally you want more infra, but the cost of building out fibre networks is high, particularly if you’re only selling to 1-in-2 or 1-in-3 properties due to competition. That’s before you get to the politics and legals and lobbying you need to do to succeed in the US. I’m hopeful 5G will compete with broadband and give the providers the kick up the ass they need.
> ISPs should advertise more honestly, but if they started talking about contention ratios in their advertising the average consumer would rapidly lose interest.
In short, they have to lie to get business? Why is that even legal?
I'm surprised by how downvoted you are. It seems kind of obvious to me that it's a violation, since we have a history here of people reselling their bandwidth in blocks of flats.
People feel entitled to their internet access without consideration of infrastructure costs, especially in North America. They see symmetric multi-gigabit connections commonly offered in countries such as Japan, South Korea and Singapore, and wonder why it can’t be done in the US.
Population density is why: Singapore has ~8,400 people per square kilometre, while the US has a scant 36 per square kilometre. That’s two orders of magnitude of difference. Everything else follows from this (high prices, single provider, spotty last mile service, etc.)
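Just to make the scale concrete, here's the quick arithmetic behind that gap, using the density figures quoted above:

    # Quick check on the density gap quoted above
    import math
    singapore = 8400   # people per square km
    usa = 36           # people per square km
    ratio = singapore / usa
    print(round(ratio), round(math.log10(ratio), 1))   # ~233x, a bit over 2 orders of magnitude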
Population density is often used as an excuse for this, but it's a weak excuse.
Finland has a lower population density than the US and manages to solve these issues; NYC is denser than Singapore and still has the same problems with internet access and pricing as the less dense areas of the US.
No, it's not about population density; the key difference is the lack of competition.
> People feel entitled to their internet access without consideration of infrastructure costs, especially in North America.
What? People are paying the infrastructure costs through their internet bill. If the ISP is pricing it wrong or is mis-representing what they're selling then it's the ISP's fault and not the customers'. The ISP is free to change prices and/or change their marketing to represent the true nature and capability of the service they're selling.
This is true, but even in US locations with high population density, we still don't have symmetric multi-gig connections. Instead, we have 1 or 2 choices for a wired ISP: generally cable/DOCSIS and DSL. And around here, the PSTN copper is literally rotting on the poles, so DSL is out. Fiber is supposed to be installed "soon" (I estimate 1 to 2 years).
Ha, I thought you were criticizing Coinbase for now being denominated in a long-term worthless US dollar. Funny what happens once you realize that crypto is real money and fiat is imaginary...
> Ha, I thought you were criticizing Coinbase for now being denominated in a long-term worthless US dollar. Funny what happens once you realize that crypto is real money and fiat is imaginary...
On a long enough timescale all money becomes worthless. The currency of the universe, for instance, is entropy. Eventually the universe will succumb to heat death, at which point there will be no currency left to spend.
Entropy is the only true currency. All other currency is imaginary.