
Individual LLM requests are vanishingly small in terms of environmental impact; inference providers batch heavily, so each forward pass serves many requests at once. Furthermore, LLMs and diffusion models are not the only ML workload. While generative AI tickles investors, most of the ML actually being deployed is more mundane: recommendation systems, classifiers, and the like, much of it serving adtech purposes adversarial to users' interests. If LLMs and diffusion models were the only things companies used ML for, and efficiency gains from new hardware held steady, we'd still be at roughly the 2017 baseline for the environmental impact of data centers.
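
A toy model of why batching matters (the numbers below are made up for illustration, not measurements): most of the energy in a forward pass goes to streaming the model's weights through memory, and a batch of requests shares that cost.

    # Toy energy model: illustrative numbers only, not measurements.
    WEIGHT_STREAM_J = 3000.0    # assumed energy to stream weights once per forward pass
    PER_REQUEST_J = 5.0         # assumed per-request compute overhead

    def energy_per_request(batch_size: int) -> float:
        # The weight-streaming cost is shared across the whole batch.
        return WEIGHT_STREAM_J / batch_size + PER_REQUEST_J

    for b in (1, 8, 64):
        print(b, round(energy_per_request(b), 1))
    # 1 -> 3005.0, 8 -> 380.0, 64 -> 51.9 joules per request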

Likewise, I doubt that USENET warning was ever true beyond the first few years of the network's lifetime. Certainly, if everything was connected via dial-up, a single message could incur hundreds of dollars of cost once you added up the few seconds of line time it took to send it across the whole world. But that's accounting for a lot of Ma Bell markup. Most connections between sites and ISPs on USENET ran over private lines that were far faster than anything you could shove down copper phone wiring back then.



> Individual LLM requests are vanishingly small in terms of environmental impact;

The article uses open-source models to estimate cost, because those are the only models you can measure; the organizations running the closed models don't share that info. Here's what the article says:

> The largest of our text-generation cohort, Llama 3.1 405B, [...] needed 3,353 joules, or an estimated 6,706 joules total, for each response. That’s enough to carry a person about 400 feet on an e-bike or run the microwave for eight seconds.

I just looked at the last chat conversation I had with an LLM. I got nine responses, roughly the energy it takes to melt the cheese on a burrito if I'm in a rush (ignoring that I'd be turning the microwave on and off over the course of a few hours, which makes an awful burrito).
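
Rough check of that burrito math, taking the article's per-response figure and assuming a typical ~800 W microwave:

    # Back-of-envelope check (the 800 W microwave is an assumption).
    JOULES_PER_RESPONSE = 6_706     # article's estimate for Llama 3.1 405B
    MICROWAVE_WATTS = 800

    responses = 9
    total_joules = responses * JOULES_PER_RESPONSE        # 60,354 J
    microwave_seconds = total_joules / MICROWAVE_WATTS    # ~75 s
    print(total_joules, round(microwave_seconds))         # 60354 75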

How many burritos is that if you multiply it by the number of people who have a similar chat with an LLM every day?

Now that I'm hungry, I just want to agree that LLMs and other client-facing models aren't the only ML workload, and aren't even the most relevant ones. As you say, adtech has been using classifiers, vector engines, etc. since (anecdotally) as early as 2007. Investing algorithms are another huge one.

Regarding your USENET point, yeah. I remember in 2000 some famous Linux guy freaking out because members of Linuxcare's sales team had a 5-line signature in their emails instead of the RFC-recommended 3 lines, on the grounds that it was wasting the internet or something. It's hard for me to imagine what things were like back then.


If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?

Are you saying all of that new capacity is needed to power non-LLM stuff like classifiers, adtech, etc? That seems unlikely.

Had you said that inference costs are tiny compared to the upfront cost of training the base model, I might have believed it. But even that isn't accurate -- there's a big upfront energy cost to train a model, but once it becomes popular like GPT-4, the inference energy cost over time is dramatically higher than the upfront training cost.
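
For a sense of scale (every number here is an assumption for illustration; none of these figures are public): if training a frontier model took on the order of 50 GWh, and the deployed service answered around a billion requests a day at roughly the article's ~7 kJ per response, cumulative inference energy would pass the training cost within about a month.

    # All inputs are illustrative assumptions, not disclosed figures.
    TRAIN_GWH = 50
    TRAIN_J = TRAIN_GWH * 1e9 * 3600        # 1.8e14 J
    PER_RESPONSE_J = 7_000                  # ballpark of the article's 405B figure
    RESPONSES_PER_DAY = 1e9

    days_to_match_training = TRAIN_J / (PER_RESPONSE_J * RESPONSES_PER_DAY)
    print(round(days_to_match_training))    # ~26 days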

You mentioned batch computing as well, but how does that fit into the picture? I don't see how batching would reduce energy use. Does "doing lots of work at once" somehow reduce the total work / total energy expended?


> If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?

Well, partly because they (all but X, IIRC) have commitments to shift to carbon-neutral energy.

But also, from the article:

> ChatGPT is now estimated to be the fifth-most visited website in the world

That's ChatGPT today. They're looking ahead to 100x-ing (or 1,000,000x-ing) the usage as AI replaces more and more existing work.

I can run Llama 3 on my laptop, and we can measure the energy usage of my laptop--it maxes out at around 0.1 toasters. o3 is presumably a bit more energy intensive, but the reason it's using a lot of power is the >100MM daily users, not that a single user uses a lot of energy for a simple chat.
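
As a sanity check on the laptop number (the wattage and generation time below are assumptions, and a laptop-sized Llama is much smaller than the 405B model in the article):

    # Illustrative estimate of local inference energy; both inputs are assumptions.
    LAPTOP_WATTS = 100            # ~0.1 of a ~1 kW toaster, under full load
    SECONDS_PER_RESPONSE = 30     # assumed generation time on a laptop

    joules_per_response = LAPTOP_WATTS * SECONDS_PER_RESPONSE
    print(joules_per_response)    # 3000 J -- same order of magnitude as the article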


> not that a single user uses a lot of energy for a simple chat.

This seems like a classic tragedy of the commons, no? An individual has a minor impact, but everyone rationally switching to LLM tools will likely have a massive collective impact.


>If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?

Something to temper this: lots of these AI datacenter projects are being cancelled or put on hiatus because the demand isn't there.

But if someone wants to build a nuke reactor to power their datacenter, awesome. No downsides, right? We're concerned about energy consumption only because of its carbon footprint, and if the power is nuclear, that problem is already solved.


> Something to temper this: lots of these AI datacenter projects are being cancelled or put on hiatus because the demand isn't there.

Wait, any sources for that? Because everywhere I go, there seems to be this hype for more AI data centers. Some fresh air would be nice.


https://www.datacenterdynamics.com/en/news/microsoft-cancels...

AI seems like it is speedrunning all the phases of the hype cycle.

"TD Cowen analysts Michael Elias, Cooper Belanger, and Gregory Williams wrote in the latest research note: “We continue to believe the lease cancellations and deferrals of capacity points to data center oversupply relative to its current demand forecast.”"


Because training costs are sky-high, and handling an individual request still uses a decent amount of energy, even if it isn't as horrifying as training. Plus the number of requests, and the amount of content in them, is going up with things like vibe coding.

If you want to know more about energy consumption, see this two-part series that goes into tons of nitty-gritty details: https://blog.giovanh.com/blog/2024/08/18/is-ai-eating-all-th...


The article says 80-90% of data center usage for AI is for inference, and it comes from a more reputable source than the random blog.


The blog is citing specific studies for its claims. Is there an issue with those studies?


It's almost a year old at this point, so at best it's horribly out of date.



