Measuring the environmental impact of AI inference

156 points by ksec 3 days ago

Research paper: https://services.google.com/fh/files/misc/measuring_the_envi...

Google blog post: https://cloud.google.com/blog/products/infrastructure/measur...

ImaCake 3 days ago

This whole argument would be dead in the water if society had de-carbonised 20 years ago instead of now. This stinks of the personal responsibility fallacy of carbon emissions when the real answer is to do the boring job of making energy production cleaner and doing a better job at moving people around.

mritterhoff 2 days ago

Well said. Moralizing energy consumption is inefficient and no way to run a market. It'd be better to pass regulation that internalized the externalities in the price of electricity so that it captured the societal costs of emissions.
tengbretson 2 days ago

This should be seen as the opportunity of a lifetime. "Invest in infrastructure to power this awesome new technology" is obviously a more compelling story than "Invest in all new infrastructure—the plan is to use it less"
pj_mukh 2 days ago

Bingo.
Playing whack a mole with individual behavior while the elephant in the room is energy production and transportation remains asinine as always.

jjani 3 days ago

Here's what happened:

1. Google rolled out AI summaries on all of their search queries, through some very tiny model
2. Given worldwide search volume, that model now represents more than 50% of all queries if you throw it on a big heap with "intentional" LLM usage
3. Google gets to claim "the median is now 33x lower!", as the median is now that tiny model giving summaries nobody asked for

It's very concerning that this marketing puff piece is being eaten up by HN of all places as evidenced by the other thread.

Google is basing this all on "median" because there are orders of magnitude of difference between strong models (what most people think of when you talk AI) and tiny models, which Google uses "most" by virtue of running them for every single Google search to produce the summaries. So the "median" will be whatever tiny model they use for those summaries. Never mind that Gemini 2.5 Pro, which is what everyone here would actually be using, may well consume >100x as much.

It's absurdly misleading and rather obvious, but it feels like most are very eager to latch on to this so they can tell themselves their usage and work (for the many here in AI or at Google) is all peachy. I've been reading this place for years and have never before seen such uncritical adoption of an obvious PR piece detached from reality.

raincole 3 days ago

It's not what the report says.
> It's very concerning that this marketing puff piece is being eaten up by HN of all places as evidenced by the other thread.
It's very concerning that you can just make shit up on HN and be the top comment as long as it's to bash Google.
> Never mind that Gemini 2.5 Pro, which is what everyone here would actually be using, may well consume >100x as much
Yes, exactly, never mind that. The report is to compare against a data point from May 2024, before Gemini 2.5 Pro became a thing.
- latexr 2 days ago
  
  > make shit up on HN and be the top comment as long as it's to bash Google.
  I don’t think that’s fair. Same would’ve happened if it were Microsoft, or Apple, or Amazon. By now we’re all used to (and tired of) these tech giants lying to us and being generally shitty. Additionally, for decades we haven’t been able to trust reports from big companies which say “everything is fine, really” when they publish it themselves, about themselves, contradicting the general wisdom of something bad they’ve been doing. Put those together and you have the perfect combination; we’re primed to believe they’re trying to deceive us again, because that’s what happens most of the time. It has nothing to do with it being Google, they just happened to be the target this time.
- ksec 2 days ago
  
  >It's very concerning that you can just make shit up on HN and be the top comment as long as it's to bash Google.
  Off topic. I wanted to say, somewhat counterintuitively, I often upvote / submit things I disagree with and don't downvote them as long as sub comments offer a good counter argument or explanation.
  Sometimes being top just meant that is what most people are thinking, and it being wrong and corrected is precisely why I upvote it and wish it stayed on top so others can learn.
mgraczyk 3 days ago

As others have pointed out, this is false. Google has made their models and hardware more efficient, you can read the linked report. Most of the efficiency comes from quantization, MoE, new attention techniques, and distillation (making smaller models useable in place of bigger models)
- jjani 2 days ago
  
  - The report doesn't name any Gemini models at all, only competitors'. Wonder why that is? If the models got so much more efficient, they'd be eager to show this.
  - The report doesn't name any averages (means), only medians. Why oh why would they be doing this, when all other marketing pieces always use the average because outside of HN 99% of Joes on the street have no idea what a median is/how it differs from the mean? The average is much more relevant here when "measuring the environmental impact of AI inference".
  - The report doesn't define what any of the terms "Gemini Apps", "the Gemini AI assistant" or "Gemini Apps text prompt" concretely mean
  
  jsnell 2 days ago
  
  The report also doesn't define what the word "AI" means. What are they trying to hide?!
  In reality, we know what Google means by the term "Gemini Apps", because it's a term they've had to define for e.g. their privacy policies[0].
  > The Gemini web app available through gemini.google.com and browser sidebars
  > The Gemini mobile apps, which include:
  > The Gemini app, including as your mobile assistant, on Android. Note that Gemini is hosted by the Google app, even if you download the Gemini app.
  > The Gemini app on iOS
  > Gemini in the Google Messages app in specific locations
  > The Gemini in Chrome feature. Learn more about availability.
  That established definition does not include AI summaries (actually AI Overviews) on search like you claimed. And it's something where Google probably is going to be careful -- the "Gemini Apps" name is awkward, but they need a name that distinguishes these use cases from other AI use cases with different data boundaries / policies / controls.
  If the report was talking about "Gemini apps", your objection might make sense.
  [0] https://support.google.com/gemini/answer/13594961?hl=en
  
  jjani 2 days ago
  
  It's very strange that we'd have to dive into their privacy policy to get a clear definition of it, but good spot.
  The rest stands though - no models, no averages. User tovej below put it better than I did:
  > The median does not move if the upper tail shifts, it only moves if the median moves.
  > The fact that they do not report the mean is concerning. The mean captures the entire distribution and could actually be used to calculate the expected value of energy used.
  > The median only tells you which point separates the upper half from the lower half, if you don't know anything else about the distribution you cannot use it for any kind of analysis
  49% of queries could be costing 1000x that median. Stats 101 combined with a sliver of critical reading reveals this report isn't worth the bytes it's taking up.
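
The mean-versus-median point above can be checked with a few lines of Python. This is only a sketch with made-up numbers (the report does not publish a per-prompt distribution): 51 cheap prompts and 49 heavy ones, with the heavy cost as the only thing that changes. The helper name `make_prompts` and the cost values are hypothetical.

```python
from statistics import mean, median

def make_prompts(heavy_cost):
    """Return 100 hypothetical prompt costs: 51 cheap ones and 49 heavy ones."""
    return [1.0] * 51 + [heavy_cost] * 49

# Same lower half, very different upper tail (arbitrary energy units, assumed).
mild = make_prompts(heavy_cost=2.0)
extreme = make_prompts(heavy_cost=1000.0)

for label, prompts in (("mild tail", mild), ("extreme tail", extreme)):
    print(f"{label}: median={median(prompts):.2f} mean={mean(prompts):.2f}")
```

In both cases the median is 1.0, while the mean moves from 1.49 to 490.51. It does not show what Google's actual distribution looks like, only that a median alone cannot rule out a heavy tail.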
  
  scott_w 2 days ago
  
  To be fair, the report explains their reasoning: they state the mean is too sensitive to outliers.
  Now, I do agree it would have been nice to demonstrate this, however it could be genuine.
  
  jjani 2 days ago
  
  That's a complete cop out. They didn't give the data to back this up.
  
  scott_w 2 days ago
  
  They definitely should have shown an example, or referenced something else that backs up their claim. I think it was you who made the good point that, when it comes to data usage, the mean may well give you more meaningful information because of the outliers!
  I can see the median being useful for answering what the cost of one more server/agent/whatever would be, but that’s not what this paper is asking.
- oulipo2 2 days ago
  
  sure, but the issue is if you make the model 30x more efficient, but you use it 300x more often (mostly for stuff nobody wants), it's still a net loss
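
The arithmetic behind that objection, as a tiny sketch. The 30x and 300x figures are the commenter's illustrative numbers, not measurements, and `total_energy_ratio` is a hypothetical helper.

```python
def total_energy_ratio(efficiency_gain, usage_growth):
    """Total energy after the change, relative to before.

    efficiency_gain: factor by which energy per prompt drops (30 means 30x less).
    usage_growth: factor by which the number of prompts grows.
    """
    return usage_growth / efficiency_gain

ratio = total_energy_ratio(efficiency_gain=30, usage_growth=300)
print(f"total energy is {ratio:.0f}x the original")  # 300 / 30 = 10
```

Total energy only falls when usage grows by less than the efficiency gain; at equal factors the ratio is exactly 1.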
  
  mgraczyk 2 days ago
  
  Would you say that computers are less efficient now than they were in the 90s because they are more widely used?
  
  jononor 2 days ago
  
  Not less efficient. But the impact on resource usage is still higher. Of course the impact in terms of positive effects is also higher. So the cost/benefit may also have gone up.
shwaj 3 days ago

Are you sure? It wouldn’t shock me, but they specifically say “Gemini Apps”. I wasn’t familiar with the term, but a web search indicated that it has a specific meaning, and it doesn’t seem to me like web search AI summaries would be covered by it. Am I missing something?
user568439 3 days ago

"It's very concerning that this marketing puff piece is being eaten up by HN of all places as evidenced by the other thread."
It's very concerning that you claim this without previously fully reading and understanding Google's publication...
tobr 3 days ago

I’ve dramatically reduced my median calories per meal, by scheduling eight new meals a day, each consisting of one lettuce leaf.
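The lettuce joke works as arithmetic too. A minimal sketch, with all calorie values assumed for illustration:

```python
from statistics import median

# Hypothetical numbers: three regular meals, then eight one-leaf "meals" added.
regular_meals = [600, 700, 800]   # kcal per meal, assumed
lettuce_meals = [1] * 8           # kcal per lettuce leaf, assumed

before = regular_meals
after = regular_meals + lettuce_meals

median_before, median_after = median(before), median(after)
total_before, total_after = sum(before), sum(after)

print(f"median kcal per meal: {median_before} -> {median_after}")
print(f"total kcal per day:   {total_before} -> {total_after}")
```

The median drops from 700 to 1 while the daily total rises from 2100 to 2108.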
jsnell 2 days ago

I know there's a lot of rebuttals to this statement already, but I think there's a simpler way of showing it is incorrect:
Figure 2 in the paper shows the LMArena score of whatever model is used for "median" Gemini query. That score is consistent with Gemini Flash (probably 2.0, given the numbers are from May), not a "tiny model" used for summaries nobody is asking for.
RajT88 3 days ago

Big tech seems all about the fluff.
But, wasn't it always so?
Wasn't it always so in business of all kinds?
Why should we expect anything different? We should have been skeptical all along.
- camillomiller 3 days ago
  
  I’ve been covering tech for 20 years. No, it wasn’t always like that. There was a sincere mutual respect between the companies and the media industry that I don’t see anymore. Both sides have their faults, but you know it’s not media that hyperscaled and created gazillionaires by the score. Also, software is way more bendable to the emperors’ whims, and Google has become particularly hypocritical in the way it publicly represents itself.
  
  rbinv 2 days ago
  
  Agreed. Big tech is trying to become the media industry.
jonas21 3 days ago

What exactly are you basing this assertion on (other than your feelings)? Are you accusing Google of lying when they say in the technical report [1]:
> This impact results from: A 33x reduction in per-prompt energy consumption driven by software efficiencies—including a 23x reduction from model improvements, and a 1.4x reduction from improved machine utilization.
followed by a list of specific improvements they've made?
[1] https://services.google.com/fh/files/misc/measuring_the_envi...
- esperent 3 days ago
  
  Unless marketing blogs from any company specifically say what model they are talking about, we should always assume they're hiding/conflating/mislabeling/misleading in every way possible. This is corporate media literacy 101.
  The burden of proof is on Google here. If they've reduced Gemini 2.5 energy use by 33x, they need to state that clearly. Otherwise we should assume they're fudging the numbers, for example:
  A) they've chosen one particular tiny model for this number
  or
  B) it's a median across all models including the tiny one they use for all search queries
  EDIT: I've read over the report and it's B) as far as I can see
  Without more info, any other reading of this is a failing on the reader's part, or wishful thinking if they want to feel good about their AI usage.
  We should also be ready to change these assumptions if Google or another reputable party does confirm this applies to large models like Gemini 2.5, but should assume the least impressive possible reading until that missing info arrives.
  Even more useful info would be how much electricity Google uses per month, and whether that has gone down or continued to grow in the period following this announcement. Because total energy use across their whole AI product range, including training, is the only number that really matters.
  
  mquander 3 days ago
  
  You should not assume that "they've chosen one particular tiny model", or "it's a median across all models including the tiny one they use for all search queries" because those are totally made up assumptions that have nothing to do with what they say they measured. They measured the Gemini Apps product that completes text prompts. They also provided a chart showing that the thing they are measuring scores comparably to GPT-4o on LM Arena.
  
  penteract 3 days ago
  
  From the report:
  > To calculate the energy consumption for the median Gemini Apps text prompt on a given day, we first determine the average energy/prompt for each model, and then rank these models by their energy/prompt values. We then construct a cumulative distribution of text prompts along this energy-ranked list to identify the model that serves the 50-th percentile prompt.
  They are measuring more than one model. I assume this statement describes how they chose which model to report the LM arena score for, and it's a ridiculous way to do so - the LM arena score calculated this way could change dramatically day-to-day.
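
One way to read the quoted methodology, sketched in Python. The model names, energy values, and prompt counts below are invented for illustration, and `median_prompt_model` is a hypothetical helper, not anything from the report.

```python
def median_prompt_model(models):
    """Pick the model that serves the 50th-percentile prompt.

    models: list of (name, avg_energy_per_prompt, prompt_count) tuples.
    Models are ranked by energy per prompt, then prompts are accumulated
    along that ranking until half of all prompts are covered.
    """
    ranked = sorted(models, key=lambda m: m[1])
    total = sum(count for _, _, count in ranked)
    cumulative = 0
    for name, energy, count in ranked:
        cumulative += count
        if cumulative >= total / 2:
            return name, energy
    raise ValueError("no prompts to rank")

# Two hypothetical days that differ by a 2-point shift in traffic share.
day1 = [("small", 0.1, 49), ("medium", 0.3, 30), ("large", 3.0, 21)]
day2 = [("small", 0.1, 51), ("medium", 0.3, 28), ("large", 3.0, 21)]

print(median_prompt_model(day1))  # ('medium', 0.3)
print(median_prompt_model(day2))  # ('small', 0.1)
```

Under these assumed numbers a small shift in traffic flips which model counts as the "median" one, which is the day-to-day instability described above. Whether real traffic sits that close to a boundary is not something the report lets us check.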
  
  mgraczyk 3 days ago
  
  > total energy use across their whole AI product range, including training, is the only number that really matters.
  What if they are serving more requests?
  
  mgraczyk 3 days ago
  
  They did specifically say in the linked report
  
  esperent 3 days ago
  
  Here's the report. Could you tell me where in it you found a link to 33x reduction (or any large reduction) for any specific non-tiny model? Because all I can find is lots of references to "median Gemini". In fact, I would say they're being extremely careful in this paper not to mention any particular Google models with regards to energy reduction.
  https://services.google.com/fh/files/misc/measuring_the_envi...
  
  mgraczyk 3 days ago
  
  Figure 4
  I think you are assuming we are talking about swapping API usage from one model to another. That is not what happened. A specific product doing a specific thing uses less energy now.
  To clarify: the way models become more efficient is usually by training a new one with a new architecture, quantization, etc.
  This is analogous to making a computer more efficient by putting a new CPU in it. It would be completely normal to say that you made the computer more efficient, even though you've actually swapped out the hardware.
  
  sigilis 3 days ago
  
  Don’t they call all their LLM models Gemini? The paper indicates that they specifically used all the AI models to come up with this figure when they describe the methodology. It looks like they even include classification and search models in this estimate.
  I’m inclined to believe that they are issuing a misleading figure here, myself.
  
  mgraczyk 3 days ago
  
  They reuse the word here for a product, not a model. It's the name of a specific product surface. There is no single model and the models used change over time and for different requests
  
  immibis 3 days ago
  
  So it includes both tiny models and large models?
  
  mgraczyk 3 days ago
  
  I would assume so. One important trend is that models have gotten more intelligent for the same size, so for a given product you can use a smaller model.
  Again this is pretty similar to how CPUs have changed
  
  immibis 2 days ago
  
  So it's not a specific product doing a specific thing, but the average across different things?
  
  simianwords 2 days ago
  
  “Gemini App” would be the specific Gemini App in the App Store. Why would it be anything different?
  
  esperent 3 days ago
  
  > Figure 4: Median Gemini Apps text prompt emissions over time—broken down by Scope 2 MB emissions (top) and Scope 1+3 emissions (bottom). Over 12 months, we see that AI model efficiency efforts have led to a 47x reduction in the Scope 2 MB emissions per prompt, and 36x reduction in the Scope 1+3 emissions per user prompt—equivalent to a 44x reduction in total emissions per prompt.
  Again, it's talking about "median Gemini" while being very careful not to name any specific numbers for any specific models.
  
  logicprog 2 days ago
  
  You're grouping those words wrong. As another commenter pointed out to you, which you ignored, it's median (Gemini Apps) not (median Gemini) Apps. Gemini Apps is a highly specific thing — with a legal definition even iirc — that does not include search, and encompasses a list of models you can actually see and know.
  
  esperent 2 days ago
  
  I didn't ignore it, I actually spent some time researching to find out what Google means by "Gemini Apps" (plural) and whether it includes search AI overview, and I can't get a clear answer anywhere.
  Of course, Gemini App (singular) means the mobile app. But it seems that the term Gemini Apps (plural) is being used by Google to refer to any way in which users can access the Gemini models, and also they do clearly state that a version of Gemini is used to generate the search overviews.
  So it still seems reasonably likely, until they confirm otherwise, that this median includes search overview.
  
  simianwords a day ago
  
  "This section presents the environmental impact metrics for the Gemini Apps AI assistant" is this also not specific enough?
  
  esperent a day ago
  
  No, because unless they state otherwise we should assume that they consider search overview to be an AI assistant (they definitely believe this) and also that it's one of the Gemini Apps.
  Look, there's not enough information to answer this within the paper. I'm not willing to give Google the benefit of the doubt on vague language, and you are. I'm assuming they're a huge, basically evil corporation whose every publication is gone over and reworded by marketing to make them look good, and you're assuming... whatever.
  That's fine by me, we disagree. Let's stop here.
  
  simianwords 2 days ago
  
  What do you think the Gemini app means? It can only mean the consumer facing actually existing Gemini App that exposes 2 models.
  
  esperent a day ago
  
  They refer to Gemini Apps, plural. One of those apps is also called the Gemini App, singular.
  
  mgraczyk 3 days ago
  
  That isn't what that means. Look at the paragraph above that where they explain.
  This is the median model used to serve requests for a specific product surface. It's exactly analogous to upgrading the CPU in a computer over time
  
  tovej 3 days ago
  
  The median does not move if the upper tail shifts, it only moves if the median moves.
  The fact that they do not report the mean is concerning. The mean captures the entire distribution and could actually be used to calculate the expected value of energy used.
  The median only tells you which point separates the upper half from the lower half, if you don't know anything else about the distribution you cannot use it for any kind of analysis.
  
  esperent 3 days ago
  
  I can't copy text from that pdf on my phone, but the paragraph above says exactly what you'd expect: they're using a "median" value from a "typical user" across all Gemini models. While being very careful not to list the specific models which are used to calculate this median, because it almost certainly includes the tiny model used to show AI summaries on google.com, which would massively skew the median value. As someone above said, it's like adding 8 extra meals of a single lettuce leaf and then claiming you reduced the median caloric intake of your meals.
  
  simianwords 2 days ago
  
  This doesn’t check out. It is not reasonable to interpret “Gemini app” as also including a functionality that is embedded in google searches.
  Gemini app is a specific thing: the Gemini App that actually exists.
  How can Gemini App also include their internal augmented functionality on search which itself is not an application?
  
  tupshin 2 days ago
  
  If I, as a regular Google user ask in the search "is this search powered by Gemini?", the AI generated result is in the affirmative.
  "Yes, this search is powered by a customized version of the Gemini model for its generative AI features."
  Based on that, I'm not sure how it is reasonable to claim that Gemini App has a legal term that is exclusive of its use in search.
  Amusingly, it refuses to answer if I ask "is this search powered by Gemini app?"
  
  simianwords 2 days ago
  
  What? The paper clearly says "This section presents the environmental impact metrics for the Gemini Apps AI assistant". You are going through lots of hoops instead of just reading the paper.

kingstnap 3 days ago

If you have a market for it, the hardware industry will aggressively dig in to try to deliver. Maximum performance and maximum efficiency. So I can imagine there is still more to go.

I'm sure the relatively clean directed computational graph + massively parallel + massively hungry workload of AI is a breath of fresh air to the industry.

Hardware gains were for the longest time doing very little for consumers because the bottlenecks were not in the hardware but instead in extremely poorly written software running in very poorly designed layers of abstraction that nothing could be done about.

sbierwagen 3 days ago

The hardware overhang embodied: that early AI will be inefficiently embodied as a blob of differentiable floating point numbers in order to do gradient descent on them, and shortly after be translated into a dramatically simpler and faster form. An AGI that requires a full rack of H100s to run, suddenly appearing on single video game consoles. https://www.lesswrong.com/w/computing-overhang
Fun fact: Deep Blue was a dedicated chess compute cluster that ran on 30 RS/6000 processors and 480 VLSI chips. If the Stockfish chess program existed in 1997 it would have beaten it with a single 486 CPU: https://www.lesswrong.com/posts/75dnjiD8kv2khe9eQ/measuring-...
- parodysbird 2 days ago
  
  AGI is made-up nonsense, so it's better to phrase the hardware overhang (and anything else that is meant to refer to reality) without reference to it.

jillesvangurp 3 days ago

There are two ways to make AI cheaper: make energy cheaper or make AI hardware and algorithms more efficient and use less energy that way. Google is investing in doing both. And that's a good thing.

I actually see growth in energy demand because of AI or other reasons as a positive thing. It's putting pressure on the world to deliver more energy cheaply. And it seems the most popular and straightforward way is through renewables + batteries. The more clean and cheap capacity like that is added, the more marginalized traditional more expensive solutions get.

The framing on this topic can be a bit political. I prefer to look at this through the lens of economics. The simple economic reality is that coal and gas plant construction has been bottlenecked for years on a lot of things, to the point where only very little of it gets planned and realized. And what little comes online has pretty poor economics. The cost and growth curves for renewables+battery paint a pretty optimistic picture here, with traditional generation plateauing for a while (we'll still build more coal/gas plants, not a lot, and they'll be underutilized) and then dropping rapidly in the second half of the century as the cost and availability of alternatives improve and completely steamroll anything that can't keep up. Fossil fuel based generation could be all but gone by the 2060s.

There are lots of issues with regulations, planning, approval, etc for fossil fuel based generation. There are issues with supply chains for things like turbines. Long term access to cooling water (e.g. rivers) is becoming problematic because of climate change. And there are issues with investors voting with their feet and being reluctant to make long term commitments in what could end up being very poor long term investments. A lot of this also impacts nuclear, which while clean remains expensive and hard to deliver. The net result of all this is that investments in new energy capacity are heavily biased towards battery + renewables. It's the only thing that works on short notice. And it's also the cheapest way to add new capacity. Current growth is already 80-90% renewable. It's not even close at this point. We're talking tens/hundreds of GW added annually.

Of course AI is so hungry for energy that there is a temporary increase in usage for coal/gas. That's existing underutilized plants temporarily getting utilized a bit more mainly because they are there and utilizing them a bit more is relatively easy and quick to realize. It's not actually cheaper and future cost reductions will likely come in the form of replacing that capacity with cheaper power generation as soon as that can be delivered.

Jaxan 3 days ago

There is a third way of making AI cheaper: using it less.
We have seen many technologies which have been made so much more efficient (heat pumps, solar panels, etc). Really great achievements. Yet the amount of (fossil) energy we use still grows.
- jillesvangurp 3 days ago
  
  Using less is always an individual choice. But not a realistic one to expect 8 billion+ people to take. That's also why fossil fuel usage is still increasing.
  However, you might be too pessimistic here. Fossil fuel usage is actually widely expected to peak in the next few years and then enter a steady decline.
  Michael Liebreich of Bloomberg NEF did a pretty interesting editorial on this decline a few weeks ago: https://about.bnef.com/insights/clean-energy/liebreich-the-p...
  He uses a simple model with some very basic assumptions (conservative ones) where he shows how short term fossil fuel usage still increases. Mostly this is just market inertia. But then it will start decreasing and then some decades later, it declines all the way to zero with some long tail of hard to shift use cases.
  He uses some very basic assumptions about economic growth continuing at an average of 3%, a base assumption of renewables outgrowing energy demand increases by 3%, etc. You get to a modest fossil fuel decline by 2040, majority renewables powered economy by the 2050s. And virtually no fossil fuel left in the economy by 2065. The years change but the outcome stays the same as long as renewables outgrow demand increase.
  There are lots of buts and ifs here but he's explicitly addressing the kind of pessimism you are voicing here.
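
The shape of that argument can be reproduced with a toy model. To be clear, this is not Liebreich's model: the 20% starting share for renewables, the reading of "outgrowing demand by 3%" as three extra percentage points of annual growth, and the function name `fossil_trajectory` are all assumptions made for illustration.

```python
def fossil_trajectory(years, demand_growth=0.03, renewable_extra=0.03,
                      renewable_share=0.2):
    """Toy model: fossil supply is whatever demand renewables do not cover.

    Demand starts at 1.0 and grows by demand_growth per year. Renewables
    start at renewable_share and grow renewable_extra points faster.
    Returns the fossil amount for year 0 through year `years`.
    """
    demand, renewables = 1.0, renewable_share
    fossil = []
    for _ in range(years + 1):
        fossil.append(max(demand - renewables, 0.0))
        demand *= 1 + demand_growth
        renewables *= 1 + demand_growth + renewable_extra
    return fossil

trajectory = fossil_trajectory(years=80)
peak_year = trajectory.index(max(trajectory))
zero_year = next(year for year, value in enumerate(trajectory) if value == 0.0)
print(f"fossil use peaks in year {peak_year} and reaches zero in year {zero_year}")
```

Fossil use rises at first, peaks, then falls to zero. Changing the starting share or the growth gap moves the dates but not the shape, which matches the "years change but the outcome stays the same" claim. The specific years this toy prints should not be read as a forecast.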
  
  Jaxan 3 days ago
  
  I appreciate your reply, thanks!
  About the “individual choice”: it indeed is, unless tech companies make bad choices. Like GitHub recently showed a button “what are my PRs?” When pressed it asked copilot to give you the list of PRs (incomplete btw). But there already exists a page for that! This is just wasteful and we should blame a company for that.
  
  immibis 2 days ago
  
  Or Google running an AI summary on every single search even though you mostly ignore it. There was no need for Google to do that, and it wasn't my choice.
  
  chermi 2 days ago
  
  I'm not saying it's a good reason, but very clearly there was a reason. I'd expect at least part of it was trying to retain/capture Google search users and advertise that ChatGPT isn't the only game in town. I'd bet without those summaries the lay person would not know Google had their own AI app.
  
  hoyo1s 2 days ago
  
  AI summaries are mostly shit.
  
  keybored a day ago
  
  > Using less is always an individual choice. But not a realistic one to expect 8 billion+ people to take. That's also why fossil fuel usage is still increasing.
  Thanks. This mindset is not always made this explicit.
  That it is an individual-choice is just as true as the claim that it is a choice made by governments, corporations, non-profits, executives, etc. But this atomized fiction is the only one that is given focus. Why?
  You said it yourself: the perspective is not even conducive to making any change! (“not a realistic...”) We can’t expect 8 billion to make atomized decisions for the betterment of the planet.
  But that’s not what people with this mindset want. They want a scapegoat that (conveniently) cannot change. Or they want an excuse to keep doing what they are already doing. Because hey the entities “that are doing it” cannot change in the aggregate.
- hoyo1s 2 days ago
  
  Yes, and I think this is often a problem on the power generation side. The damn cost of energy storage and photovoltaics has been falling rapidly, hydropower and wind power costs are not high at all.
  The key is to take advantage of economies of scale: The cost of renewable energy generation is mainly in the initial investment and equipment.
  As long as you mass-produce enough equipment, the cost of each device will decrease due to economies of scale. However, thermal power generation is different. The cost of thermal power generation is mainly fuel, and the lower limit of the cost is much higher than photovoltaic power generation.
  I don’t understand why so many people are obsessed with using fossil fuels for power generation, as if it is really more efficient... thermal power lost its price advantage a few years ago.
- ACCount37 2 days ago
  
  If your "solution" involves an average person being informed of something and then changing his lifestyle, at a personal loss?
  Then you have no solution at all.
  
  Jaxan 2 days ago
  
  People do things at a personal loss all the time, like giving money to charity or unpaid volunteer work.
  And yes, keeping people informed is difficult but a crucial effort for a working democracy.
- temp123098 3 days ago
  
  The average person doesn't care enough about not using fossil fuels to lower his quality of life. If your plan of action is moralizing at them until they do we might as well nuke ourselves back into the stone age for all the effect it will have.
  The benefit of technical solutions is that you get the desired effect without any real trade-offs. I don't really care if I use a boiler or a heat pump to heat my house, because the end goal is to heat my house. I don't really care if I use an electric car or dead dinosaurs car, I just want to get places.
  Make the efficient, more climate-friendly alternative a better deal and most people will switch. Tell people that they should give up their cars and AC because the planet will be 3C warmer in 100 years and you'll get an eye-roll. If you want the more environmentally-friendly but also more expensive option to win then the only real option is government subsidies, not preaching - enlightened self-interest trumps all.
  
  Jaxan 3 days ago
  
  I do not agree with this perspective. A lot of people care not only about their own quality of life. But also the life of their peers, children and even people they don’t know. Many people make sacrifices to help others and the planet. It’s only a recent (western) idea that we can just sit back and only care about our own quality of life.
  
  temp123098 3 days ago
  
  But do those hypothetical people care enough to make some actual sacrifices for those strangers?
  For most people, replacing your car with an electric one isn't a big deal. Replacing a car with public transportation is either impossible (living in the boonies), incredibly difficult (suburbia) or merely very annoying (city).
  I very much doubt the average person is willing to give up his car for some nebulous greater good of some strangers half a world away, especially when he hears of Jeff Bozos of this world shutting down half of Venice for a wedding so 50 private jets can ferry fellow fat cats to have a good time. But you, Joe Schmo, ought to use paper straws, sit in 30C room in the summer and sit at home instead of traveling for vacations. To save the planet.
  The situation isn't much different in non-Western countries. Over the last few years China did more for electrification from renewable sources than the rest of the world combined, and yet they're also building a lot of coal power plants because that's what they have so that's what they'll use, damn everybody else. India isn't going to willingly stay poor so that ivory tower elites can feel good about themselves. Countries with oil reserves, majorly non-western, certainly aren't going to not extract it for the good of the planet.
  
  immibis 2 days ago
  
  FYI in some cities, replacing a car with public transport is an improvement. Don't have to find parking. There isn't enough parking for everyone in any city that didn't massively overbuild parking. Cars are physically huge.
  Also don't have to be sober to go home from the bar. I'm convinced ubiquitous public transport (especially on Friday and Saturday all night) informed German drinking culture.
  Similarly you can go from point A to B to C to D to A without having to go back to B to get your big metal box and drag it to D. Exploring the city is way easier. If you've never experienced the freedom of walking around a city designed to be walked around... you should, that's a pretty basic life experience and it's weird how the US government has blocked it from you.
  
  myaccountonhn 3 days ago
  
  This argument is so strange because I don't really know anyone who actively cuts down their own emissions and at the same time thinks it's fine that billionaires fly private jets everywhere. They're also the first ones to push for billionaires to be responsible.
  
  Jaxan 3 days ago
  
  I know multiple people who replaced their car with a cargo bicycle (I’m biased, because I live in the Netherlands).

zekrioca 3 days ago

Measurements for water consumption seem cherry-picked and incorrect, making the numbers look better than they actually are. When asked about it, they doubled down and incorrectly claimed that the study they compared against was incorrect. See https://www.linkedin.com/posts/shaolei-ren-68557415_today-go...

textlapse 3 days ago

What’s the cost of training vs inference?

If it’s like Marvel sequels every year then there is a significant added training cost as the expectations get higher and higher to churn out better models every year like clockwork.

oulipo2 2 days ago

Perhaps it's 33x less carbon-intensive, but if we do 200x more queries than a year ago, it's a net loss...
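
A rough sketch of that arithmetic, assuming total emissions are simply query volume times per-query footprint and nothing else changes. The 33x figure is the article's claim; the 200x growth is the hypothetical above, not a measured number, and `net_change` is just an illustrative helper name.

```python
def net_change(efficiency_gain: float, volume_growth: float) -> float:
    """Return new total emissions as a multiple of the old total.

    Assumes total = queries * footprint_per_query and nothing else changes.
    """
    return volume_growth / efficiency_gain


# 33x is the article's claimed per-prompt reduction; 200x is hypothetical.
ratio = net_change(efficiency_gain=33, volume_growth=200)
print(f"total emissions: {ratio:.1f}x the old total")
```

Under those assumptions the total ends up roughly 6x higher, and the break-even point is when volume grows by exactly the efficiency factor.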

energy123 3 days ago

Cost/prompt is a ratio. "Prompt" is not a normalized metric that is stable over time. It can increase (as context lengths increase) or decrease (as google's product suite integrates llms).

lalaithion 3 days ago

They didn’t account for training. From the paper:

> LLM training & data storage: This study specifically considers the inference and serving energy consumption of an AI prompt. We leave the measurement of AI model training to future work.

This is disappointing, and no analysis is complete without attempting to account for training, including training runs that were never deployed. I’m worried these numbers would be significantly worse and that’s why we don’t have them.

sbierwagen 3 days ago

If I download a copy of llama and run a single query, what was the cost of that query?
- progval 3 days ago
  
  No, because you don't incentivize the training of the next version of LLama, and the current version was not trained because you wanted to run that query.
  This is not true of Gemini.

pingou 3 days ago

Aren't most search queries duplicates? Then after a while they don't even need AI for those duplicates, unless they feed it some different context specific to each user.

esseph 3 days ago

They do a lookup every time

drakenot 3 days ago

Is this from quantizing their Gemini model?

There are a lot of anecdotal reports of quality differences following some Gemini 2.5 Pro releases earlier in the year.

ant6n 3 days ago

I for one think that Gemini 2.5 pro has become much more stupid than before. This isn’t for coding, just simple business type support. It keeps forgetting queries, making really obviously bad suggestions, simple mistakes etc etc.
It’s kind of funny, because they keep talking about how close we are to AGI, and in reality they keep making the models dumber (uh, I mean more efficient).

ChrisArchitect 3 days ago

[dupe] https://news.ycombinator.com/item?id=44972808

simianwords 3 days ago

Why can’t google give us average instead of median? This is strange.
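
To illustrate why the choice matters, here is a minimal sketch with made-up numbers: a mix where just over half of the prompts hit a cheap model and the rest hit one that costs 100x more. The energy values and the 60/40 split are assumptions for illustration, not figures from the paper.

```python
from statistics import mean, median

# Hypothetical per-prompt energy in arbitrary units (not from the paper).
SMALL_MODEL = 1.0    # e.g. an auto-generated search summary
LARGE_MODEL = 100.0  # e.g. a prompt sent to a frontier model

# Assumed mix: 60% of prompts go to the small model, 40% to the large one.
prompts = [SMALL_MODEL] * 60 + [LARGE_MODEL] * 40

med = median(prompts)
avg = mean(prompts)
print(f"median = {med}, mean = {avg}")
```

The median reports only the cheap model as long as it handles more than half of the prompts, while the mean still reflects the expensive ones.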

theanonymousone 3 days ago

Finally someone using drop not in a teenage slang sense.

benreesman 3 days ago

I'm on record as pretty stridently anti-AI Hype Bullshit (I was calling Altman a criminal back when that had real-world consequences, check the history).

But this is in the vanishing minority of frontpage AI threads where it's a really interesting conversation about quantifiable things: what quantization, what engagement metrics, what NDCG on downstream IR. People are complaining they gamed the number: that's an improvement! Normally they just lie. This is amenable to analysis and frankly an interesting one.

If it were up to me they'd flat regex ban "llm" and "ai" on HN, that's about the right ROC. But if we're going to have it? I'll take this over "How AI Saved My Vibecode Startup From Vibe Coding".

motorest 3 days ago

> People are complaining they gamed the number: that's an improvement!
Is it, though?
There's a post in this discussion claiming that Google rolled out AI summaries on all of their search queries. This means they greatly increased the number of queries by triggering queries at each Google search. These are unsolicited queries that users do not send by themselves or want.
Then the post claims each of these unsolicited queries is executed using small models that are cheaper to run.
The post asserts these unsolicited queries represent half of the queries.
Google's claims are that now the median cost of their queries is lower. The post asserts around half of Google's AI queries are not requested by users and instead forced upon them with searches.
To me, what this spells is the exact opposite of an improvement. It's waste that is not requested by anyone and adds no value. It's just waste.
Consequently, if Google pulled the plug on these queries then they would reduce their total query count by around 50%. How much energy and carbon emissions would that save? Well, if you pick up that value and flip it over to show how much is being wasted, that's your "improvement".

jbrooks84 2 days ago

I can tell, google AI is absolute shit compared to others. Start consuming more power Google.

philberto 3 days ago

How do you drop something by 33x? That is literally impossible unless they make money by purchasing energy.

mgraczyk 3 days ago

No it isn't
Suppose you were running a computation that requires doing 33,000 multiplies. Later you find a way to do the same computation using only 1,000 multiplies.
That's basically what happened here
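
In numbers, a quick sketch of how the "33x" phrasing maps to a fraction and a percentage, using the multiply counts from the example above (`reduction` is a hypothetical helper, not anything from the paper):

```python
def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (factor, percent_saved) for going from `before` to `after`."""
    factor = before / after                     # "N times smaller"
    percent_saved = (1 - after / before) * 100  # share of the original removed
    return factor, percent_saved


factor, percent_saved = reduction(before=33_000, after=1_000)
print(f"{factor:.0f}x smaller, i.e. {percent_saved:.0f}% less work")
```

So "dropped by 33x" here reads as "now 1/33 of the original", or about a 97% reduction.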
- sameermanek 3 days ago
  
  [flagged]
  
  quantummagic 3 days ago
  
  You've been here 15 years and made hardly any comments. Why do you feel so strongly about this now? Wasn't there a way you could have at least made it more constructive?
- playforclaude 3 days ago
  
  33,000 multiples - (33 * 33,000 multiples) = -1056000 multiples
  
  mgraczyk 3 days ago
  
  Reducing something 33x means to make it 33 times smaller. It's a common way of saying this in English
  
  playforclaude 2 days ago
  
  How do you make something "33 times smaller"? Maybe break it down, starting with making something 1 time smaller, then 2 times smaller, and we can see where it goes.
  
  mgraczyk 2 days ago
  
  2x smaller is 50%, 3x smaller is 33%, etc
  It's an extremely common phrase
  
  playforclaude a day ago
  
  What's 1x smaller?
  
  mgraczyk a day ago
  
  It means the same size, you wouldn't say that
  
  globnomulous 3 days ago
  
  "Reduce by 97%" is simple, clear, and accurate.
  "Reduce 33x" and "make 33x smaller" are ambiguous, unclear, and inaccurate. Is something that's "33x smaller" or "reduced 33x" 1/33 of the original total or is it 1/34? The question can't be answered in the absence of more information.
  These are common expressions, sure. They're also awful, belonging to the same category of error as:
  * The price is expensive
  * It's a good-quality piece
  * All but one of my friends speaks like this.
  * Here's an author whom we know cares about language.
  * As well, this is how some people write.
  That is, they're the errors of a normal native speaker.
  
  mgraczyk 3 days ago
  
  It's 1/33, I don't see how it could be 1/34. Nobody would ever make that mistake and it wouldn't matter, close enough
  
  globnomulous a day ago
  
  I gather you don't particularly care (that's essentially your point), but in case you really do want to know how it could be 1/34, and why some weirdo would insist that it does or can, I wrote up the following. :)
  Is "1x smaller" equivalent to "1x larger?" If it is, then 'it's 1/33' and "2x larger/more" means the same thing as "double the size/amount." But if you have two times more than I have, then you have what I have, plus 2x that amount. So you don't have two times as much as I have. You have three times as much. "2x larger," to my ear, clearly does not mean the same thing as "2x as much." "2x larger" should mean "3x as much." That's why "33x smaller" can be read as "1 part of 34."
  When we're even stricter with sense, the expression "33x smaller" becomes completely incoherent, because 1x should represent the original quantity. A 33x reduction should give us a result of -32x.
  Obviously that's not what the article means. It's what the words mean, though, when you read them literally, rather than reading past their literal meaning to the intentions of the speaker/writer.
  Most people don't care whether someone means one thing or the other, because, as you wrote, it's close enough to give the general idea.
  The problem that fussy people like me and the commenter above me have is that we want people to say what they mean. And I'd wager that most of us fussy people have to do more mental work in order to get to the result that other people reach intuitively. Having to ignore literal sense in order to read someone's intended meaning is harder for us/me than it is for most people. That's our/my problem. As a matter of sociolinguistics and pragmatics, we're wrong, because literal meaning takes a back seat to idiomatic usage. (It probably does even in this comment that I'm writing.)
  That's why I said these are the errors of a normal, native speaker.
  
  mgraczyk a day ago
  
  Sorry but your explanation is not self consistent. It works for "2x more" but not "2x larger". Those are two different words that mean two different things
dragonwriter 2 days ago

> How do you drop something by 33x?
Because people can't handle fractions, they say something was made "33x smaller" instead of "1/33 as large".
It makes no sense by the individual meaning of the words, it is just a common (and for many people annoying) idiom.