smartmic 19 days ago

I wonder if GenCast's 15-day forecast is really the right benchmark for forecasting. I could imagine that such long-term ML forecasts tend to drift toward yearly averages, which are kind of "washed out" but of course look good for benchmarking and marketing purposes. They are not so practical for the majority of weather forecast users, though. In short, it still smells a bit like AI snake oil to me. [1]

[1] more about this: https://press.princeton.edu/books/hardcover/9780691249131/ai...

  • aardvarkr 18 days ago

    If that were the only thing it was trained on, that could be the case, but they're predicting everything and being graded against more than just their success 15 days out. I think that's just one of the flashier value-adds they can offer, versus traditional models, to weather-dependent businesses like wind production.

Reubend 19 days ago

Although it's great to see these advancements, I'd like to see it integrated with the Google Weather results that show up in Search and on Android devices before I get excited. Spinning up the model on my own hardware and feeding it data manually is a decent amount of work, and I'm too lazy to do that.

  • scellus 19 days ago

    It's a medium-range global model, while Google Weather (which I don't have) is mostly about local short-range weather? But Google Weather is already based on an AI prediction in most cities: https://research.google/blog/metnet-3-a-state-of-the-art-neu...

    Google says GenCast forecasts will later be available from them too.

    Also, ECMWF runs a very similar diffusion model. It's not operational, but it is run a couple of times a day, with results available on their charts site (and as data files too, I guess): https://charts.ecmwf.int/

dehrmann 19 days ago

I'm not surprised. This is the sort of problem machine learning is really good at solving. There's a lot of quality training data, and the results are governed by physics.

  • singhrac 19 days ago

    That's not really 100% true. A lot of the data this is trained on is ERA5, which appears to be highly dense in both time and space, but is assimilation data inferred from much sparser observations. I wouldn't say it's inaccurate, but I see pretty large deviations between assimilation datasets and private weather observations (I work on this problem).

    The results are governed by physics up to some level, but we can't simulate at that fine a level, so there's some inherent aleatoric uncertainty (i.e. noise). And I would generally say that physics-simulation-ML is not moving as fast as say, inference on images or text. For example, if you see a picture of a car, there's very little inherent uncertainty on what the answer is. If you see the world simulator state there's a lot of uncertainty on what happens next.

    That all being said, I think this is basically the best model out there, and almost certainly the best open model. This is really the culmination of many years of effort getting data and software in place to run such a large-scale training job. Very impressive!

    • wenc 19 days ago

      > For example, if you see a picture of a car, there's very little inherent uncertainty on what the answer is. If you see the world simulator state there's a lot of uncertainty on what happens next.

      I've been thinking about this a lot. Many ML people work with what is "closed-domain" data -- the data is essentially complete (images, sound, words, or any kind of embeddings) with no unmeasured variables, so the ML algorithm is essentially trying to learn a function that can predict it.

      Unfortunately, a lot of "open-domain" data has tons of unmeasured variables that are contextual. Suppose I were to try to predict how full a parking lot would be over the course of a week. You can gather lots and lots of data but still never reach a near-perfect level of accuracy, because the covariates that drive how full a parking lot is (unexpected influencer effects on demand, competitive forces that happen to shift one day, power outages in another part of town, other irreducible randomness = "aleatoric uncertainty" in technical parlance) aren't in the data (or at least not completely).

      Fortunately this isn't a problem in real life because many effects cancel each other out, so we are able to arrive at a good-enough aggregate prediction. But "open-domain" ML problems will never achieve the kind of accuracy that "closed-domain" ML can achieve, even with tons of data. Closed domain ML can assume a degree of regularity that open domain ML can never assume.
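
      A toy sketch of that error floor (all numbers invented, Python stdlib only): even an "oracle" model that knows the true weekly pattern exactly cannot beat the noise contributed by unmeasured covariates.

```python
import random
import statistics

random.seed(0)

# Hypothetical true weekly occupancy pattern (fraction full), Mon..Sun.
# The "oracle" model is allowed to know this exactly.
weekly_pattern = [0.6, 0.7, 0.7, 0.8, 0.9, 0.5, 0.3]

# Unmeasured covariates (outages, events, shifted demand...) enter as noise.
ALEATORIC_STD = 0.1

def observe(day: int) -> float:
    """One observed day: the known pattern plus irreducible, unmeasured shocks."""
    return weekly_pattern[day % 7] + random.gauss(0.0, ALEATORIC_STD)

# Even the oracle's residual error never drops below the noise floor.
errors = [observe(d) - weekly_pattern[d % 7] for d in range(100_000)]
rmse = statistics.fmean(e * e for e in errors) ** 0.5
print(f"oracle RMSE: {rmse:.3f} (noise floor: {ALEATORIC_STD})")
```

      With enough data the pattern term is learnable, but the ~0.1 RMSE floor from the unobserved shocks is not reducible.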

    • qeternity 19 days ago

      > And I would generally say that physics-simulation-ML is not moving as fast as say, inference on images or text.

      I'm not sure a blanket statement like this holds up when the article in question perhaps suggests the opposite.

    • dehrmann 19 days ago

      > if you see a picture of a car, there's very little inherent uncertainty on what the answer is

      Unless it's a captcha.

amelius 19 days ago

Too bad climate change is happening now, so the model will have to extrapolate.

  • akira2501 19 days ago

    Forecast models predict 7 to 14 days ahead. They have never been able to predict further than that, and they never will. We don't even need them to.

    • aithrowawaycomm 19 days ago

      The point is that the underlying physics of classical models should still hold true as the climate changes - and where they don't hold, we will have causal insight as to why. But this isn't necessarily true for an ML model: as climate changes and develops new patterns, there is no reason to think ML will adapt to it (and a ton of reasons to guess it won't).

    • achrono 19 days ago

      > We don't even need them to.

      Why not? What's so special about 7-14 days? I can see plenty of reasons one might want to predict the weather accurately even a year out. Just one example: will the weather support an outdoor bbq event this late in fall next year?

      • akira2501 19 days ago

        > will the weather support an outdoor bbq event this late in fall next year?

        Sure, but there could be a disease outbreak, or a pork recall that prevents it anyway. The weather, as a factor, is generally insignificant with respect to loss-of-life events. To the extent it is significant, we can see those events far enough in advance, at 7 to 14 days, to compensate correctly.

        A better example would be "can we launch a spacecraft from this launch site next year?" Even then, we never pick a launch day; we establish a launch window, because we already know the weather changes fast enough that it's unlikely to remain identical for several days in a row. So conditions on a single day are effectively meaningless.

        The Space Shuttle program got even better at this by feeding wind data from high-atmosphere probes back into the launch vehicle software, so it could plan its maneuvers around the wind and reduce vehicle stresses to within tolerable parameters. They went from a 20% launch probability to an 80% launch probability with this system.

        I mean... enjoy your BBQ either way, just bring some pop-up tents.

    • otterley 18 days ago

      Farmers definitely want long range weather predictions so they can better plan what crops to plant and avoid crop failures. Crop failures due to weather have caused devastating losses.

    • unsupp0rted 19 days ago

      Never will? I wouldn't be surprised if predicting climate + weather 12 months out is a simpler problem than most medical problems at which AI is currently being pointed.

      • JumpCrisscross 19 days ago

        > wouldn't be surprised if predicting climate + weather 12 months out is a simpler problem than most medical problems at which AI is currently being pointed

        Simple systems can be famously unpredictable [1]. Our bodies manage entropy; that should make them complex but predictable. The weather, on the other hand, has no governors or raison d'être.

        [1] https://en.wikipedia.org/wiki/Three-body_problem

        • tomjakubowski 19 days ago

          The three-body problem lacks a closed-form solution. How does that make it unpredictable, though? I thought numerical methods could be used to make n-body predictions to arbitrary precision. Are these simulations less accurate than I'm thinking? How do engineers and scientists working on space probes plan their trajectories and such?

          • JumpCrisscross 19 days ago

            > numerical methods can be used to make n-body predictions to arbitrary precision

            Arbitrary precision, not arbitrary length. Even "from [a] mathematical viewpoint, given an exact initial condition, we can gain mathematically reliable trajectories of chaotic dynamic systems" to only a "finite...interval" [1]. (This is due to "numerical noises, i.e. truncation and round-off error, where truncation error is determined by numerical algorithms and round-off error is due to the limited precision of numerical data, respectively.")

            For a physical system like the weather, uncertainty "mainly comes from limited precision of measurement," though there is also the "inherently uncertain/random property of nature, caused by such as thermal fluctuation, wave-particle duality of de Broglie’s wave, and so on."

            [1] https://www.sciencedirect.com/science/article/abs/pii/S10075...

            • echoangle 19 days ago

              A finite interval doesn't mean it can't be arbitrarily long. I'm not saying it can be in this example, but your counterpoint doesn't follow from the quote.

              For example, I can calculate the Fibonacci sequence to an arbitrary length but not infinite.
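
              Concretely, in Python (whose integers are arbitrary-precision, so any finite n works):

```python
def fib(n: int) -> int:
    """n-th Fibonacci number; Python ints never overflow,
    so the only cost of a larger n is time and memory."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(10))                 # 55
print(len(str(fib(10_000))))   # 2090 digits
```

              The computation is bounded only by time and memory, not by a hard cutoff -- which is exactly the arbitrary-vs-infinite distinction.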

              • JumpCrisscross 19 days ago

                > Finite interval doesn’t mean it can’t be arbitrary

                Skim the paper. Numerical noise means you cannot calculate the 3-body problem to an arbitrary length. There is a finite, mathematical limit even with perfect knowledge of initial conditions.

                • echoangle 19 days ago

                  Isn’t the paper about the uncertainties that inherently exist with physical systems?

                  There isn't any claim that mathematically exact starting values can't be propagated with arbitrary precision to arbitrary length, and I would claim that this is possible (though not practical, of course, since compute is limited).

                  But there's no hard limit of precision and length beyond which a simulation can't be made if the starting conditions are exact. The point of the paper is that starting conditions are never exact, which limits how far you can propagate.

                  • JumpCrisscross 19 days ago

                    > Isn’t the paper about the uncertainties that inherently exist with physical systems?

                    It talks about that. Which is relevant when we're talking about the weather. But it opens by discussing the hard mathematical limits to numerical methods.

                    > there’s no hard limit of precision and length where a simulation can’t be made if the starting conditions are exact

                    Wrong.

                    Read. The. Paper. Numerical methods for chaotic systems are inherently, mathematically uncertain.

                    Beyond a certain number of steps, adding precision doesn't yield a more precise answer; it just produces a different one. At a certain point, the differences between the answers you get with more precision cover the entire solution space.
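
                    A toy illustration with the logistic map (a standard stand-in for chaotic dynamics; the starting value and step counts here are arbitrary): running the same iteration at two different working precisions gives answers that eventually disagree entirely.

```python
from decimal import Decimal, getcontext

def logistic_orbit(x0: str, steps: int, digits: int) -> Decimal:
    """Iterate the chaotic logistic map x -> 4x(1-x), rounding every
    intermediate result to `digits` significant digits."""
    getcontext().prec = digits
    x = Decimal(x0)
    for _ in range(steps):
        x = 4 * x * (1 - x)
    return x

# Identical exact starting value, different round-off: the answers disagree.
lo = logistic_orbit("0.2", 100, 16)
hi = logistic_orbit("0.2", 100, 32)
print(lo)
print(hi)
```

                    Each step roughly doubles the round-off error, so 16 working digits are used up after about 50 steps; adding digits buys more steps, not convergence.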

          • strls 19 days ago

            You can solve to arbitrary precision but you can't measure and specify initial conditions to arbitrary precision, making the solution wrong outside of a small time interval.
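
            A tiny chaotic system makes this concrete (the logistic map, with a 1e-12 perturbation standing in for measurement error):

```python
def logistic(x: float) -> float:
    """One step of the chaotic logistic map at r = 4."""
    return 4.0 * x * (1.0 - x)

a, b = 0.2, 0.2 + 1e-12   # two "measurements" of the same initial state
max_gap = 0.0
for _ in range(60):
    a, b = logistic(a), logistic(b)
    max_gap = max(max_gap, abs(a - b))
print(f"largest gap over 60 steps: {max_gap:.3f}")
```

            The gap roughly doubles each step, so about 40 iterations turn a 1e-12 measurement error into an order-1 disagreement.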

            • JumpCrisscross 19 days ago

              > can solve to arbitrary precision but you can't measure and specify initial conditions to arbitrary precision

              Even with perfect knowledge of initial conditions, numerical noise limits the forecast interval.

      • wongarsu 19 days ago

        Predicting that next December will be cold? Sure. Predicting how cold, or how much rain or snow there will be next December, would be difficult, but you could get in the right ballpark. Predicting in which week it will snow a year from now? Not a chance.

      • akira2501 19 days ago

        In order to predict the climate you need to predict volcanic eruptions. Good luck.

    • zsims 19 days ago

      We do; I want to know if it's going to rain on my birthday.

rvnx 19 days ago

GenCast was trained on data from 1959-2023, so it's no surprise it can "predict" back to 2019.

It's like the super trading algorithms that achieve perfect scores during backtests.

The question is: how does it perform on unknown events?

  • 1727706962 19 days ago

    Looks like it was a forward prediction.

    From the linked article:

    > GenCast is a machine learning weather prediction model trained on weather data from 1979 to 2018

    and a google blog https://deepmind.google/discover/blog/gencast-predicts-weath...

    > To rigorously evaluate GenCast's performance, we trained it on historical weather data up to 2018, and tested it on data from 2019

    • pfisherman 19 days ago

      This is still retrospective data. The machine learning graveyard is filled with models that worked well on retrospective data but did not hold up in a live inference setting. Just ask Zillow. The real test is whether they can predict the weather 14 days out in 2025.

      I am guessing they did not want to set up the data pipeline to run inference in a live setting. But that is what I would need to see to be a true believer.

      Still a cool result and article though.

      • scellus 19 days ago

        ECMWF runs many such models at their site, with a run two or four times per day, and they have verification statistics too; there's no need to doubt the accuracy.

        The Google model is probably the best so far, but ECMWF's own diffusion model was already on par with ENS, and many point-forecast models (graph transformers, not diffusion) outperform state-of-the-art physical models.

        What is missing is initialization directly from observations. All the best-performing models initialize from ERA5 or another reconstruction.

  • akira2501 19 days ago

    > One caveat is that GenCast tested itself against an older version of ENS, which now operates at a higher resolution. The peer-reviewed research compares GenCast predictions to ENS forecasts for 2019, seeing how close each model got to real-world conditions that year.

    And GenCast was tested against an older model, which performs worse.

    > The ENS system has improved significantly since 2019, according to ECMWF machine learning coordinator Matt Chantry. That makes it difficult to say how well GenCast might perform against ENS today.

    And the testing makes it "difficult to say." The obvious conclusion is "run a new set of tests," but they'd rather pay off The Verge to publish half-truths instead.

    • scellus 19 days ago

      But ECMWF itself runs a diffusion model that is practically on par with ENS in accuracy. They also seem to collaborate closely.