I'm not a fan of this blog post, as it tries to pass off a method that isn't an accepted or standard time series methodology (graph transformers) as though it were the norm. Transformers perform poorly on time series, and graph deep learning performs poorly on tasks that lack real behavioral/physical edges (physical space, molecules, social graphs, etc.), so it's unclear why combining them would produce anything useful for "business applications" of time series like sales forecasting.
For those interested in transformers with time series, I recommend reading this paper: https://arxiv.org/pdf/2205.13504. There is also plenty of other research showing that transformer-based time series models generally underperform much simpler alternatives like boosted trees.
After looking further, it seems like this startup is both trying to publish academic research promoting these models and to sell them to businesses, which seems like a conflict of interest to me.
Hey, one of the authors here—happy to clarify a few things.
> Transformers perform poorly on time series.
That’s not quite the point of our work. The model isn’t about using Transformers for time series per se. Rather, the focus is on how to enrich forecasting models by combining historical sequence data with external information, which is often naturally structured as a graph. This approach enables the model to flexibly incorporate a wide range of useful signals, such as:
* Weather forecasts for a region
* Sales from similar products or related categories
* Data from nearby locations or stations
* Finer-grained recent interactions/activities
* Price changes and promotional campaigns
* Competitor data (e.g., pricing, availability)
* Aggregated regional or market-level statistics
The architecture is modular: we don't default to a Transformer for the past sequence component (and in fact use a simpler architecture). The Graph Transformer/Graph Neural Network then extends the past sequence component by aggregating from additional sources.
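To make the modular design concrete, here is a minimal numpy sketch of the general idea, not the actual model from the post: each node (say, a store) gets an embedding from a simple linear sequence encoder over its own history, then one mean-aggregation step over a similarity graph mixes in neighbor information before the forecast head. All of the data, dimensions, and weights below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 stores, each with an 8-step sales history (hypothetical data).
n_nodes, window, d = 4, 8, 16
series = rng.normal(size=(n_nodes, window))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)  # store-similarity graph

# Past-sequence component: a plain linear encoder per node
# (a stand-in for whatever sequence model is used; deliberately simple).
W_seq = rng.normal(scale=0.1, size=(window, d))
h = series @ W_seq                         # (n_nodes, d) node embeddings

# Graph component: mean-aggregate neighbor embeddings, then combine
# with each node's own embedding.
deg = adj.sum(axis=1, keepdims=True)
h_neigh = (adj @ h) / np.maximum(deg, 1)   # mean over neighbors
W_self = rng.normal(scale=0.1, size=(d, d))
W_nei = rng.normal(scale=0.1, size=(d, d))
h2 = np.tanh(h @ W_self + h_neigh @ W_nei)

# Forecast head: one step ahead per node.
w_out = rng.normal(scale=0.1, size=(d,))
forecast = h2 @ w_out                      # (n_nodes,)
```

The point of the sketch is the separation of concerns: the sequence encoder and the graph aggregation are independent components, so either can be swapped out (e.g., for a Graph Transformer layer) without touching the other.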
> It seems like this startup is both trying to publish academic research promoting these models as well as selling it to businesses which seems like a conflict of interest to me.
That’s a bold claim. All of our academic work is conducted in collaboration with university partners, is peer-reviewed, and has been accepted at top-tier conferences. Sharing blog posts that explain the design decisions behind our models isn’t a conflict of interest—it's part of making our internals more transparent.
Lol, a bold claim. It's a rational assumption that any business publishing "academic work" is selling you the upside while omitting or downplaying the downside.
Would you be so kind as to recommend some resources on modern, promising methods for time series forecasting? I'm starting a position doing this work soon and would like to learn more about it if you'd be willing to share
Read up on the M series of competitions and the papers that came out of those exercises. Read Keogh. Also maintain a healthy respect for, and understanding of, the traditional methods rather than getting distracted by whatever happens to be shiny now.
Wow a sane person among all the hype. Great to see you!
Lol. Yeah, the hype train blinds.
Recent work like Informer (AAAI'21) and Autoformer (NeurIPS'21) has shown competitive performance against statistical methods by addressing the quadratic complexity and long-range-dependency issues that plagued earlier transformer architectures on time series tasks.
thoughts on TimesFM?
> After looking further it seems like this startup is both trying to publish academic research promoting these models as well as selling it to businesses, which seems like a conflict of interest to me.
is this a general rule of thumb that one should not use the same organization to publish research and pursue commercialization generally?
Not really. There is no rule against it. You can have a team that researches, publishes, patents, and shares the patents with commercial scalers. It’s easier with ML than with manufacturing.
https://dontfuckwithscroll.com/
Forwarded :)
“Here, sign this.”
I can't stand websites that override scrolling
Most of my time interacting with this site was spent in developer tools, trying to figure out where the scrolling behavior was coming from. (Couldn't figure it out.) I can't understand why people are still doing this in 2025.
Enter this in the console (this blocks the page's wheel listener, assuming it's attached at the document/window level in the bubbling phase):
document.body.onwheel = (e) => e.stopPropagation();
Most likely the developer is using a Windows computer.
wow, I didn't realize that until I saw this comment; now I can't unrealize it and I'm angry
I came here to say this. Don't mess with my scrollbar. Ever.
Prophet is great and we use it for multiple models in production at work. Our industry has tons of weird holidays and seasonality and prophet handles that extremely well.
We also used it at my previous job. Yes, it does handle that well, but it was also simply not as accurate as we would have liked (often over-adjusting for seasonality), even with tuning. Prophet was probably the right initial choice, though, given how easy it is to set up and get decent results.
This is sales research, and after "CAGR in a GSheet" FB Prophet is what's going to be most recognizable to the widest base of customers.
FWIW, it seems like the real value-add is this relational DB model: https://kumo.ai/research/relational-deep-learning-rdl/ The time-series stuff is just them elaborating the basic model structure a little more to account for time dependence.
For such a strong and personal statement, I have to ask why.
If you arrived into, say, London and googled "Best fish and chips" would you believe that the top result gives you the meal that you're after?
…yes? Feels like there’s some bit of tribal knowledge required to understand your point, but fewer people know it than you think.
As a Londoner, I want to urge you to rethink your position.
I would believe those are some of the better options and definitely a useful benchmark.
1. How do you go about finding the "absolute" best when you go to a city?
2. What does this have to do with the GP's question?
Why not? It’s definitely a useful benchmark
Why? That is what everybody uses. What do you use?
L1-regularized autoregressive features, holiday dummies, and Fourier terms (if combined appropriately) yield lower test errors, train faster, and are easier to cross-validate than Prophet.
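A minimal sketch of that recipe in plain numpy: a hand-rolled coordinate-descent Lasso fit on lag, Fourier, and holiday-dummy features. The weekly-seasonal toy series and the penalty value are made up, and in practice you'd use a library implementation (e.g., scikit-learn's `Lasso`) rather than this loop.

```python
import numpy as np

def fourier_terms(t, period, K):
    # K sine/cosine pairs capturing seasonality of the given period.
    cols = []
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

def lasso_cd(X, y, lam, n_iter=300):
    # Plain coordinate-descent Lasso for (1/2)||y - Xw||^2 + lam*||w||_1.
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution removed.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # Soft-thresholding update.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return w

# Toy series: weekly seasonality plus a spike on a recurring "holiday".
t = np.arange(300)
holiday = (t % 30 == 0).astype(float)
y = 2 * np.sin(2 * np.pi * t / 7) + 3 * holiday

# Design matrix: lag-1 feature, two Fourier pairs, holiday dummy.
X = np.column_stack([y[:-1], fourier_terms(t[1:], 7, 2), holiday[1:]])
target = y[1:]

w = lasso_cd(X, target, lam=0.01)
resid = target - X @ w
```

The appeal over Prophet is that this is an ordinary linear model: every feature is inspectable, the penalty is a single tunable knob, and standard cross-validation applies directly.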
Sounds like prophet with extra steps
With which library though? Is it fast enough for production?
> If this really worked, you'd be making billions on the stock market
That's kind of a weird thing to say given that the market cap for quantitative finance is well over a billion dollars, and this product clearly seems to be targeting that sector (plus others) as a B2B service provider. Do you think that all those quantitative trading firms are using something other than time-series analytics?
Also, setting aside the issue of whether time-series forecasting is valuable for stock-market trading, it seems like the value add of this product isn't necessarily the improved accuracy of the forecasts, but rather the streamlined ETL -> Feature Engineering -> Model Design process. For most firms (either in quantitative finance or elsewhere) that's the work of a small dedicated team of highly-trained specialists. This seems like it has the potential to greatly reduce the labor requirements for such an organization without a concomitant loss of product quality.
I can’t read the original other than what was quoted, but time series forecasting will never make you millions in stocks (or millions more than not using it), because it uses only past information to predict the series, and much of stock pricing is already set by an efficient market in which those firms have beaten you to it.
I don't think the people who wrote the paper are the same people who built the website.