I'm not a fan of this blog post, as it tries to pass off a method that isn't an accepted or standard time series methodology (graph transformers) as though it were the norm. Transformers perform poorly on time series, and graph deep learning performs poorly on tasks that lack real behavioral/physical edges (physical space, molecules, social graphs, etc.), so it's unclear why combining them would produce anything useful for "business applications" of time series like sales forecasting.
For those interested in transformers with time series, I recommend reading this paper: https://arxiv.org/pdf/2205.13504. There is also plenty of other research showing that transformer-based time series models generally underperform much simpler alternatives like boosted trees.
After looking further, it seems like this startup is both trying to publish academic research promoting these models and to sell them to businesses, which seems like a conflict of interest to me.
Hey, one of the authors here—happy to clarify a few things.
> Transformers perform poorly on time series.
That’s not quite the point of our work. The model isn’t about using Transformers for time series per se. Rather, the focus is on how to enrich forecasting models by combining historical sequence data with external information, which is often naturally structured as a graph. This approach enables the model to flexibly incorporate a wide range of useful signals, such as:
* Weather forecasts for a region
* Sales from similar products or related categories
* Data from nearby locations or stations
* Finer-grained recent interactions/activities
* Price changes and promotional campaigns
* Competitor data (e.g., pricing, availability)
* Aggregated regional or market-level statistics
The architecture is modular: we don't default to a Transformer for the past sequence component (and in fact use a simpler architecture). The Graph Transformer/Graph Neural Network then extends the past sequence component by aggregating from additional sources.
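To make the modular design concrete, here is a minimal numpy sketch of the general idea, not the actual model from the post: each node (say, a store) gets an embedding from a simple linear sequence encoder over its own history, then one mean-aggregation step over a similarity graph mixes in neighbor information before the forecast head. All of the data, dimensions, and weights below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 stores, each with an 8-step sales history (hypothetical data).
n_nodes, window, d = 4, 8, 16
series = rng.normal(size=(n_nodes, window))
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)  # store-similarity graph

# Past-sequence component: a plain linear encoder per node
# (a stand-in for whatever sequence model is used; deliberately simple).
W_seq = rng.normal(scale=0.1, size=(window, d))
h = series @ W_seq                         # (n_nodes, d) node embeddings

# Graph component: mean-aggregate neighbor embeddings, then combine
# with each node's own embedding.
deg = adj.sum(axis=1, keepdims=True)
h_neigh = (adj @ h) / np.maximum(deg, 1)   # mean over neighbors
W_self = rng.normal(scale=0.1, size=(d, d))
W_nei = rng.normal(scale=0.1, size=(d, d))
h2 = np.tanh(h @ W_self + h_neigh @ W_nei)

# Forecast head: one step ahead per node.
w_out = rng.normal(scale=0.1, size=(d,))
forecast = h2 @ w_out                      # (n_nodes,)
```

The point of the sketch is the separation of concerns: the sequence encoder and the graph aggregation are independent components, so either can be swapped out (e.g., for a Graph Transformer layer) without touching the other.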
> It seems like this startup is both trying to publish academic research promoting these models as well as selling it to businesses which seems like a conflict of interest to me.
That’s a bold claim. All of our academic work is conducted in collaboration with university partners, is peer-reviewed, and has been accepted at top-tier conferences. Sharing blog posts that explain the design decisions behind our models isn’t a conflict of interest—it's part of making our internals more transparent.
Lol, a bold claim. It's a rational assumption that any business publishing "academic work" is selling you the upside while omitting or downplaying the downside.
Would you be so kind as to recommend some resources on modern, promising methods for time series forecasting? I'm starting a position doing this work soon and would like to learn more about it if you'd be willing to share
Read up on the M series of competitions and the papers that came out of those exercises. Read Keogh. Also maintain a healthy respect for, and understanding of, the traditional methods rather than getting distracted by whatever happens to be shiny now.
Wow a sane person among all the hype. Great to see you!
Lol. Yeah, the hype train blinds.
Recent work like Informer (AAAI'21) and Autoformer (NeurIPS'21) has shown competitive performance against statistical methods by addressing the quadratic complexity and long-range-dependency issues that plagued earlier transformer architectures on time series tasks.
thoughts on TimesFM?
> After looking further it seems like this startup is both trying to publish academic research promoting these models as well as selling it to businesses, which seems like a conflict of interest to me.
is this a general rule of thumb that one should not use the same organization to publish research and pursue commercialization generally?
Not really. There is no rule against it. You can have a team that researches, publishes, patents, and shares the patents with commercial scalers. It’s easier with ML than with manufacturing.
https://dontfuckwithscroll.com/
Forwarded :)
“Here, sign this.”
I can't stand websites that override scrolling
Most of my time interacting with this site was spent in developer tools, trying to figure out where the scrolling behavior was coming from. (Couldn't figure it out.) I can't understand why people are still doing this in 2025.
Enter this in the console (this blocks the page's wheel listener, assuming it's attached at the document/window level in the bubbling phase):
document.body.onwheel = (e) => e.stopPropagation();
Most likely the developer is using a Windows computer.
wow, I didn't realize that until I saw this comment; now I can't unrealize it and I'm angry
I came here to say this. Don't mess with my scrollbar. Ever.
Prophet is great and we use it for multiple models in production at work. Our industry has tons of weird holidays and seasonality and prophet handles that extremely well.
We also used it at my previous job. Yes, it does handle that well, but it was also simply not as accurate as we would have liked (often over-adjusting for seasonality), even with tuning. Prophet was probably the right initial choice, though, given how easy it is to set up and get decent results.
This is sales research, and after "CAGR in a GSheet" FB Prophet is what's going to be most recognizable to the widest base of customers.
FWIW, it seems like the real value-add is this relational DB model: https://kumo.ai/research/relational-deep-learning-rdl/ The time-series stuff is just them elaborating the basic model structure a little more to account for time dependence.
For such a strong and personal statement, I have to ask why.
If you arrived into, say, London and googled "Best fish and chips" would you believe that the top result gives you the meal that you're after?
…yes? Feels like there’s some bit of tribal knowledge required to understand your point, but fewer people know it than you think.
As a Londoner, I want to urge you to rethink your position.
I would believe those are some of the better options and definitely a useful benchmark.
1. How do you go about finding the "absolute" best when you go to a city?
2. What does this have to do with the GP's question?
Why not? It’s definitely a useful benchmark
Why? That is what everybody uses. What do you use?
L1-regularized autoregressive features, holiday dummies, and Fourier terms (if combined appropriately) yield lower test errors, train faster, and are easier to cross-validate than Prophet.
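A minimal sketch of that recipe in plain numpy: a hand-rolled coordinate-descent Lasso fit on lag, Fourier, and holiday-dummy features. The weekly-seasonal toy series and the penalty value are made up, and in practice you'd use a library implementation (e.g., scikit-learn's `Lasso`) rather than this loop.

```python
import numpy as np

def fourier_terms(t, period, K):
    # K sine/cosine pairs capturing seasonality of the given period.
    cols = []
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

def lasso_cd(X, y, lam, n_iter=300):
    # Plain coordinate-descent Lasso for (1/2)||y - Xw||^2 + lam*||w||_1.
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution removed.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # Soft-thresholding update.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return w

# Toy series: weekly seasonality plus a spike on a recurring "holiday".
t = np.arange(300)
holiday = (t % 30 == 0).astype(float)
y = 2 * np.sin(2 * np.pi * t / 7) + 3 * holiday

# Design matrix: lag-1 feature, two Fourier pairs, holiday dummy.
X = np.column_stack([y[:-1], fourier_terms(t[1:], 7, 2), holiday[1:]])
target = y[1:]

w = lasso_cd(X, target, lam=0.01)
resid = target - X @ w
```

The appeal over Prophet is that this is an ordinary linear model: every feature is inspectable, the penalty is a single tunable knob, and standard cross-validation applies directly.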
Sounds like prophet with extra steps
With which library though? Is it fast enough for production?
> If this really worked, you'd be making billions on the stock market
That's kind of a weird thing to say given that the market cap for quantitative finance is well over a billion dollars, and this product clearly seems to be targeting that sector (plus others) as a B2B service provider. Do you think that all those quantitative trading firms are using something other than time-series analytics?
Also, setting aside the issue of whether time-series forecasting is valuable for stock-market trading, it seems like the value add of this product isn't necessarily the improved accuracy of the forecasts, but rather the streamlined ETL -> Feature Engineering -> Model Design process. For most firms (either in quantitative finance or elsewhere) that's the work of a small dedicated team of highly-trained specialists. This seems like it has the potential to greatly reduce the labor requirements for such an organization without a concomitant loss of product quality.
I can’t read the original other than what was quoted, but time series forecasting will never make you millions in stocks (or millions more than not using it), because it uses only past information to predict the series, and much of stock pricing is already set by an efficient market in which those firms have beaten you to it.
I don't think the people who wrote the paper are the same people who built the website.