| Svelte Hacker News

points by MostlyStable 7 months ago

I still think that the reaction to DeepSeek was dramatically overblown. It was 6-9 months behind the performance of the best frontier models at a dramatically lower cost... but dramatically lower cost at a 6-9 month lag is basically what has been happening for a couple of years.

I'm mostly convinced that the only reason it blew up was that it was the first Chinese model that was even in the same ballpark as the American frontier models. That drove a lot of reporting, which got a lot of normies who hadn't tried any AI model since ChatGPT first blew up to try it, and they were (understandably) blown away by the progress relative to what they remembered from 2 years earlier.

HarHarVeryFunny 7 months ago

The timing seems to indicate that the mainstream press coverage of DeepSeek R1 was driven by an Nvidia short recommendation that had just gone viral, rather than by the disclosed training cost (which related to V3, not R1).

However, there also seems to have been some genuine panic at OpenAI, and maybe elsewhere too, over DeepSeek R1. Not only did DeepSeek come close to matching the performance of o1, but they also described exactly how they did it (apparently very similar to what OpenAI had done, judging from the reaction). That killed any competitive lead that OpenAI, who had been working on it for a long time, may have thought they had.