Can you share more? I'm with OpenRouter and we would love to address this! We don't see this in our own testing, I don't believe -- but will share this feedback and dig in.
Just try. In a case last week it was ~3x and I tried multiple providers: deepseek, gmicloud/fp8, novita/fp8, and another one I can't remember. It was a large job where at least the first two-thirds of every prompt was exactly the same (literally a static string).
Then I read somewhere (I think X) that OpenRouter adds stuff and breaks caching (telemetry? headers? can't remember). So I stopped the job, switched to the actual DeepSeek provider, and voilà, 3x more tokens cached per request (on average).
Here is some data from my experience using DeepSeek V4 Flash both directly and via OpenRouter.
Directly: 135M input tokens - $0.57 (134M cached)
Via OpenRouter: 6M tokens - $0.81 (caching stats and input/output split not reported)
Caching is a huge win when using DeepSeek directly.
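To put those two figures on the same scale, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the function name `cost_per_million` is just an illustrative helper. Note the comparison is approximate: the direct figure counts input tokens only, while the OpenRouter figure has no input/output split, so output tokens may account for part of its cost.

```python
def cost_per_million(total_usd: float, tokens_millions: float) -> float:
    """Effective cost in USD per one million tokens."""
    return total_usd / tokens_millions


# Figures quoted in the comment above.
direct = cost_per_million(0.57, 135)  # DeepSeek API: 135M input tokens for $0.57
routed = cost_per_million(0.81, 6)    # OpenRouter: 6M tokens for $0.81

# Share of the direct input tokens that were served from cache.
cached_share = 134 / 135

# How many times more expensive the routed tokens were, per token.
ratio = routed / direct

print(f"direct:       ${direct:.4f} per 1M tokens")
print(f"via router:   ${routed:.4f} per 1M tokens")
print(f"cached share: {cached_share:.1%}")
print(f"ratio:        {ratio:.1f}x")
```

On these numbers the direct run was about 99% cached and the routed run cost roughly 32x more per token, though the missing input/output split means that ratio should be read as an upper-bound impression rather than a like-for-like measurement.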
> switched to actual DeepSeek provider
I meant actual DeepSeek API.
I am experiencing this using Opencode. Caching works fine via the DeepSeek API but not as well via OpenRouter.
Yes, I definitely noticed a problem with OpenRouter and DeepSeek V4 Pro. It's much more expensive.
When you say DeepSeek API, do you mean servers in China? Or is it a copy of the model operated and run by OpenRouter?