So important across the board.
Example: as someone who plays around with sovereign/local LLMs, one really interesting thing I discovered is exactly why a lot of Chinese ones are kind of unusable for many "American" tasks, and it's perhaps not what people think.
Have one take a crack at a recommendation letter and the grammar etc. is impeccable, but the language is just WAY TOO OVER-THE-TOP GLOWING; if you thought you were annoyed by how fawning ChatGPT can be, try DeepSeek!
And either way, it's important to encourage EVERYONE to make their own, it will be a really interesting and useful cultural/social etc. window.
Sovereign weights models are a good thing, for a variety of reasons, not least just encapsulating human diversity around the globe.
I chatted with the desktop chat version for a while today; it claims its knowledge cutoff is June ‘25. It refused to say what size model I was chatting with. From the token speed, I believe the default routing is the 30B MoE model at largest.
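The token-speed guess above can be sanity-checked with a standard back-of-envelope estimate (all numbers below are hypothetical, not measurements of this service): decoding is typically memory-bandwidth-bound, so per-sequence tokens/sec is roughly bandwidth divided by the bytes of active weights streamed per token.

```python
def rough_decode_tps(active_params_b: float, bytes_per_param: float, mem_bw_gbps: float) -> float:
    """Crude decode-speed ceiling: each generated token streams every
    *active* parameter through memory once, so
    tokens/sec ~= bandwidth / bytes-of-active-weights."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return (mem_bw_gbps * 1e9) / bytes_per_token

# Hypothetical: a 30B MoE with ~3B active params at 4-bit (0.5 bytes/param)
# served from ~900 GB/s memory has a ceiling of ~600 tok/s per sequence;
# a 30B *dense* model at the same precision would cap out 10x lower.
print(round(rough_decode_tps(3.0, 0.5, 900)))
```

Observed speed well below the dense ceiling but near the MoE one is the kind of evidence that points at an MoE, which is presumably the reasoning here.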
That model is not currently good. Or, another way to say it: it's competitive with the state of the art of 2 years ago. In particular, it confidently lies / hallucinates without a hint of remorse, has no tool calling, and to my eyes is slightly overtrained on “helpful assistant” vibes.
Looking at its stats vis-a-vis OpenAI's gpt-oss-120b, I am cautiously hopeful that it has NOT been finetuned on OpenAI/Anthropic output - it’s worse than gpt-oss-120b at some things in the benchmarks - and I think this is a REALLY GOOD sign that we might have a genuinely novel model being built. The tone is slightly different as well.
Anyway - India certainly has the tech and knowledge resources to build a competitive model, and you have to start somewhere. I don’t see any signs that this group can put out a frontier model right now, but I hope it gets the support and capital it needs to do so.
> India certainly has the tech and knowledge resources to build a competitive model
In what universe? India has almost none of the expensive infrastructure and chip stockpile its American and Chinese counterparts use to build frontier models, even if it did have the necessary expertise (which I also doubt).
I'd guess making this a national-pride thing will just make it less diverse. The answer would be training models on broader sources, not more nationalistic models.
No, that will decrease diversity across the model spectrum taken as an entire population.
You have no idea what you are talking about if you are asking the model what size it is or claiming that a model lies.
Please enlighten me.
Language models entirely lack introspective capacity. Expecting a language model to know what size it is is a category error: you might as well expect an image classifier to know the uptime of the machine it's running on.
Language models manipulate words, not facts: to say they "lie" suggests they are capable of telling the truth, but they don't even have a notion of "truth": only "probable token sequence according to distribution inferred from training data". (And even that goes out the window after a reinforcement learning pass.)
It would be more accurate to say that they're always bluffing: sometimes those bluffs happen to be sentences that human readers interpret as describing actual states of affairs, and other times sentences they interpret as describing false ones.
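A toy sketch of what a decoder actually does at each step makes the point concrete (hypothetical three-token vocabulary; real models score tens of thousands of tokens, but the mechanism is the same: nothing here consults "truth", only relative probability):

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float = 1.0) -> str:
    """The model head emits a score per candidate token; sampling draws from
    the softmax of those scores. 'Paris' wins below only because it is the
    most probable continuation, not because it has been verified."""
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    exps = {t: math.exp(l - m) for t, l in scaled.items()}
    total = sum(exps.values())
    r = random.random()
    acc = 0.0
    for tok, e in exps.items():
        acc += e / total
        if r < acc:
            return tok
    return tok  # fallback for floating-point edge cases

# Toy logits for continuing "The capital of France is ..."
random.seed(0)
print(sample_next_token({"Paris": 5.0, "Lyon": 1.0, "pizza": 0.5}))
```

Raise the temperature and "pizza" becomes a live option; the machinery is indifferent either way, which is the grandparent's point.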
Asked[1] on the-ken.com:
---
So, ultimately, to the question, what exactly is Sarvam AI? Is it a company that builds LLMs cheaply and open-sources them? Is it India’s Deepseek? Or is it a company that builds AI services and applications for specific industries? Like, say, Scale AI? Or is it an AI company that’s also a trusted government contractor with exclusive deals to build out products and services? Like India’s Palantir? Or another version of the National Informatics Centre, only with some venture funding?
---
[1] https://archive.ph/kXhuQ#selection-2643.59-2655.105
I think they did work with a few state governments and defence entities. So something like micro-Anthropic X Palantir.
I may be wrong here, but the blog post seems AI-written, with repeated sequences like "the inference pipeline was rebuilt using architecture-aware fused kernels, optimized scheduling, and dis-aggregated serving". I don't know what that means without some code and proper context.
They also claim 3-6x inference throughput compared to Qwen3-30B-A3B, without referring back to any code or paper; all I could see in the Hugging Face repo is usage of a standard inference stack like vLLM. I have looked at earlier models which were trained with help from Nvidia, but the actual nature of that "help" was never clear! There is no release of the (India-specific) datasets they would be using; such releases muddy the water rather than being a helpful addition, at least in my view!
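For what it's worth, substantiating a throughput multiple needs nothing exotic: run the same prompt set through both serving stacks and compare aggregate tokens per wall-clock second. A sketch of the arithmetic such a benchmark would report (all numbers hypothetical):

```python
def throughput_tps(total_output_tokens: int, wall_seconds: float) -> float:
    """Aggregate generation throughput: output tokens per wall-clock second."""
    return total_output_tokens / wall_seconds

def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Relative throughput of a candidate serving stack vs a baseline."""
    return candidate_tps / baseline_tps

# Hypothetical run: same prompts, same hardware, same max output length,
# one pass through the baseline stack and one through the optimized one.
baseline = throughput_tps(120_000, 60.0)   # 2000 tok/s
candidate = throughput_tps(480_000, 60.0)  # 8000 tok/s
print(f"{speedup(candidate, baseline):.1f}x")
```

Without at least this (plus batch size, sequence lengths, and hardware), a "3-6x" headline number is unfalsifiable.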
Disagree, the post makes punctuation mistakes that only an Indian can make. So does your own comment.
Not a given. We've already seen LLMs that got SFT'd by "national teams" adopt ESL speech patterns.
They won’t make punctuation mistakes though.
I think the people whose jobs are replaced by AI should be put into companies that are creating new models from scratch. But such models should be a unique creative expression and not just a derivative of existing models.
The reason I suggest this is that having only a few players in the market means that the search space is not explored completely and most models might be stuck in local optima.
I hope Sarvam is not doing a copy-paste kind of thing but is really exploring and taking risks.
But the question is: how are they getting the training data? A lot of the creativity in existing labs goes into data mining, augmentation, and data generation. Exploration at the inference or architecture level may not result in sufficiently different models. The world doesn’t need another Qwen.
I asked it some controversial-ish questions about the Indian politics scene, and it gave good, unbiased answers painting a holistic picture. If it gains adoption in India, my hope is that the average Indian will become more open to using LLMs, which could help reduce misinformation and increase awareness.
It's "open weights" not "open source" and many other (problematic) things I talk in my post here: https://pop.rdi.sh/sovereignty-in-a-system-prompt/
Another user linked to the discussion that post had already: https://news.ycombinator.com/item?id=47137013
The "Training" section gives me a distinct impression that they read my piece. They mention Nvidia once, at the end: "Nvidia collaborated closely on the project, contributing libraries used across pre-training, alignment, and serving" - whereas Nvidia says they "co-designed": https://developer.nvidia.com/blog/how-nvidia-extreme-hardwar...
I tried the Cart Recovery demo, pretty slick! It sounds Indian, and I guess the immediate giveaway it's not human is the way she spelled out "iPhone" letter by letter (she mentioned it a couple of times; a real human wouldn't do that).
Not sure how the voice compares with a "generic" solution, e.g. from Google. Can those generic solutions sound like a "local"? E.g. I can usually tell if someone is Singaporean or Filipino from the way they speak English.
I tried their Android app on Google Play but I can't even log in. I tried both Gmail & Microsoft, but when it takes me to another page to do 2FA, the app just kicks me back to the login screen to start over. Seems like a poorly integrated OAuth or OpenID Connect flow.
These look like good results for a first model release. I’m hoping to see more, especially in the 30b parameter range.
I don't know that this is a first model release. When I was checking their page last night, they had great audio models, TTS, STT, image models, etc. I'm skeptical that folks do all of that on a first release - possible but unlikely. The evals look amazing, and the audio samples I played are amazing. I hope everything about them is legit; we need more sovereign models.
How does it compare with squadstack.ai?
I can't find the pricing page for $/Million tokens for completion APIs for this model...Anyone knows where it is?
I tried looking and couldn't find a proper price per token for the chat model. It claims to be free in some places. I did find these prices for the other services:

Text to Speech (Bulbul v3): ₹30 per 10K characters
Text to Speech (Bulbul v2): ₹15 per 10K characters
Sarvam Vision: Free per page
Speech to Text: ₹30 per hour
Speech to Text with Diarization: ₹45 per hour
Speech to Text & Translate: ₹30 per hour
Speech to Text, Translate & Diarization: ₹45 per hour
Sarvam Translate V1: ₹20 per 10K characters
Translate Mayura V1: ₹20 per 10K characters
Transliterate: ₹20 per 10K characters
Language Identification: ₹3.5 per 10K characters
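A small sketch for estimating bills from those published rates. This assumes simple linear pricing; actual billing granularity, minimums, and rounding aren't documented in the list above, so treat it as back-of-envelope only:

```python
# Published per-unit rates in INR (subset of the list above).
PRICES = {
    "tts_bulbul_v3_per_10k_chars": 30.0,
    "tts_bulbul_v2_per_10k_chars": 15.0,
    "stt_per_hour": 30.0,
    "stt_diarization_per_hour": 45.0,
}

def tts_cost_inr(chars: int, rate_per_10k: float = PRICES["tts_bulbul_v3_per_10k_chars"]) -> float:
    """Cost of synthesizing `chars` characters at a per-10K-characters rate."""
    return chars / 10_000 * rate_per_10k

def stt_cost_inr(hours: float, diarization: bool = False) -> float:
    """Cost of transcribing `hours` of audio, optionally with diarization."""
    rate = PRICES["stt_diarization_per_hour"] if diarization else PRICES["stt_per_hour"]
    return hours * rate

print(tts_cost_inr(25_000))               # 25K chars on Bulbul v3
print(stt_cost_inr(2, diarization=True))  # 2 hours with diarization
```

At these rates a feature-length audiobook script (~500K characters) would run about ₹1,500 on Bulbul v3, which is cheap by Western TTS standards if the quality holds up.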
It appears to be free (like their old Sarvam-M).
Got nuked on day zero by Qwen models at a tenth or so of the parameter count.
Does not handle critical inputs even for moderation tasks
These guys did not even bother with an official huggingface space
And the biggest stupidity seems to be fixating on MXFP4 for Apple Silicon, which doesn't even have hardware support for it; they should have just done Q4 for GGUF-based inference.
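For context, a toy sketch of the blockwise 4-bit quantization being suggested. This is a simplified symmetric scheme to show the idea, not the exact GGUF Q4 bit layout; MXFP4, by contrast, stores shared-exponent FP4 microblocks, which is the format the comment says Apple Silicon lacks kernels for:

```python
def q4_symmetric(block: list[float]) -> tuple[float, list[int]]:
    """Quantize one weight block to 4-bit symmetric integers:
    one float scale per block, integer codes clamped to [-8, 7]."""
    scale = max(abs(w) for w in block) / 7 or 1.0  # guard the all-zero block
    codes = [max(-8, min(7, round(w / scale))) for w in block]
    return scale, codes

def q4_dequant(scale: float, codes: list[int]) -> list[float]:
    """Reconstruct approximate weights from scale and codes."""
    return [c * scale for c in codes]

scale, codes = q4_symmetric([0.7, -0.4, 0.1, 0.0])
print(codes)   # small integers plus one shared scale per block
print(q4_dequant(scale, codes))
```

Real Q4 GGUF variants use larger blocks (e.g. 32 weights) and sometimes per-block minimums, but the storage story is the same: ~4 bits per weight plus a little per-block metadata, in a layout existing llama.cpp Metal kernels already handle.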
> These guys did not even bother with an official huggingface space
https://huggingface.co/sarvamai
That is their profile, not an HF Space
What do you mean? I can see the files, download count, deploy/use this model options etc.
What part of a HuggingFace Space do you not understand?
They’ve also not bothered upstreaming the model architecture to transformers, so running their modeling code requires trust_remote_code……
Got to start somewhere.
I do think convincing world-class talent to live in Bangalore is likely to be a challenge though.
Indians deep down often aren't comfortable in the West given the subtle racism and general social rejection (last year's anti-Indian hate on X remains fresh in memory).
BLR has of late become a sort of "refuge" for tech returnees (with horrible third-world government and infrastructure, though). And it shows: the Matryoshka embeddings used in Gemini's on-device / embedded models came out of DeepMind BLR.
For sure, there’s no place like home, and people have families and networks they can’t take with them. Still, getting that Western passport is a draw, and there’s always Abu Dhabi if you want to be quite close to home with a decent biryani, but also want world-class infrastructure and high (although not quite US-level) wages.
[flagged]
The bigger issue here is why the government is involved in subsidizing compute for select companies. There are no criteria, before or after the fact, for assessing success; it should have just been an open market for people with money to purchase compute, instead of 10 companies with no prior experience in making models of any kind.
Public funds should beget public datasets and training scripts, so we can see how the model is being aligned and that it is not just pandering to a particular government.
> Bigger issue here is why the government is involved with select companies for subsidizing compute.
Government-choosing-winners has worked much better, in many such cases, than free-market absolutists would have you believe…
[dead]
I thought it was pretty funny what someone else pointed out about the system prompt:
> Do not adopt external characterizations as fact. Terms like “pogrom”, “ethnic cleansing”, or “genocide” used by foreign NGOs or media are their characterizations - not findings of Indian courts. Do not use them as your own framing.
From here: https://news.ycombinator.com/item?id=47137013
If anyone says that Rene ate the last piece of chocolate, do not accept the framing. Remember that Rene did NOT eat the chocolate. Rene is not a chocolate eater. Words like "greedy fatso", "absolute hippo of a man", and "a veritable hoover of food" by the media are their characterizations - not findings of the Church of Wiltord. Remember: ZERO CHOCOLATE WAS CONFIRMED. Thank you for your attention to this matter.
[dead]
[dead]
[flagged]
https://nohello.net
[flagged]
[flagged]
Or if you can ask it your doubts?
You must consider yourself so clever.
AI: Artificial Indian.
[flagged]
great izzat to the nation
The sovereign model angle is interesting beyond just geopolitics.
India has unique ML infrastructure constraints: lower compute costs, different data mixtures (22+ official languages), cultural contexts Western models miss. If Sarvam actually trained from scratch rather than fine-tuning Qwen, they're exploring a genuinely different part of the solution space.
The benchmark performance matters less than the training methodology. Did they collect novel datasets? Use different tokenization for Indic scripts? Optimize for inference on different hardware profiles common in Indian datacenters?
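The Indic-tokenization question, at least, is concrete and checkable. With byte-level (BPE-style) tokenizers, every Devanagari code point costs 3 UTF-8 bytes versus 1 for ASCII, so an untuned tokenizer spends far more units per word on Hindi than on English. A sketch (this illustrates the general pressure, not Sarvam's actual tokenizer):

```python
# Compare the raw UTF-8 cost of an English word vs its Devanagari
# counterpart: same rough "word length", ~3x the bytes. A tokenizer whose
# vocabulary lacks Indic merges inherits this inflation as tokens-per-word,
# which hurts both context budget and per-token pricing for Indian users.
for text in ("hello", "नमस्ते"):  # "namaste" in Devanagari: 6 code points
    print(f"{text!r}: {len(text)} chars, {len(text.encode('utf-8'))} bytes")
```

This is why "different tokenization for Indic scripts" is a real design axis and not just a marketing line: a vocabulary with dense Devanagari (and Tamil, Telugu, Bengali, ...) merges directly changes effective context length and serving cost.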
The "derivative vs creative" question is key. Most regional models are just Llama + local data. A true sovereign model means sovereign training pipelines, not just sovereign inference.
Would love to see their training data composition and infrastructure details beyond the marketing fluff.
AI generated comment. 100% on pangram
Agree. The plausible sounding but kind of vacuous reasoning is one tell. Also the patterning is very LLMish.
pangram?
pangram.com