Show HN: GoModel – an open-source AI gateway in Go

github.com

111 points by santiago-pl 4 hours ago

Hi, I’m Jakub, a solo founder based in Warsaw.

I’ve been building GoModel since December with a couple of contributors. It's an open-source AI gateway that sits between your app and model providers like OpenAI, Anthropic or others.

I built it for my startup to solve a few problems:

  - track AI usage and cost per client or team
  - switch models without changing app code
  - debug request flows more easily
  - reduce AI spendings with exact and semantic caching

How is it different?

  - ~17MB docker image
    - LiteLLM's image is more than 44x bigger ("docker.litellm.ai/berriai/litellm:latest" ~ 746 MB on amd64)
  - request workflow is visible and easy to inspect    
  - config is environment-variable-first by default

I'm posting now partly because of the recent LiteLLM supply-chain attack. Their team handled it impressively well, but some people are looking at alternatives anyway, and GoModel is one.

Website: https://gomodel.enterpilot.io

Any feedback is appreciated.

nzoschke 1 hour ago

Looks nice, thanks for open sourcing and sharing.

I'm all in on Go and integrating AI up and down our systems for https://housecat.com/ and am currently familiar and happy with:

https://github.com/boldsoftware/shelley -- full Go-based coding agent with LLM gateway.

https://github.com/maragudk/gai -- provides Go interfaces around Anthropic / OpenAI / Google.

Adding this to the list as well as bifrost to look into.

Any other Go-based AI / LLM tools folks are happy with?

I'll second the request to add support for harnesses with subscriptions, specifically Claude Code, into the mix.

pizzafeelsright 1 hour ago

I have written and maintained AI proxies. They are not terribly complex except the inconsistent structure of input and output that changes on each model and provider release. I figure that if there is a not a < 24 hour turn around for new model integration the project is not properly maintained.

Governance is the biggest concern at this point - with proper logging, and integration to 3rd party services that provide inspection and DLP type threat mitigation.

crawdog 1 hour ago

I wrote a similar golang gateway, with the understanding that having solid API gateway features is important.

https://sbproxy.dev - engine is fully open source.

Another reason golang is interesting for the gateway is having clear control of the supply chain at compile time. Tools like LiteLLM the supply chain attacks can have more impact at runtime, where the compiled binary helps.

  • lackoftactics 1 hour ago

    Maybe worth showing on SHOW HN

    • crawdog 38 minutes ago

      Thanks I am finishing up some performance comparison work looking at rust vs golang and plan a deeper write up for that group. I hope to publish soon.

glerk 1 hour ago

This is awesome work, thanks for sharing!

How do you plan on keeping up with upstream changes from the API providers? I have implemented something similar, and the biggest issue I have faced with go is that providers don’t usually have sdk’s (compared to javascript and python), and there is work involved in staying up to date at each release.

  • lackoftactics 1 hour ago

    Almost impossible without backing from some VC like litellm

    • swyx 30 minutes ago

      ridiculous statement. most people dont need long tail.

      • lackoftactics 26 minutes ago

        I might be wrong, but if you go after 200 integrations keeping them on is substantial work for solo founder

        • swyx 24 minutes ago

          real people just need like 20 at most.

          • lackoftactics 17 minutes ago

            if that's the case and doesn't need crazy number of integrations, I agree with you 100%

  • vorticalbox 51 minutes ago

    Most APIs provide some sort of documentation. If it’s swagger you can just update the application from that.

mosselman 2 hours ago

Does this have a unified API? In playing around with some of these, including unified libraries to work with various providers, I've found you are, at some point, still forced to do provider-specific works for things such as setting temperatures, setting reasoning effort, setting tool choice modes, etc.

What I'd like is for a proxy or library to provide a truly unified API where it will really let me integrate once and then never have to bother with provider quirks myself.

Also, are you also planning on doing an open-source rug pull like so many projects out there, including litellm?

sowbug 1 hour ago

Are these kinds of libraries a temporary phenomenon? It strikes me as weird that providers haven't settled on a single API by now. Of course they aren't interested in making it easier for customers to switch away from them, but if a proprietary API was a critical part of your business plan, you probably weren't going to make it anyway.

(I'm asking only about the compatibility layer; the other tracking features would be useful even if there were only one cloud LLM API.)

  • harikb 1 hour ago

    The providers themselves can't keep this straight even within their own ecosystem. Plus everyone is running at a million miles/hour.

    For example `Claude code` used to set 2 specific beta headers with some version numbers for their Max subscription to be supported.

    Oauth tokens for Max plan is different from how their API keys looked. They kind of look similar, but has specific prefix that these tool pre-validate.

    It is barely working at this point even within a single provider

  • simonw 1 hour ago

    I've been maintaining an abstraction layer over multiple providers for a couple of years now - https://llm.datasette.io/

    The best effort we have to defining a standard is OpenAI harmony/responses - https://developers.openai.com/cookbook/articles/openai-harmo... - but it's not seen much pickup. The older OpenAI Chat Completions thing is much more of an ad-hoc standard - almost every provider ends up serving up a clone of that, albeit with frustrating differences because there's no formal spec to work against.

    The key problem is that providers are still inventing new stuff, so committing to a standard doesn't work for them because it may not cover the next set of features.

    2025 was particularly turbulent because everyone was adding reasoning mechanisms to their APIs in subtly different shapes. Tool calls and response schemas (which are confusingly not always the same thing) have also had a lot of variance - some providers allow for multiple tool calls in the same response, for example.

    My hunch is we'll need abstraction layers for quite a while longer, because the shape of these APIs is still too frothy to support a standard that everyone can get behind without restricting their options for future products too much.

pjmlp 3 hours ago

Expectable, given that LiteLLM seems to be implemented in Python.

However kudos for the project, we need more alternatives in compiled languages.

  • santiago-pl 3 hours ago

    Agree and thank you! Please let us know if you'd like to give it a try and if you miss any feature in GoModel.

  • goodkiwi 2 hours ago

    It’s also badly implemented - everything is a global import. Had to stop using it

driese 2 hours ago

Nice one! Let's say I'm serving local models via vllm (because ollama comes with huge performance hits), how would I implement that in gomodel?

  • devmor 1 hour ago

    This is way more interesting to me as well. I have projects that use small limited-purpose language models that run on local network servers and something like this project would be a lot simpler than manually configuring API clients for each model in each project.

    • santiago-pl 1 hour ago

      Thanks for raising it! Since vLLM has an OpenAI-compatible API, this should work for now:

        docker run --rm -p 8080:8080 \
          -e OPENAI_API_KEY="some-vllm-key-if-needed" \
          -e OPENAI_BASE_URL="http://host.docker.internal:11434/v1" \
          ...
          enterpilot/gomodel
      

      I'll add a more convenient way to configure it in the coming days.

Talderigi 3 hours ago

Curious how the semantic caching layer works.. are you embedding requests on the gateway side and doing a vector similarity lookup before proxying? And if so, how do you handle cache invalidation when the underlying model changes or gets updated?

  • giorgi_pro 3 hours ago

    Hey, contributor here. That's right, GoModel embeds requests and does vector similarity lookup before proxying. Regarding the cache invalidation, there is no "purging" involved – the model is part of the namespace (params_hash includes the LLM model, path, guardrails hash, etc). TTL takes care of the cleanup later.

indigodaddy 2 hours ago

Any plans for AI provider subscription compatibility? Eg ChatGPT, GH Copilot etc ? (Ala opencode)

  • santiago-pl 2 hours ago

    You are not the first person who has asked about it.

    It looks like a useful feature to have. Therefore, I'll dig into this topic more broadly over the next few days and let you know here whether, and possibly when, we plan to add it.

tahosin 3 hours ago

This is really useful. I've been building an AI platform (HOCKS AI) where I route different tasks to different providers — free OpenRouter models for chat/code gen, Gemini for vision tasks. The biggest pain point has been exactly what you describe: switching models without changing app code.

One thing I'd love to see is built-in cost tracking per model/route. When you're mixing free and paid models, knowing exactly where your spend goes is critical. Do you have plans for that in the dashboard?

  • santiago-pl 3 hours ago

    This comment looks like AI-generated.

    However IIUC what you're asking for - it's already in the dashboard! Check the Usage page.

rvz 2 hours ago

I don't see any significant advantage over mature routers like Bifrost.

Are there even any benchmarks?

  • lackoftactics 1 hour ago

    It’s a heavily vibe coded project with only proxy with terrible benchmarks design. Basically vibe coded benchmarks that lie through ignorance of mocked super fast endpoint without using full power of litellm in multiple processes.

    Other than that almost useless it’s faster when this will be io bound and not cpu bound.

    • eikenberry 20 minutes ago

      Which project are you talking about, GoModel or Bifrost?

      • lackoftactics 5 minutes ago

        GoModel. I see some red flags in the docs/benchmarks, but I could be wrong in my judgement here.

        What I noticed: the website shows a diagram of the litellm SDK communicating with the gateway proxy of GoModel, poor design of benchmarks, the scope of the project in readme vs. depth.

        I don't have professional experience in GoLang, so will not comment on quality of code.

        There are some genuinely good things about this project and the effort here, but with solid position of Bifrost sitting at a version above 1.0.0 and so many other initiatives in this space, it's a tough market.

anilgulecha 4 hours ago

how does this compare to bifrost - another golang router?

  • santiago-pl 3 hours ago

    First of all, GoModel doesn't have a separate private repository behind a paywall/license.

    It's more lightweight and simpler. The Bifrost docker image looks 4x larger, at least for now.

    IMO GoModel is more convenient for debugging and for seeing how your request flows through different layers of AI Gateways in the Audit Logs.

    • anilgulecha 3 hours ago

      That would be valuable if there's a commitment to never have a non-opensource offering under GoModel? If so, you can document it in the repo.

      • santiago-pl 3 hours ago

        I would love to keep it open source forever, but I can't promise that for now. I've written a whole doc page about it if you're curious: https://gomodel.enterpilot.io/docs/about/license

        • antonvs 30 minutes ago

          If your concern is someone selling GoModel as a service, you could add a license provision for that. Technically it'd no longer be open source, I think, but most people won't care.