Show HN: MCP server gives your agent a budget (save tokens, get smarter results)
As a consultant I foot my own Cursor bills, and last month's came to $1,263. Opus is too good not to use, but there's no way to cap spending per session. After blowing through my Ultra limit, I realized how token-hungry Cursor + Opus really is. It spins up sub-agents, balloons the context window, and suddenly a task I expected to cost $2 comes back at $8. My bill kept going up, but was I really going to switch to a worse model?
No. So I built l6e: an MCP server that gives your agent the ability to budget. It works with Cursor, Claude Code, Windsurf, Openclaw, and every MCP-compatible application.
Saving money was why I built it, but what surprised me was that budgeting changed the agent's behavior. An agent that understands its resource limits doesn't speculatively stuff extra files into the context window. It doesn't reach for every possible API. It plans ahead, sticks to the plan, and stops work when it should.
It works, and we've been dogfooding it hard: after v1 shipped, the rest of l6e was built with it. We launched the entire docs site using frontier models for $0.99. The kicker was that every time l6e broke in development, I could feel the pain: the agent got sloppy, burned through context, and output quality dropped right along with it.
Install: pip install l6e-mcp
Docs: https://docs.l6e.ai
GitHub: https://github.com/l6e-ai/l6e-mcp
Website: https://l6e.ai
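For anyone new to MCP: in Cursor, a pip-installed MCP server is typically registered in `.cursor/mcp.json` using the standard `mcpServers` shape. The entry below is only a sketch of that shape; the `command`/`args` values are placeholders, not l6e's actual launch command (see the docs for the real values):

```json
{
  "mcpServers": {
    "l6e": {
      "command": "python",
      "args": ["-m", "l6e_mcp"]
    }
  }
}
```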
Happy to answer questions about the system design, calibration models, or why I can't go back to coding without it.
Great idea. I do like the concept of giving the LLM more information and context so it can decide which approach is better. But why did you implement it as an MCP server and not as a proxy, so it has the full context?
Great question! We initially started with a proxy, and plan to keep supporting it for users that already have something like LiteLLM set up. However, we chose the MCP route (and soon a plugin route) because it was a much lower lift to install: setting up a proxy and configuring it, especially for Cursor, can be tricky. There is actually still some code in the MCP server for a LiteLLM integration, which we hope to support officially soon.
The other big reason is that with only a proxy, and no MCP, in a client like Cursor, the agent doesn't reason about the budget. When it has to use the budget as a tool, it actively thinks about the cost of its actions.
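For intuition, the per-session state such a budget tool might expose can be sketched in a few lines of Python. This is illustrative only (class names, method names, and prices are made up, not l6e's actual API): the point is that the agent can query remaining budget and gets an explicit "wrap up" signal when it's spent.

```python
class SessionBudget:
    """Toy per-session budget tracker (hypothetical, not l6e's real API)."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, input_tokens: int, output_tokens: int,
               in_price: float, out_price: float) -> float:
        """Charge one model call; prices are USD per million tokens."""
        cost = (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price
        self.spent_usd += cost
        return cost

    def remaining(self) -> float:
        return self.cap_usd - self.spent_usd

    def status(self) -> str:
        """What a budget tool call could return for the agent to reason over."""
        if self.remaining() <= 0:
            return "Budget exhausted: wrap up and summarize."
        return f"${self.remaining():.2f} of ${self.cap_usd:.2f} remaining"
```

Usage: with a $2.00 cap, one call of 50k input / 8k output tokens at $15/$75 per million costs $1.35 and leaves "$0.65 of $2.00 remaining" for the agent to plan against.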
Interesting, so you're saying the quality increased as well because it has less context bloat to deal with?
That has been our finding: the reduced bloat seems to let us fit larger sessions in a single context window. It certainly needs more widespread testing though, which is why we're excited to get more feedback.