We need LLM query routing at the OS level, like mobile data.
I know it will sound crazy, but hear me out. I think of AI inference as infrastructure. I do not want to pay for it separately in every app I use. I do not think "I have to pay for YouTube's mobile data, and WhatsApp's mobile data, etc." I pay for mobile data as infrastructure and let my device route it appropriately. In fact, if we ever go the local LLM route, you could have LLM capabilities without any access to the internet (or a LAN), and your OS/computer is the only thing capable of doing that routing for you.
Slight tangent, but “Wayfinder sits behind whatever OpenAI-compatible client you already use” reminds me that descriptions of where proxies sit in the information flow always seem so arbitrary to me:
- “after the client”
- “reverse proxy” (in front of servers)
- “proxy” (in front of client)
I always have to look this up. Surely there must be a standardized way to describe this?
"after the client" and "in front of client" can mean the same thing depending on your viewpoint.
It's funny how much that first paragraph is Claude's voice. I don't know how it got trained so hard to use "the shape of" for everything.
Loads of Ed Sheeran in the training data?
Love to see local/cloud routing explicitly supported.
I'm building another router for routing between local and remote models, Show HN coming up later today. Here's a sneak preview of the GitHub repo: https://github.com/try-works/role-model
Can you send the same prompt to multiple LLMs to compare responses? From that, you could create a heuristic for which LLM gets what.
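A rough sketch of what that heuristic could look like, assuming something else (a human vote, a judge model) already decides which response won each comparison. The function names, category labels, and model names are all made up for illustration and are not taken from Wayfinder or role-model:

```python
from collections import Counter, defaultdict

# Hypothetical sketch: remember which model won side-by-side comparisons
# per task category, then route new prompts to the most frequent winner.
# How a winner is chosen is assumed to happen elsewhere.
wins = defaultdict(Counter)


def record_winner(category: str, model: str) -> None:
    """Note that `model` gave the preferred answer for this category."""
    wins[category][model] += 1


def pick_model(category: str, fallback: str) -> str:
    """Return the model with the most wins, or `fallback` if none recorded."""
    tally = wins.get(category)
    if not tally:
        return fallback
    return tally.most_common(1)[0][0]


# Illustrative data only; the model names are placeholders.
record_winner("typo-fix", "local-small")
record_winner("typo-fix", "local-small")
record_winner("typo-fix", "cloud-large")
print(pick_model("typo-fix", fallback="cloud-large"))
print(pick_model("contract-review", fallback="cloud-large"))
```

A plain win count is the simplest possible version; a real router would probably also want to weigh cost and latency, which this sketch ignores.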
It'd be nice to just have a command prefix e.g.
/local fix my typo
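For what it's worth, a prefix like that only takes a few lines to parse. A minimal sketch, assuming two made-up backend names (`local`, `cloud`) and a cloud default; this is not how any particular tool implements it:

```python
# Hypothetical sketch: choose a backend from an optional "/name" prefix.
# The backend names and the default are illustrative assumptions.
KNOWN_BACKENDS = {"local", "cloud"}
DEFAULT_BACKEND = "cloud"


def split_prefix(prompt: str) -> tuple[str, str]:
    """Return (backend, prompt_without_prefix)."""
    stripped = prompt.lstrip()
    if stripped.startswith("/"):
        head, _, rest = stripped.partition(" ")
        name = head[1:].lower()
        if name in KNOWN_BACKENDS:
            return name, rest.strip()
    # No recognised prefix: use the default and leave the prompt untouched.
    return DEFAULT_BACKEND, prompt


print(split_prefix("/local fix my typo"))  # ('local', 'fix my typo')
```

Unknown prefixes fall through to the default on purpose, so a prompt that happens to start with a slash (a file path, say) is passed along unchanged.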
That’s what I did with Pi, super simple :)
This is the way!
I like to think so!