I Stopped Pretending Every AI Provider Was the Same
The Trap of Surface-Level Compatibility
The easiest way to make an AI gateway feel flaky is to pretend every upstream model works the same way. On paper, a lot of tools look compatible. They all take a prompt. They all return text. Some of them even share an OpenAI-shaped API.
In practice, the differences show up exactly where users stop forgiving you:
- a tool-specific field gets dropped
- an image payload works on one route and breaks on another
- a model switch silently changes behavior
- the request succeeds, but the wrong capability set was assumed
That was one of the most useful lessons I learned while building CliGate, my local control plane that puts Claude Code, Codex CLI, Gemini CLI, OpenClaw, a resident assistant, and multiple model/account sources behind one localhost entrypoint.
The bug was not "routing failed." The bug was subtler than that. Routing often did succeed. A request got sent somewhere. A response came back. Nothing obviously crashed. But that did not mean the gateway was correct.
Capability Routing vs. Transport Routing
If you route different tools and providers as if they were interchangeable, you get a class of failures that are hard to spot from logs alone:
- Claude-style payloads that need translation, not passthrough
- Codex-compatible flows that should degrade unsupported fields instead of forwarding them blindly
- Gemini paths that need their own capability assumptions
- local or fallback routes that are reachable but not feature-equivalent
That is not just transport routing. That is capability routing. I had to separate "where to send it" from "what this destination can really do."
At first, it is tempting to think routing is just: pick provider -> send request. That model is too small.
What actually mattered in CliGate was closer to this:
- identify caller/tool
- identify protocol shape
- resolve provider/model source
- apply capability profile
- translate or degrade fields safely
- send upstream
A provider being reachable is not enough. It also needs to be treated according to the features it really supports.
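The steps above can be sketched as a single planning function. This is a minimal Python illustration, not CliGate's actual code: the lookup tables, source names, and the `RoutePlan` type are all hypothetical, and a real gateway would handle unknown callers and paths instead of raising `KeyError`.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables for illustration; not CliGate's real configuration.
PROTOCOL_BY_PATH = {"/v1/messages": "anthropic", "/v1/chat/completions": "openai"}
SOURCE_BY_CALLER = {"claude-code": "anthropic-account", "codex": "openai-key"}
SUPPORTED_FIELDS = {
    "anthropic-account": {"model", "messages", "tools", "thinking"},
    "openai-key": {"model", "messages", "tools"},
}


@dataclass
class RoutePlan:
    caller: str
    protocol: str
    source: str
    body: dict
    dropped: list = field(default_factory=list)


def plan_route(caller: str, path: str, body: dict) -> RoutePlan:
    """Decide where a request goes and what it may carry, before sending it."""
    protocol = PROTOCOL_BY_PATH[path]    # identify protocol shape
    source = SOURCE_BY_CALLER[caller]    # resolve provider/model source
    allowed = SUPPORTED_FIELDS[source]   # apply capability profile
    # Degrade on purpose: keep supported fields, record the ones removed.
    kept = {k: v for k, v in body.items() if k in allowed}
    dropped = sorted(k for k in body if k not in allowed)
    return RoutePlan(caller, protocol, source, kept, dropped)  # ready to send upstream
```

The point of returning a plan instead of sending immediately is that "where to send it" and "what this destination can really do" become two visible fields rather than one implicit guess.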
Translation as Part of Routing
One of the more useful internal lessons in this project is that protocol translation is not a separate cleanup step after routing. It is part of routing. Some paths can accept a richer request shape. Some need fields normalized or stripped before the request becomes a silent bug.
That changed the safe mental model from: "upstream did not complain, so the route must be fine" to: "this route supports a specific capability profile, so normalize on purpose." That sounds small, but it prevents a lot of "works sometimes" behavior.
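To make "normalize on purpose" concrete, here is a small Python sketch of a capability profile that says, field by field, whether a route keeps, renames, or drops it. The profile contents and field names are assumptions chosen for illustration, not any provider's documented schema.

```python
# Hypothetical capability profile for one route: field name -> rule.
PROFILE = {
    "model": "keep",
    "messages": "keep",
    "max_tokens": ("rename", "max_output_tokens"),
    "reasoning": "drop",
}


def normalize(body: dict, profile: dict):
    """Return the normalized body plus notes describing every change."""
    out, notes = {}, []
    for key, value in body.items():
        # Fields the profile does not mention are dropped, never forwarded blindly.
        rule = profile.get(key, "drop")
        if rule == "keep":
            out[key] = value
        elif isinstance(rule, tuple) and rule[0] == "rename":
            out[rule[1]] = value
            notes.append(f"renamed {key} -> {rule[1]}")
        else:
            notes.append(f"dropped {key}")
    return out, notes
```

Because the default rule is "drop", a field nobody thought about gets removed and noted instead of reaching an upstream that may silently ignore it.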
"Compatible API" Is Not Compatible Behavior
This is the trap. Lots of systems advertise compatibility because they accept a familiar endpoint shape. But compatibility at the HTTP layer is only the beginning.
If one tool expects richer reasoning or metadata semantics and another backend treats those fields differently, the gateway has three choices, and the first two are bad:
- pass everything through and let undefined behavior happen
- reject too aggressively and feel brittle
- normalize by capability and keep behavior predictable
Only the third one scales. That is why I now prefer capability-aware routing over a universal passthrough design.
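The three options are easy to compare side by side. The following Python sketch assumes a hypothetical backend that only understands `model` and `messages`; the function names are illustrative.

```python
SUPPORTED = {"model", "messages"}  # hypothetical capability set for one backend


def passthrough(body: dict) -> dict:
    # Forwards everything; unsupported fields become the upstream's problem.
    return body


def strict(body: dict) -> dict:
    # Rejects any request carrying a field the backend cannot honor.
    unknown = set(body) - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    return body


def normalize(body: dict) -> dict:
    # Keeps only what the backend supports, so behavior stays predictable.
    return {k: v for k, v in body.items() if k in SUPPORTED}
```

With a body that carries an extra `metadata` field, `passthrough` forwards it untouched, `strict` raises, and `normalize` returns a request the backend can fully honor.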
Caller Identity Matters More Than I Expected
claude-code, codex, gemini-cli, openclaw, and generic OpenAI/Anthropic-compatible clients may hit similar-looking routes, but they are not interchangeable from an operator's perspective.
The user is often really asking for one of these:
- keep Claude Code on the provider/model path that fits Claude-style flows
- bind Codex to a specific account or key
- let Gemini use its own capability profile
- fall back safely when a source is unavailable
That is why app-aware routing and capability-aware translation ended up being complementary, not separate concerns. One decides who this request is for. The other decides how to make it truthful on the way through.
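One way to treat caller identity as a routing input is a binding table keyed by app. The Python sketch below is an assumption-heavy illustration: the `x-gateway-app` header, the source names, and the profile names are all hypothetical, and a real gateway might identify callers by route, token, or user agent instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    source: str   # which account or key this app is pinned to
    profile: str  # which capability profile to translate with


# Hypothetical bindings; names are placeholders, not real configuration.
BINDINGS = {
    "claude-code": Binding("anthropic-account-1", "claude"),
    "codex": Binding("openai-key-work", "codex"),
    "gemini-cli": Binding("gemini-account", "gemini"),
}
DEFAULT = Binding("fallback-pool", "generic")


def resolve(headers: dict) -> Binding:
    """Pick a binding from a hypothetical caller-identity header."""
    caller = headers.get("x-gateway-app", "").lower()
    return BINDINGS.get(caller, DEFAULT)
```

The binding answers "who is this request for", and the profile name it carries is what the translation step uses to answer "how to make it truthful on the way through".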
Degrade Intentionally, Never Accidentally
The worst failures are the accidental ones. If a gateway quietly forwards a field that the destination ignores, the user may never know why results became inconsistent.
So I started preferring explicit degradation rules:
- If a route cannot honor a field, normalize it on purpose.
- If a provider cannot match a capability, map it honestly.
- If a model source is rate-limited or invalid, skip it instead of pretending all active-looking credentials are equal.
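The last rule, skipping sources that only look active, can be sketched as a selection function that also remembers why it skipped each candidate. The `Source` fields here are assumptions for illustration, not CliGate's data model.

```python
import time
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    valid: bool = True
    rate_limited_until: float = 0.0  # epoch seconds; 0 means not limited


def pick_source(sources, now=None):
    """Return the first usable source and the reasons earlier ones were skipped."""
    now = time.time() if now is None else now
    skipped = []
    for src in sources:
        if not src.valid:
            skipped.append((src.name, "invalid credential"))
        elif src.rate_limited_until > now:
            skipped.append((src.name, "rate-limited"))
        else:
            return src, skipped
    return None, skipped  # nothing usable; the caller decides how to fail
```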
That gives me a much better operator story:
- why this source was chosen
- which capability profile was applied
- which fields were transformed or removed
- why a different route would behave differently
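That operator story can be captured as one structured log line per request. This is a hedged Python sketch; the record's field names are hypothetical and only meant to mirror the list above.

```python
import json


def decision_record(caller, source, reason, profile, transformed, removed):
    """Serialize one routing decision as a JSON log line."""
    return json.dumps(
        {
            "caller": caller,
            "source": source,
            "source_reason": reason,          # why this source was chosen
            "capability_profile": profile,    # which profile was applied
            "fields_transformed": transformed,
            "fields_removed": removed,
        },
        sort_keys=True,
    )
```

Comparing two such records for the same caller shows at a glance why one route behaved differently from another: a different source, a different profile, or a different set of removed fields.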
Reliability Improved When I Stopped Chasing "Perfect Abstraction"
A good gateway should hide repetitive setup work. It should not lie about capability differences. Once I accepted that, the architecture became cleaner:
- route by app and protocol
- map by provider/model source
- translate by capability profile
- expose the differences clearly in logs and settings
That is less magical, but much more dependable.
The Rules I Would Keep
If I were designing another AI gateway tomorrow, I would keep these rules:
- do not equate API shape with feature equivalence
- make caller identity a first-class routing input
- treat translation as part of routing
- degrade unsupported fields deliberately
- expose capability decisions so operators can explain failures
That is the direction I have been pushing with CliGate. The project still aims to give me one local place for model routing, accounts, API keys, local runtimes, channels, runtime sessions, and an assistant layer. But the system became much more trustworthy once I stopped pretending every upstream provider was the same.
If you run multiple AI tools through one gateway, are you doing plain endpoint routing, or routing by actual capability too?