DEV Community 4h ago

"I Stopped Pretending Every AI Provider Was the Same"

The Trap of Surface-Level Compatibility

The easiest way to make an AI gateway feel flaky is to pretend every upstream model works the same way. On paper, a lot of tools look compatible. They all take a prompt. They all return text. Some of them even share an OpenAI-shaped API.

In practice, the differences show up exactly where users stop forgiving you:

a tool-specific field gets dropped
an image payload works on one route and breaks on another
a model switch silently changes behavior
the request succeeds, but the wrong capability set was assumed

That was one of the most useful lessons while building CliGate, my local control plane for Claude Code, Codex CLI, Gemini CLI, OpenClaw, a resident assistant, and multiple model/account sources behind one localhost entrypoint.

The bug was not "routing failed." The bug was subtler than that. Routing often did succeed. A request got sent somewhere. A response came back. Nothing obviously crashed. But that did not mean the gateway was correct.

Capability Routing vs. Transport Routing

If you route different tools and providers as if they were interchangeable, you get a class of failures that are hard to spot from logs alone:

Claude-style payloads that need translation, not passthrough
Codex-compatible flows that should degrade unsupported fields instead of forwarding them blindly
Gemini paths that need their own capability assumptions
local or fallback routes that are reachable but not feature-equivalent

That is not just transport routing. That is capability routing. I had to separate "where to send it" from "what this destination can really do."

At first, it is tempting to think routing is just: pick provider -> send request. That model is too small.

What actually mattered in CliGate was closer to this:

identify caller/tool
identify protocol shape
resolve provider/model source
apply capability profile
translate or degrade fields safely
send upstream

A provider being reachable is not enough. It also needs to be treated according to the features it really supports.

Translation as Part of Routing

One of the more useful internal lessons in this project is that protocol translation is not a separate cleanup step after routing. It is part of routing. Some paths can accept a richer request shape. Some need fields normalized or stripped before the request becomes a silent bug.

That changed the safe mental model from: "upstream did not complain, so the route must be fine" to: "this route supports a specific capability profile, so normalize on purpose." That sounds small, but it prevents a lot of "works sometimes" behavior.

"Compatible API" Is Not Compatible Behavior

This is the trap. Lots of systems advertise compatibility because they accept a familiar endpoint shape. But compatibility at the HTTP layer is only the beginning.

If one tool expects richer reasoning or metadata semantics and another backend treats those fields differently, the gateway has three bad choices:

pass everything through and let undefined behavior happen
reject too aggressively and feel brittle
normalize by capability and keep behavior predictable

Only the third one scales. That is why I now prefer capability-aware routing over a universal passthrough design.

Caller Identity Matters More Than I Expected

claude-code, codex, gemini-cli, openclaw, and generic OpenAI/Anthropic-compatible clients may hit similar-looking routes, but they are not interchangeable from an operator's perspective.

The user is often really asking for one of these:

keep Claude Code on the provider/model path that fits Claude-style flows
bind Codex to a specific account or key
let Gemini use its own capability profile
fall back safely when a source is unavailable

That is why app-aware routing and capability-aware translation ended up being complementary, not separate concerns. One decides who this request is for. The other decides how to make it truthful on the way through.

Degrade Intentionally, Never Accidentally

The worst failures are the accidental ones. If a gateway quietly forwards a field that the destination ignores, the user may never know why results became inconsistent.

So I started preferring explicit degradation rules:

If a route cannot honor a field, normalize it on purpose.
If a provider cannot match a capability, map it honestly.
If a model source is rate-limited or invalid, skip it instead of pretending all active-looking credentials are equal.

That gives me a much better operator story:

why this source was chosen
which capability profile was applied
which fields were transformed or removed
why a different route would behave differently

Reliability Improved When I Stopped Chasing "Perfect Abstraction"

A good gateway should hide repetitive setup work. It should not lie about capability differences. Once I accepted that, the architecture became cleaner:

route by app and protocol
map by provider/model source
translate by capability profile
expose the differences clearly in logs and settings

That is less magical, but much more dependable.

The Rules I Would Keep

If I were designing another AI gateway tomorrow, I would keep these rules:

do not equate API shape with feature equivalence
make caller identity a first-class routing input
treat translation as part of routing
degrade unsupported fields deliberately
expose capability decisions so operators can explain failures

That is the direction I have been pushing with CliGate. The project still aims to give me one local place for model routing, accounts, API keys, local runtimes, channels, runtime sessions, and an assistant layer. But the system became much more trustworthy once I stopped pretending every upstream provider was the same.

If you run multiple AI tools through one gateway, are you doing plain endpoint routing, or routing by actual capability too?

Read on DEV Community ↗ ← Back to News