Releases · Apr 22, 2026 · 2 min read

Introducing Apinizer 2026.04 — AI Gateway is here

The same gateway that runs your REST APIs now governs every LLM, MCP server, and agent-to-agent message — under the same audit and permission model.

Selin Demir

VP Product

After eighteen months of pilot deployments with banks, ministries, and AI-first teams, we are shipping AI Gateway as a first-class data plane in Apinizer 2026.04. Every Pro and Enterprise customer now gets the same governance surface for their REST APIs and their LLM, MCP, and agent-to-agent traffic.

Why we built it

Three patterns kept showing up in customer conversations.

First, AI traffic was already running. Engineering teams were calling OpenAI, Anthropic, and Bedrock from production. The platform team had no audit trail, no quota, no PII redaction, and no cost ceiling.

Second, every team building an agent was reinventing the same plumbing — provider routing, retry, prompt firewalls, cost attribution. Each implementation was slightly wrong.

Third, the operators we trust did not want a second gateway. They wanted the audit, encryption, and permission model they already had — applied to AI traffic.

The same MessageContext, the same policy pipeline, the same Repository.save audit — for REST and AI. No second runtime to operate.

What ships in 2026.04

  • Multi-LLM routing across 17 providers — OpenAI, Azure OpenAI, Amazon Bedrock, Anthropic, Gemini, Vertex AI, Cohere, Mistral, Hugging Face, Llama, xAI, DashScope, Cerebras, DeepSeek, Ollama, Databricks, vLLM. One OpenAI-compatible facade.
  • MCP server governance — auto-generate MCP servers from your existing APIs, govern agent discovery, audit every agent-to-agent message.
  • Semantic caching + prompt firewalls — drop token spend on repeat prompts, block injection patterns at the edge, sanitize PII on request and response.
  • Token economics — per-user, per-model, per-window quotas with cost attribution to Project / Team scopes.
  • AI observability — token spend, tool-use breakdown, TTFT, and provider latency in the same Analytics Engine your operators already use.
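
The prompt firewall and PII sanitization item above can be pictured with a small Python sketch. It is only an illustration of the general technique (pattern-based redaction and blocking), not Apinizer's implementation: the function names, the labels, and the two tiny rule sets are hypothetical, and a real firewall would rely on a far broader, maintained rule set.

```python
import re

# Hypothetical PII rules, for illustration only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

# Hypothetical injection phrases, for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]


def sanitize(text: str) -> str:
    """Replace every PII match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def is_injection(text: str) -> bool:
    """Return True if the text matches any known injection phrase."""
    return any(pattern.search(text) for pattern in INJECTION_PATTERNS)


def guard(prompt: str) -> str:
    """Block suspicious prompts, otherwise forward a sanitized copy."""
    if is_injection(prompt):
        raise ValueError("prompt blocked by firewall")
    return sanitize(prompt)
```

The same `sanitize` step can run on the model's response before it is returned to the caller, which is what "on request and response" amounts to in this sketch.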

// Route to the cheapest LLM provider that meets a latency target
def provider = ai.routes.cheapest(latencyMs: 800, modelClass: 'mid')
ai.proxy.send(provider, request.body)
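
To make the routing rule above concrete, here is a minimal Python sketch of one way a "cheapest provider under a latency target" selection could work. It is an illustration only, not Apinizer's implementation: the `Provider` fields, the provider names, and every price and latency figure are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    model_class: str           # e.g. "small", "mid", "large"
    cost_per_1k_tokens: float  # hypothetical price in USD
    p95_latency_ms: int        # hypothetical observed latency


def cheapest(
    providers: Iterable[Provider], latency_ms: int, model_class: str
) -> Optional[Provider]:
    """Return the lowest-cost provider of the given class whose latency
    is within the target, or None if no provider qualifies."""
    eligible = [
        p
        for p in providers
        if p.model_class == model_class and p.p95_latency_ms <= latency_ms
    ]
    return min(eligible, key=lambda p: p.cost_per_1k_tokens, default=None)


# Made-up sample data.
PROVIDERS = [
    Provider("provider-a", "mid", 0.60, 500),
    Provider("provider-b", "mid", 0.40, 750),
    Provider("provider-c", "mid", 0.20, 1200),
    Provider("provider-d", "large", 0.10, 300),
]

choice = cheapest(PROVIDERS, latency_ms=800, model_class="mid")
print(choice.name if choice else "no provider meets the target")
```

With this sample data, provider-c is the cheapest mid-class option but misses the 800 ms target, so the selection falls to provider-b. Returning `None` rather than raising leaves the fallback decision (retry, relax the target, or reject) to the caller.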

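The per-user, per-model, per-window quotas with cost attribution listed above can be sketched in the same spirit. This is a simple fixed-window counter, assumed here for illustration; the post does not say which windowing scheme the gateway uses, and the class name, parameters, and prices below are hypothetical.

```python
from collections import defaultdict


class TokenQuota:
    """Fixed-window token quota keyed by (user, model, window)."""

    def __init__(self, limit_tokens: int, window_seconds: int):
        self.limit = limit_tokens
        self.window = window_seconds
        self.used = defaultdict(int)             # (user, model, window index) -> tokens
        self.cost_by_scope = defaultdict(float)  # project or team -> USD

    def try_consume(
        self,
        user: str,
        model: str,
        tokens: int,
        now: float,
        scope: str,
        usd_per_1k: float,
    ) -> bool:
        """Record the usage and return True if it fits the current window;
        return False, recording nothing, if it would exceed the limit."""
        key = (user, model, int(now // self.window))
        if self.used[key] + tokens > self.limit:
            return False
        self.used[key] += tokens
        self.cost_by_scope[scope] += tokens / 1000 * usd_per_1k
        return True


quota = TokenQuota(limit_tokens=1000, window_seconds=60)
first = quota.try_consume("alice", "mid-model", 600, now=0, scope="team-a", usd_per_1k=0.5)
second = quota.try_consume("alice", "mid-model", 600, now=30, scope="team-a", usd_per_1k=0.5)
third = quota.try_consume("alice", "mid-model", 600, now=61, scope="team-a", usd_per_1k=0.5)
```

Here the second call is rejected because it would push the user past 1,000 tokens in the same 60-second window, while the third lands in a new window and succeeds. Only accepted calls add to the scope's cost total.
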
Upgrade path

For existing Pro and Enterprise customers, AI Gateway is a license update — no migration. Configurations, manifests, and the API Manager UI all carry over. The Worker hot-reloads to pick up the new providers.

For Open customers, AI Gateway is part of Pro. Talk to the team if you want to evaluate it.

What's next

We are already deep into 2026.09: broader MCP catalog support, deeper agent identity federation, and a public conformance test suite for the provider integrations. If your team is running agents in production and wants to help shape that work, the door is open.

  • #ai-gateway
  • #release
  • #llm
  • #mcp
  • #a2a
