Platform teams · Runtime

Deploy at lunch. Don't tell ops.

Every definition, policy, and transform hot-loads on the running gateway. No restarts, no rolling reboots, no lost in-flight requests — and the distributed cache survives every pod reschedule.


The problem

Most gateways pretend a deploy is a small thing. It isn't.

A redeploy means a rolling restart. A rolling restart means warm caches go cold, connection pools rebuild, JWKs re-fetch, and tail latency spikes for ten minutes. Engineers learn to deploy on Fridays at 04:00 — which means they don't deploy. Apinizer's runtime treats a deploy as a config refresh: the gateway picks up the new shape in place, and the distributed cache holds steady.

  • 0 restarts per definition apply

  • <200 ms propagation across Workers

  • 0 in-flight requests lost

Capabilities

What Apinizer does here

Zero-restart deploys

Apply a definition — the gateway picks up the new shape in place. In-flight requests finish on the old shape; new ones land on the new.
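One common way to get this behaviour is to keep the active definition as an immutable snapshot behind a single reference and swap that reference on apply. The sketch below assumes that pattern; `Definition`, `Gateway`, `apply`, and `begin_request` are hypothetical names, not Apinizer's API. It only illustrates why a request already in flight keeps the old shape.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """Immutable snapshot of a route's shape (hypothetical fields)."""
    version: int
    upstream: str


class Gateway:
    """Holds the active definition behind one reference that is swapped on apply."""

    def __init__(self, initial: Definition) -> None:
        self._active = initial
        self._lock = threading.Lock()

    def apply(self, new: Definition) -> None:
        # Swap the reference; snapshots already handed out are untouched.
        with self._lock:
            self._active = new

    def begin_request(self) -> Definition:
        # A request captures the snapshot once and uses it until it completes.
        with self._lock:
            return self._active


gateway = Gateway(Definition(version=1, upstream="http://backend-v1"))
in_flight = gateway.begin_request()      # request starts on the old shape
gateway.apply(Definition(version=2, upstream="http://backend-v2"))
new_request = gateway.begin_request()    # request starts after the apply

print(in_flight.version, new_request.version)  # prints "1 2"
```

Because the snapshot is frozen, nothing a later apply does can change what the in-flight request sees.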

Distributed cache

Hazelcast-backed cache shared across every Worker. Survives reschedules; invalidations propagate cluster-wide in under a second.

Coordinated invalidation

Bust a cache key once; every Worker drops it. No more 'one node has the stale value' tickets.
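As a rough illustration of the idea, the sketch below models three Workers in one process and an invalidation bus that fans a single bust out to all of them. `Worker`, `InvalidationBus`, and the key names are hypothetical stand-ins; a real cluster would carry the message over the distributed cache layer rather than a Python list.

```python
from typing import Dict, List


class Worker:
    """One gateway Worker with its own local view of the cache (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}

    def on_invalidate(self, key: str) -> None:
        # Drop the key if present; a missing key is not an error.
        self.cache.pop(key, None)


class InvalidationBus:
    """In-process stand-in for a cluster-wide invalidation channel."""

    def __init__(self) -> None:
        self.workers: List[Worker] = []

    def join(self, worker: Worker) -> None:
        self.workers.append(worker)

    def bust(self, key: str) -> int:
        # Deliver the invalidation to every Worker; return how many were notified.
        for worker in self.workers:
            worker.on_invalidate(key)
        return len(self.workers)


bus = InvalidationBus()
workers = [Worker(f"worker-{i}") for i in range(3)]
for w in workers:
    bus.join(w)
    w.cache["schedule:line-5"] = "old timetable"
    w.cache["schedule:line-9"] = "current timetable"

notified = bus.bust("schedule:line-5")
stale_holders = [w.name for w in workers if "schedule:line-5" in w.cache]
print(notified, stale_holders)  # prints "3 []"
```

One call to `bust` leaves no Worker holding the old value, and unrelated keys stay cached.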

Per-key TTLs

Cache strategies per endpoint, per consumer, per response variant. Fast lanes for hot data, conservative TTLs for sensitive lookups.
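To make the idea concrete, here is a minimal sketch of a cache keyed by endpoint, consumer, and response variant, where each entry carries its own TTL. The `TtlCache` class, the example endpoints, and the TTL values are all assumptions for illustration, not Apinizer configuration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, str, str]  # (endpoint, consumer, response variant)


class TtlCache:
    """Cache where every entry has its own expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[CacheKey, Tuple[float, str]] = {}

    def put(self, key: CacheKey, value: str, ttl_seconds: float) -> None:
        # Store the absolute expiry next to the value.
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: CacheKey) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are removed lazily on read.
            del self._entries[key]
            return None
        return value


# A controllable clock keeps the demo deterministic.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put(("/schedules", "mobile-app", "json"), "timetable", ttl_seconds=300)
cache.put(("/accounts/balance", "partner-x", "json"), "balance", ttl_seconds=5)

now[0] = 10.0  # ten seconds later
hot = cache.get(("/schedules", "mobile-app", "json"))
sensitive = cache.get(("/accounts/balance", "partner-x", "json"))
print(hot, sensitive)  # prints "timetable None"
```

The long TTL keeps the hot schedule data served from cache, while the short TTL forces the sensitive lookup back to the backend after a few seconds.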

Pre-warmed pools

Connection pools, JWK sets, OAuth introspection caches — all kept warm across deploys. New pods inherit warm state from peers.

Cache analytics

Hit rate, miss rate, eviction reason — per endpoint, per region. Tune what's worth caching before it becomes a cost problem.

Use cases

In production, this looks like…

  • Retail

    Istanbul e-commerce ships 40 policy changes during Black Friday

    Rate limits, headers, A/B routes: all hot-deployed during peak. Zero rolling restarts, zero abandoned carts attributable to the platform.

    40 deploys, 0 restarts

  • Banking

    Frankfurt private bank refreshes JWKs without a maintenance window

    Identity provider rotates keys. The gateway picks up the new JWKs in place; in-flight tokens validate on the old key, new ones on the new.

  • Public transit

    Amsterdam transit authority caches schedule API across 6 Workers

    97% cache hit rate on schedule lookups during peak commute. Invalidation on disruption fans out in 400ms.

    97% hit rate

  • Insurance

    Stockholm insurer A/B-tests a new pricing endpoint mid-day

    A hot deploy routes 5% of traffic to the new shape. Once telemetry confirms it is healthy, the next apply moves the cutover to 100%. No deploy window negotiated.

  • Banking

    Warsaw bank serves loan calculator from cache during marketing campaign

    Calculator API absorbs 12k RPS during a TV ad. Backend never sees more than 200 RPS; cache shoulders the rest with per-input TTLs.

    60× backend reduction

  • Telecom

    Prague carrier reschedules pods without losing cache warmth

    Hazelcast keeps state across the cluster. When a pod reschedules, the new pod inherits the cache from peers — no cold-start storm to the backend.

  • Media

    Milan publisher invalidates 4M cache keys per breaking story

    Editorial publishes; the gateway invalidates affected keys cluster-wide. The CDN pulls fresh content in under a second; readers never see stale headlines.

  • E-government

    Baku ministry refreshes 200 endpoints during business hours

    Definitions applied through APIops land hot. The platform team stops scheduling deploys for nights; the operations runbook loses its longest section.
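The percentage cutover in the insurance example above can be sketched as a stable hash split. This is one plausible way to carve off a share of traffic, not a description of how Apinizer routes; `bucket` and `choose_shape` are hypothetical helpers, and the request ids are made up.

```python
import hashlib


def bucket(request_id: str) -> int:
    """Map a request id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_shape(request_id: str, new_shape_percent: int) -> str:
    """Send the lowest buckets to the new shape, the rest to the old one."""
    return "new" if bucket(request_id) < new_shape_percent else "old"


ids = [f"req-{i}" for i in range(10_000)]

# First apply: roughly 5% of ids land on the new shape.
share_at_5 = sum(choose_shape(r, 5) == "new" for r in ids) / len(ids)

# Next apply: the percentage moves to 100 and every id lands on the new shape.
share_at_100 = sum(choose_shape(r, 100) == "new" for r in ids) / len(ids)

print(f"{share_at_5:.3f} {share_at_100:.3f}")
```

Because the bucket depends only on the id, raising the percentage in a later apply never moves an id back from the new shape to the old one.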

How it works

Apply, propagate, serve, observe: in that order, every time.

  1. Step 01

    Apply

    Push a definition via UI, APIops, or pipeline. The Manager validates and broadcasts to every Worker.

  2. Step 02

    Propagate

    Workers receive the change in under 200 ms. The cache layer keeps hot keys; nothing is dropped.

  3. Step 03

    Serve

    In-flight requests complete on the old shape. New requests land on the new — no restart, no warm-up, no spike.

  4. Step 04

    Observe

    Analytics shows the cutover, cache hit rates, and tail latency in real time. Rollback is just another apply.
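The steps above can be sketched as a Manager that validates a definition, broadcasts it to every Worker, and treats rollback as one more apply of the previous definition. Everything here is a simplified assumption: `Manager`, `Worker`, the `upstream` field, and the validation rule are hypothetical and stand in for whatever the real control plane checks.

```python
from typing import Dict, List


class Worker:
    """Holds whichever definition it received last."""

    def __init__(self) -> None:
        self.active: Dict[str, str] = {}

    def receive(self, definition: Dict[str, str]) -> None:
        self.active = dict(definition)


class Manager:
    """Validates a definition, records it, and broadcasts it to all Workers."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers
        self.history: List[Dict[str, str]] = []

    def apply(self, definition: Dict[str, str]) -> None:
        # Validate before anything reaches a Worker (hypothetical rule).
        if not definition.get("upstream", "").startswith("http"):
            raise ValueError("definition needs an http(s) upstream")
        self.history.append(dict(definition))
        for worker in self.workers:
            worker.receive(definition)

    def roll_back(self) -> None:
        # Rolling back is just applying the previous definition again.
        if len(self.history) < 2:
            raise RuntimeError("nothing to roll back to")
        self.apply(self.history[-2])


workers = [Worker() for _ in range(3)]
manager = Manager(workers)
manager.apply({"upstream": "http://pricing-v1"})
manager.apply({"upstream": "http://pricing-v2"})
manager.roll_back()

active = {w.active["upstream"] for w in workers}
print(active, len(manager.history))  # prints "{'http://pricing-v1'} 3"
```

Recording the rollback as a new history entry keeps the audit trail linear: the third entry shows that v1 was applied again, rather than erasing the fact that v2 was ever live.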

Deploy without the dread

Push at noon. Sleep at midnight.

A 30-minute walkthrough (apply, propagate, cache, observe) on a Kubernetes cluster of your choice.