Enterprise Gateway Layer 2026: LiteLLM, Portkey, Cloudflare, Kong, AWS Strands & Privacy Router

4. Juni 202611 min readDeep Dive

TL;DR: „To put an enterprise gateway in front of your agents, combine LiteLLM (multi-provider) + Portkey (governance) + Privacy Router (GDPR routing) today. Cloudflare is the fastest edge start, Kong the choice for regulated sectors, AWS Strands for pure AWS stacks."

— Till Freitag

Why an enterprise gateway in the first place?

Once more than one team uses LLMs in production, you need four things in one place:

Auth & RBAC – who may use which model with which budget?
Observability & logging – who asked what, when, what did it cost, what came back?
Routing by model/vendor – failover, cost optimization, PII awareness.
Rate limiting & quotas – per team, per model, per time of day.

Microsoft Scout was announced as an integrated enterprise gateway – but is not yet available. To go live in mid-2026, combine the following building blocks.

Decision flowchart: which gateway fits you?

Start: Do you need an LLM gateway?
  │
  ├─ GDPR-strict / on-premise mandate?
  │     │
  │     ├─ Yes →  Self-hosted OpenClaw + Privacy Router
  │     │         (local models, no hyperscaler)
  │     │
  │     └─ No  → continue ↓
  │
  ├─ Pure AWS stack with compliance mandate?
  │     │
  │     ├─ Yes →  AWS Strands / Bedrock AgentCore
  │     │         (IAM, CloudTrail, Bedrock models)
  │     │
  │     └─ No  → continue ↓
  │
  ├─ Regulated industry (bank, pharma, public sector)?
  │     │
  │     ├─ Yes →  Kong AI Gateway (self-hosted or Konnect EU)
  │     │         (mTLS, OAuth/OIDC, audit trails, plugin ecosystem)
  │     │
  │     └─ No  → continue ↓
  │
  ├─ Need PII redaction & prompt governance?
  │     │
  │     ├─ Yes →  Portkey AI Gateway (in front of LiteLLM)
  │     │         (guardrails, prompt versioning, A/B tests)
  │     │
  │     └─ No  → continue ↓
  │
  ├─ High volumes with cache potential & global edge?
  │     │
  │     ├─ Yes →  Cloudflare AI Gateway
  │     │         (DNS entry, 5 min, instant logs & cost caps)
  │     │
  │     └─ No  → continue ↓
  │
  └─ Default: multi-provider with quotas & spend tracking
        │
        └─→  LiteLLM Proxy (+ optional Portkey for governance)
             (OpenAI-compatible, 100+ providers, Docker in 10 min)

Deployment decision: self-hosted vs. VPC vs. managed vs. hybrid vs. air-gapped

Before you pick a product, decide on the deployment model. It drives data residency, governance effort and operational complexity more than the feature list.

Start: Where are prompts & logs allowed to live?
  │
  ├─ No data may leave the data center (public sector, hospital, defense)?
  │     │
  │     └─ Yes → Air-gapped (on-prem, no internet)
  │             Candidates: OpenClaw + Ollama/vLLM, Kong AI Gateway, LiteLLM
  │             Ops: high (manual updates, own monitoring)
  │
  ├─ Data must stay in your own cloud tenant (bank, insurance, pharma)?
  │     │
  │     └─ Yes → VPC / private cloud (EU region, customer-managed keys)
  │             Candidates: AWS Strands/Bedrock AgentCore, Kong (self-hosted in VPC),
  │                         LiteLLM/Portkey in your own EKS/AKS/GKE
  │             Ops: medium (hyperscaler handles infra)
  │
  ├─ GDPR-compliant, but a mix of sensitive & generic prompts?
  │     │
  │     └─ Yes → Hybrid (managed control plane + self-hosted data plane)
  │             Candidates: Portkey Hybrid, LiteLLM + Privacy Router,
  │                         Cloudflare AI Gateway with EU R2 + local fallback
  │             Ops: medium (two layers, clear routing policies required)
  │
  ├─ Standard SaaS, EU hosting is enough, time-to-value matters?
  │     │
  │     └─ Yes → Managed (SaaS / edge)
  │             Candidates: Portkey Cloud (EU), Cloudflare AI Gateway,
  │                         AWS Bedrock (Frankfurt)
  │             Ops: low (DPA + config, no infra)
  │
  └─ Full control, IaC pipelines, in-house SRE team?
        │
        └─ Yes → Self-hosted (Docker/K8s in your own cluster)
                Candidates: LiteLLM, Portkey OSS, Kong, OpenClaw + Ollama
                Ops: high (updates, HA, secrets rotation in your hands)

Criteria matrix:

Model	Data residency	Governance	Operational complexity	Time to value
Air-gapped	100% on-prem, no internet	Maximum (no third-country transfer, no DPA needed)	Very high (updates, monitoring, HA on you)	Weeks
Self-hosted	Own cluster (EU/on-prem)	High (full auditability, own keys)	High (SRE team, IaC, patch management)	Days
VPC / private cloud	Own hyperscaler tenant (EU region)	High (CMK, IAM, CloudTrail/audit logs)	Medium (infra from hyperscaler, config on you)	Days
Hybrid	Sensitive paths local, rest managed	High (routing policies + audit on both layers)	Medium (two layers, clear classification required)	Days to weeks
Managed (SaaS/edge)	Vendor region (EU selectable)	Medium (DPA + vendor certifications)	Low (only config & keys)	Hours

💡 Rule of thumb: The stricter the data residency, the higher the operational complexity. Hybrid is the pragmatic middle ground when not all prompts are equally sensitive – PII/secrets local, generic in the cloud.

💡 Till Freitag stack recommendation: LiteLLM as multi-provider front door + Portkey as governance layer + Privacy Router for GDPR-critical paths. Once Microsoft Scout is GA, this config can be migrated with manageable effort – skills and MCP configs stay the same.

The six alternatives in detail – with enterprise workflows

LiteLLM Proxy – The OpenAI-compatible multi-provider front door

Setup: ~10 min (docker run litellm/litellm or pip install litellm). Hosting: self-hosted, EU hosting possible. 100+ LLMs behind a unified OpenAI API.