Self-Hosted & Privacy Layer 2026: Ontheia, Anything LLM & Privacy Router

4. Juni 20264 min Lesezeit

TL;DR: „GDPR-compliant AI is no longer theory in 2026. Ontheia + Privacy Router + Ollama deliver the full local-AI-first stack today, without hyperscalers. Once RTX Spark ships, this becomes the default for mid-market and enterprise."

— Till Freitag

Why self-host at all?

Three drivers push mid-market and enterprise to self-hosting in 2026:

GDPR & Schrems II – prompts with customer data must not flow to the US.
Industry regulation – BaFin, KRITIS, pharma, public sector demand on-premise or at least EU sovereignty.
Cost control – equipping 100+ employees with LLMs quickly costs five figures per month at Anthropic/OpenAI.

Self-hosting doesn't mean "buy an LLM and forget about it". It means cleanly separating runtime (agent), routing layer (privacy-aware), model layer (local LLM). These three building blocks are what we look at.

The runtime tier: Ontheia, Anything LLM, NanoClaw

Ontheia – The EU-native open-source runtime

TypeScript, Docker, AGPL-3.0. Speaks Anthropic, OpenAI, Gemini, Grok and Ollama out of the box. Setup: 15 minutes (Docker Compose).

Typical workflow: Ontheia as the central agent runtime in your own data center. Users chat via your own web frontend, skills run on your own infrastructure, data never leaves the house.
Best for: EU mid-market companies that want an OpenClaw-like experience without an Anthropic dependency.
Strength: AGPL forces forks to stay open – predictable roadmap, no embrace-extend-extinguish risk.

Anything LLM – The all-in-one hub

34,000+ stars. RAG, multi-LLM, workspace concept, browser UI, desktop app. Setup: 20 minutes (Docker or desktop installer).

Typical workflow: workspace per department, own document collections, each workspace binds its own LLM (Ollama, Anthropic, Mistral). RAG built in – drop in a PDF, ask a question, get an answer with sources.
Best for: knowledge work, internal knowledge management, onboarding assistants.
Strength: lowest barrier of all self-hosting options. Runnable even without a DevOps team.

NanoClaw – The security-focused OpenClaw clone

8,400+ stars, container isolation per skill, WhatsApp integration. Setup: 30 minutes (Docker Compose + skill config).

Typical workflow: like OpenClaw, but every skill runs in its own container with least-privilege networking. Ideal for risky skills (browser automation, code execution).
Best for: teams that want OpenClaw power but need to shrink its attack surface.
Strength: security by design instead of security patch.

The routing layer: Privacy Router

The Privacy Router is our own open-source tool. It sits between runtime and LLM and decides per request which model answers:

Sensitive prompt (person names, IBAN, medical data) → local model (Ollama, vLLM).
Generic prompt → cheap cloud model (Haiku, Mini).
Complex reasoning prompt without PII → best cloud model (Sonnet, GPT).

Setup: 10 minutes. Configuration as YAML, rules via RegEx + ML classifier.

Typical workflow: runtime calls Privacy Router instead of OpenAI/Anthropic directly. Router classifies, routes, logs – audit trail included.
Best for: hybrid stacks that need to combine cost optimization and GDPR.

The model layer: Ollama, vLLM, llama.cpp

Ollama – zero barrier. ollama run mistral and done. Best for: laptops, single user, prototypes.
vLLM – production-grade. Paged attention, high throughput, OpenAI-compatible API. Best for: central GPU servers, multi-user workloads.
llama.cpp – maximally portable. Runs on Apple Silicon, CPU, embedded devices. Best for: edge scenarios.

Hardware layer (announced): NVIDIA RTX Spark

The announced RTX Spark is set to deliver 1,700 tokens/s – enough to run 30B models for a 50-person team at acceptable latency. Status: announced, not yet available. Today, bridge with RTX 6000 Ada, H100 or Apple M Studio clusters.

Quick-Select: which self-hosting stack for which profile?

Profile	Recommendation	Why
Fastest start	Anything LLM Desktop + Ollama	One-click installer, RAG included
Highest privacy control	Ontheia + Privacy Router + vLLM	Fully on-premise, deterministic routing
Best overall package	NanoClaw + Privacy Router + Ollama	Container isolation, hybrid model mix
Edge / embedded	llama.cpp + custom runtime	Runs on any device, no server needed

Typical workflows by use case

GDPR-compliant internal knowledge assistant: Anything LLM + Ollama (Mistral 7B) on a workstation PC. Documents stay in-house, answers with sources.
Hybrid stack with cost optimization: Ontheia → Privacy Router → (Ollama for PII | Claude Haiku for generic | Claude Sonnet for complex). Saves 60–80% cloud cost at full compliance.
High-risk skill (browser automation): NanoClaw with container isolation. Skill may only hit one domain, no filesystem access, network egress logged.
Edge deployment (machine, vehicle, kiosk): llama.cpp + small 3B model. Works offline, zero cloud risk.
Pilot without IT budget: Anything LLM Desktop, locally on a MacBook M3 with Ollama. Productive in 30 minutes.

Till Freitag recommendation

Start today: Anything LLM + Ollama on a decent workstation. When the pilot is live: migrate to Ontheia + Privacy Router + vLLM in your own data center. Once RTX Spark ships: hardware refresh – then local-AI-first becomes feasible for 50- to 200-person teams without latency compromises.

The full market overview lives in the master article: The best OpenClaw alternatives 2026. Hands-on step-by-step in the self-hosting GDPR guide.

More on this topic: Coding-Agent Layer · Multi-Agent Layer · Enterprise Gateway Layer · Privacy Router Guide · Master article

TeilenLinkedIn WhatsApp E-Mail

Verwandte Artikel

The Best OpenClaw Alternatives 2026 – from NanoClaw to NullClaw

Deep Dive

21. Februar 202622 min

The Best OpenClaw Alternatives 2026 – from NanoClaw to NullClaw

OpenClaw has 200,000+ GitHub stars – but not everyone needs 430,000 lines of code. We compare 22 alternatives in mid-202…

4. Juni 20264 min

Coding-Agent Layer 2026: OpenCode, Aider, Continue.dev & Co. Compared

Deep dive into the coding-agent layer: which OpenClaw coding rival fits which workflow? OpenCode, Aider, Continue.dev, S…

Enterprise Gateway Layer 2026: LiteLLM, Portkey, Cloudflare, Kong, AWS Strands & Privacy Router

Deep Dive

4. Juni 202611 min

Enterprise Gateway Layer 2026: LiteLLM, Portkey, Cloudflare, Kong, AWS Strands & Privacy Router

Enterprises need an LLM gateway today – Microsoft Scout is only announced. LiteLLM, Portkey, Cloudflare AI Gateway, Kong…

4. Juni 20264 min

Multi-Agent Layer 2026: AG2, LangGraph, SuperAGI & AWS Strands Compared

When one agent isn't enough: AG2, LangGraph, SuperAGI and AWS Strands compared. Which multi-agent stack fits which workf…

Open-Source LLMs Compared 2026 – 25+ Models You Should Know

Deep Dive

7. März 202610 min

Open-Source LLMs Compared 2026 – 25+ Models You Should Know

From Llama to Qwen to Gemma 4: all major open-source LLMs at a glance – with GitHub stars, parameters, licenses, and cle…

Deep Dive

7. März 20269 min

Open-Source LLMs Compared 2026 – 25+ Models You Should Know

From Llama to Qwen to Gemma 4: Every major open-source LLM at a glance – with GitHub stars, parameters, licenses, and cl…

28. Februar 20264 min

OpenClaw Self-Hosting Guide: GDPR-Compliant in 30 Minutes

Self-host OpenClaw with Docker, persistent storage, and local LLMs via Ollama – fully GDPR-compliant because no data eve…

21. Februar 20264 min

NanoClaw: The Lean Successor to OpenClaw – An AI Agent That Fits in Your Pocket

NanoClaw is the minimalist successor to OpenClaw – an AI agent that runs on a Raspberry Pi, is controllable via WhatsApp…

Paperclip control plane showing an org chart of AI agents with CEO, managers, workers, approval gates and budget tracking

28. April 20266 min

Paperclip: If OpenClaw Is the Employee, Paperclip Is the Company

Paperclip is open-source infrastructure to run an entire AI-only company – org chart, budgets, approvals, audit trail. W…