Self-Hosted & Privacy Layer 2026: Ontheia, Anything LLM & Privacy Router

    Self-Hosted & Privacy Layer 2026: Ontheia, Anything LLM & Privacy Router

    4. Juni 20264 min Lesezeit
    Till Freitag

    TL;DR: „GDPR-compliant AI is no longer theory in 2026. Ontheia + Privacy Router + Ollama deliver the full local-AI-first stack today, without hyperscalers. Once RTX Spark ships, this becomes the default for mid-market and enterprise."

    — Till Freitag

    Why self-host at all?

    Three drivers push mid-market and enterprise to self-hosting in 2026:

    1. GDPR & Schrems II – prompts with customer data must not flow to the US.
    2. Industry regulation – BaFin, KRITIS, pharma, public sector demand on-premise or at least EU sovereignty.
    3. Cost control – equipping 100+ employees with LLMs quickly costs five figures per month at Anthropic/OpenAI.

    Self-hosting doesn't mean "buy an LLM and forget about it". It means cleanly separating runtime (agent), routing layer (privacy-aware), model layer (local LLM). These three building blocks are what we look at.

    The runtime tier: Ontheia, Anything LLM, NanoClaw

    Ontheia – The EU-native open-source runtime

    TypeScript, Docker, AGPL-3.0. Speaks Anthropic, OpenAI, Gemini, Grok and Ollama out of the box. Setup: 15 minutes (Docker Compose).

    • Typical workflow: Ontheia as the central agent runtime in your own data center. Users chat via your own web frontend, skills run on your own infrastructure, data never leaves the house.
    • Best for: EU mid-market companies that want an OpenClaw-like experience without an Anthropic dependency.
    • Strength: AGPL forces forks to stay open – predictable roadmap, no embrace-extend-extinguish risk.

    Anything LLM – The all-in-one hub

    34,000+ stars. RAG, multi-LLM, workspace concept, browser UI, desktop app. Setup: 20 minutes (Docker or desktop installer).

    • Typical workflow: workspace per department, own document collections, each workspace binds its own LLM (Ollama, Anthropic, Mistral). RAG built in – drop in a PDF, ask a question, get an answer with sources.
    • Best for: knowledge work, internal knowledge management, onboarding assistants.
    • Strength: lowest barrier of all self-hosting options. Runnable even without a DevOps team.

    NanoClaw – The security-focused OpenClaw clone

    8,400+ stars, container isolation per skill, WhatsApp integration. Setup: 30 minutes (Docker Compose + skill config).

    • Typical workflow: like OpenClaw, but every skill runs in its own container with least-privilege networking. Ideal for risky skills (browser automation, code execution).
    • Best for: teams that want OpenClaw power but need to shrink its attack surface.
    • Strength: security by design instead of security patch.

    The routing layer: Privacy Router

    The Privacy Router is our own open-source tool. It sits between runtime and LLM and decides per request which model answers:

    • Sensitive prompt (person names, IBAN, medical data) → local model (Ollama, vLLM).
    • Generic prompt → cheap cloud model (Haiku, Mini).
    • Complex reasoning prompt without PII → best cloud model (Sonnet, GPT).

    Setup: 10 minutes. Configuration as YAML, rules via RegEx + ML classifier.

    • Typical workflow: runtime calls Privacy Router instead of OpenAI/Anthropic directly. Router classifies, routes, logs – audit trail included.
    • Best for: hybrid stacks that need to combine cost optimization and GDPR.

    The model layer: Ollama, vLLM, llama.cpp

    • Ollama – zero barrier. ollama run mistral and done. Best for: laptops, single user, prototypes.
    • vLLM – production-grade. Paged attention, high throughput, OpenAI-compatible API. Best for: central GPU servers, multi-user workloads.
    • llama.cpp – maximally portable. Runs on Apple Silicon, CPU, embedded devices. Best for: edge scenarios.

    Hardware layer (announced): NVIDIA RTX Spark

    The announced RTX Spark is set to deliver 1,700 tokens/s – enough to run 30B models for a 50-person team at acceptable latency. Status: announced, not yet available. Today, bridge with RTX 6000 Ada, H100 or Apple M Studio clusters.

    Quick-Select: which self-hosting stack for which profile?

    Profile Recommendation Why
    Fastest start Anything LLM Desktop + Ollama One-click installer, RAG included
    Highest privacy control Ontheia + Privacy Router + vLLM Fully on-premise, deterministic routing
    Best overall package NanoClaw + Privacy Router + Ollama Container isolation, hybrid model mix
    Edge / embedded llama.cpp + custom runtime Runs on any device, no server needed

    Typical workflows by use case

    • GDPR-compliant internal knowledge assistant: Anything LLM + Ollama (Mistral 7B) on a workstation PC. Documents stay in-house, answers with sources.
    • Hybrid stack with cost optimization: Ontheia → Privacy Router → (Ollama for PII | Claude Haiku for generic | Claude Sonnet for complex). Saves 60–80% cloud cost at full compliance.
    • High-risk skill (browser automation): NanoClaw with container isolation. Skill may only hit one domain, no filesystem access, network egress logged.
    • Edge deployment (machine, vehicle, kiosk): llama.cpp + small 3B model. Works offline, zero cloud risk.
    • Pilot without IT budget: Anything LLM Desktop, locally on a MacBook M3 with Ollama. Productive in 30 minutes.

    Till Freitag recommendation

    Start today: Anything LLM + Ollama on a decent workstation. When the pilot is live: migrate to Ontheia + Privacy Router + vLLM in your own data center. Once RTX Spark ships: hardware refresh – then local-AI-first becomes feasible for 50- to 200-person teams without latency compromises.

    The full market overview lives in the master article: The best OpenClaw alternatives 2026. Hands-on step-by-step in the self-hosting GDPR guide.

    More on this topic: Coding-Agent Layer · Multi-Agent Layer · Enterprise Gateway Layer · Privacy Router Guide · Master article

    TeilenLinkedInWhatsAppE-Mail

    Verwandte Artikel

    The Best OpenClaw Alternatives 2026 – from NanoClaw to NullClawDeep Dive
    21. Februar 202622 min

    The Best OpenClaw Alternatives 2026 – from NanoClaw to NullClaw

    OpenClaw has 200,000+ GitHub stars – but not everyone needs 430,000 lines of code. We compare 22 alternatives in mid-202…

    Weiterlesen
    Coding-Agent Layer 2026: OpenCode, Aider, Continue.dev & Co. Compared
    4. Juni 20264 min

    Coding-Agent Layer 2026: OpenCode, Aider, Continue.dev & Co. Compared

    Deep dive into the coding-agent layer: which OpenClaw coding rival fits which workflow? OpenCode, Aider, Continue.dev, S…

    Weiterlesen
    Enterprise Gateway Layer 2026: LiteLLM, Portkey, Cloudflare, Kong, AWS Strands & Privacy RouterDeep Dive
    4. Juni 202611 min

    Enterprise Gateway Layer 2026: LiteLLM, Portkey, Cloudflare, Kong, AWS Strands & Privacy Router

    Enterprises need an LLM gateway today – Microsoft Scout is only announced. LiteLLM, Portkey, Cloudflare AI Gateway, Kong…

    Weiterlesen
    Multi-Agent Layer 2026: AG2, LangGraph, SuperAGI & AWS Strands Compared
    4. Juni 20264 min

    Multi-Agent Layer 2026: AG2, LangGraph, SuperAGI & AWS Strands Compared

    When one agent isn't enough: AG2, LangGraph, SuperAGI and AWS Strands compared. Which multi-agent stack fits which workf…

    Weiterlesen
    Open-Source LLMs Compared 2026 – 25+ Models You Should KnowDeep Dive
    7. März 202610 min

    Open-Source LLMs Compared 2026 – 25+ Models You Should Know

    From Llama to Qwen to Gemma 4: all major open-source LLMs at a glance – with GitHub stars, parameters, licenses, and cle…

    Weiterlesen
    Open-Source LLMs Compared 2026 – 25+ Models You Should KnowDeep Dive
    7. März 20269 min

    Open-Source LLMs Compared 2026 – 25+ Models You Should Know

    From Llama to Qwen to Gemma 4: Every major open-source LLM at a glance – with GitHub stars, parameters, licenses, and cl…

    Weiterlesen
    OpenClaw Self-Hosting Guide: GDPR-Compliant in 30 Minutes
    28. Februar 20264 min

    OpenClaw Self-Hosting Guide: GDPR-Compliant in 30 Minutes

    Self-host OpenClaw with Docker, persistent storage, and local LLMs via Ollama – fully GDPR-compliant because no data eve…

    Weiterlesen
    NanoClaw: The Lean Successor to OpenClaw – An AI Agent That Fits in Your Pocket
    21. Februar 20264 min

    NanoClaw: The Lean Successor to OpenClaw – An AI Agent That Fits in Your Pocket

    NanoClaw is the minimalist successor to OpenClaw – an AI agent that runs on a Raspberry Pi, is controllable via WhatsApp…

    Weiterlesen
    Paperclip control plane showing an org chart of AI agents with CEO, managers, workers, approval gates and budget tracking
    28. April 20266 min

    Paperclip: If OpenClaw Is the Employee, Paperclip Is the Company

    Paperclip is open-source infrastructure to run an entire AI-only company – org chart, budgets, approvals, audit trail. W…

    Weiterlesen