Why We Switched from ChatGPT to Claude – and What We Learned About LLMs Along the Way


    Malte Lensch · 20 February 2026 · 5 min read

    TL;DR: "After 18 months with ChatGPT, we switched to Claude. Not because ChatGPT is bad – but because Claude is better at coding, long documents and tool use via MCP for our workflow. Here's the honest comparison."

    — Till Freitag

    The Honest Truth: ChatGPT Was Good – Claude Is Better for Us

    Let's be clear: this isn't a Claude fanboy post. ChatGPT served us well for 18 months. GPT-4 was a game-changer, GPT-4o brought speed, and GPT-5 is an impressive model.

    But at some point we realized: for how we work, Claude fits better. Here's our honest analysis – including every major LLM we tested.

    What We Use AI For (and Why It Matters)

    Before comparing LLMs, you need to know what you use them for. Our use cases:

    1. Writing & reviewing code – Lovable projects, monday.com apps, Make scenarios, edge functions
    2. Analyzing long documents – contracts, RFPs, SOPs (often 50–100 pages)
    3. Creating content – blog posts, proposals, email sequences
    4. Tool use – querying CRM, enriching data, triggering workflows (via MCP)
    5. Strategy & sparring – thinking through business models, validating architecture decisions

    The Big Comparison: All Relevant LLMs in Detail

    Tier 1: The Flagships

    | Criterion | Claude Sonnet 4.6 | GPT-5 | Gemini 2.5 Pro | GPT-5.2 |
    |---|---|---|---|---|
    | Coding | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
    | Long texts | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
    | Reasoning | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
    | Tool calling | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
    | Natural writing | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
    | Context window | 200k (1M Opus) | 128k | 1M+ | 128k |
    | Price (input/1M) | ~$3 | ~$10 | ~$1.25 | ~$12 |
    | Price (output/1M) | ~$15 | ~$30 | ~$10 | ~$40 |
    | MCP support | Native | Via tools | Limited | Via tools |
    | EU hosting | Yes | No | | No |
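    To make the price rows above concrete, here is a small sketch that computes the cost of a single request per model. The per-million-token prices come straight from the table; the token counts in the example are illustrative assumptions, not measured values.

```python
# Rough per-request cost comparison using the per-million-token prices
# from the table above. Token counts are illustrative assumptions.

PRICES = {  # (input $/1M tokens, output $/1M tokens)
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5": (10.00, 30.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "GPT-5.2": (12.00, 40.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request against the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: an 80-page RFP (~60k input tokens) summarized in ~2k output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 60_000, 2_000):.3f}")
```

    At these assumed sizes, the same document run costs roughly $0.21 on Claude Sonnet versus about $0.10 on Gemini 2.5 Pro – which is exactly why a multi-model setup pays off.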

    Tier 2: The Price-Performance Kings

    | Criterion | Claude Haiku 3.5 | GPT-5-mini | Gemini 2.5 Flash | DeepSeek R1 |
    |---|---|---|---|---|
    | Coding | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
    | Speed | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
    | Price (input/1M) | ~$0.80 | ~$2 | ~$0.15 | ~$0.55 |
    | Price (output/1M) | ~$4 | ~$8 | ~$0.60 | ~$2.19 |
    | Best for | Bulk tasks, classification | Affordable all-rounder | Best tokens/$ | Open source, on-prem |

    Tier 3: The Specialists

    | Model | Strength | Weakness | Our verdict |
    |---|---|---|---|
    | Mistral Large | EU-native, GDPR, multilingual | Smaller ecosystem | Great for EU-only projects |
    | Llama 3.1 405B | Open source, self-hosted | Infrastructure overhead | For enterprises with their own GPUs |
    | Grok 2 | Real-time data via X | Bias risk, smaller community | Niche |
    | Cohere Command R+ | RAG-optimized, enterprise | Less creative | For pure retrieval tasks |

    5 Reasons Why We Switched to Claude

    1. Coding: Claude Understands Context, Not Just Syntax

    The biggest difference in daily work. When we give Claude a 200-line component and say "refactor this", we get code that:

    • Respects the existing architecture
    • Uses Tailwind tokens instead of hardcoded colors
    • Handles edge cases we didn't mention

    GPT-5 often delivers technically correct code that doesn't fit the existing codebase. Claude feels like a senior developer who knows the project.

    2. Long Documents: 200k Tokens Without Quality Loss

    We regularly analyze 80-page RFPs or SOPs. Claude's 200k context window (and 1M with Opus) maintains quality across the entire document. With GPT-5, we noticed hallucinations from ~60 pages onwards – details from the middle of documents got mixed up or forgotten.

    3. MCP: Claude Speaks Natively with Our Tools

    This was the killer reason. MCP (Model Context Protocol) was initiated by Anthropic, and Claude's integration is accordingly seamless. We use MCP to connect Claude directly to our monday CRM, Apollo, Slack and internal tools. ChatGPT can do this via Custom GPTs and plugins – but it feels like a workaround, not a feature.
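    Under the hood, MCP is built on JSON-RPC 2.0: a client asks a server to run a tool by sending a `tools/call` request. The sketch below shows what such a message looks like on the wire. The tool name `crm_lookup` and its arguments are hypothetical examples for a CRM connector, not a real server's API.

```python
import json

# MCP uses JSON-RPC 2.0. A client invokes a server-side tool by sending a
# "tools/call" request. "crm_lookup" and its arguments are hypothetical
# placeholders for a CRM connector, not a real server's tool schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "crm_lookup",
        "arguments": {"company": "Acme GmbH"},
    },
}

# Serialize to the JSON message that travels over stdio or HTTP transport.
wire_message = json.dumps(request)
print(wire_message)
```

    The payoff of a shared protocol: the model never sees tool-specific glue code – every connected system exposes the same `tools/call` surface.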

    4. Writing Style: Less "AI-Speak"

    Anyone who works extensively with ChatGPT knows the pattern: "Certainly!", "Great question!", "Let me break that down for you." Claude writes more naturally, more directly and – honestly – more maturely. For proposals and client communication, that's a real advantage.

    5. EU Hosting & Data Privacy

    As a German consultancy working with client data, GDPR isn't a nice-to-have. Claude offers EU hosting, OpenAI (as of February 2026) doesn't for standard API customers. For regulated industries (healthcare, finance, public sector), that's a dealbreaker.

    Where ChatGPT Is Still Better

    Fairness matters:

    • Multimodal (images, video, audio): GPT-5 is broader in processing different media types
    • Ecosystem & plugins: The GPT Store is larger, Custom GPTs are easier to build
    • General knowledge: For trivia and broad knowledge questions, GPT-5 is marginally better
    • Image generation: DALL·E 3 is directly integrated, Claude has no native image generation

    Where Gemini Beats Everyone

    Google's Gemini 2.5 Pro has an unfair advantage:

    • 1M+ context window: Unbeatable for truly massive documents
    • Price-performance: $1.25/1M input tokens – a fraction of the competition
    • Google integration: If your stack runs on Google Workspace, Gemini is the natural choice
    • Multimodal: Video and audio understanding is best-in-class

    We use Gemini 2.5 Flash as a cheap alternative for bulk tasks (email classification, data parsing). For anything requiring quality, Claude remains our go-to.

    Our Current Setup

    ┌─────────────────────────────────────────┐
    │       Primary: Claude Sonnet 4.6        │
    │  Coding, consulting, content, MCP agent │
    ├─────────────────────────────────────────┤
    │      Secondary: Gemini 2.5 Flash        │
    │  Bulk tasks, classification, parsing    │
    ├─────────────────────────────────────────┤
    │       Special: Claude Opus 4.5          │
    │  Complex architecture, strategy         │
    ├─────────────────────────────────────────┤
    │         Fallback: GPT-5-mini            │
    │  When Claude is down (rarely)           │
    └─────────────────────────────────────────┘
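    The routing behind this setup is deliberately simple. Here is a minimal sketch of the decision logic, assuming a task-type label per request; the model identifiers are placeholders, not exact API model strings.

```python
# Minimal sketch of our task-to-model routing. Model names are
# placeholders, not exact API identifiers.

ROUTES = {
    "coding": "claude-sonnet-4.6",
    "content": "claude-sonnet-4.6",
    "bulk": "gemini-2.5-flash",
    "classification": "gemini-2.5-flash",
    "architecture": "claude-opus-4.5",
    "strategy": "claude-opus-4.5",
}
FALLBACK = "gpt-5-mini"  # used when the primary provider is unavailable

def pick_model(task: str, primary_available: bool = True) -> str:
    """Return the model for a task, falling back if Claude is down."""
    model = ROUTES.get(task, "claude-sonnet-4.6")  # default to the primary
    if not primary_available and model.startswith("claude"):
        return FALLBACK
    return model
```

    A lookup table like this is enough for four models; the point is that routing lives in one place, so swapping a model touches one line, not every workflow.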
    

    What You Should Take Away for Your Company

    1. Test with your real use cases. Benchmarks are nice, but only your own tasks reveal the difference.
    2. One model isn't enough. We use 3–4 models for different purposes. That's not a bug, it's a strategy.
    3. MCP is becoming the standard. Invest in tool connectivity now – regardless of which model you use.
    4. Data privacy isn't a luxury. Check where your data is processed before deploying a model in production.
    5. Switch when it makes sense. Loyalty to an AI provider is wasted energy. Use what works.

    Conclusion: It's Not About the "Best" Model

    There's no objectively best LLM. There's only the best LLM for your work. For us, that's Claude – because of coding quality, MCP integration, writing style and EU hosting.

    But if Google releases Gemini with native MCP support and EU hosting tomorrow? We'll test it the same day. Tool agnosticism is the only sustainable approach.

    The future doesn't belong to one model – it belongs to the open protocol that connects them all. And that's MCP.

    → Learn more about our AI services
    → GTM tech stack with Claude & MCP
