The History of AI, Part 2: The Language Revolution (2018–2020)

    Till Freitag · 10 August 2025 · 2 min read

    TL;DR: "BERT and GPT showed two paths – but both proved: machines can understand and generate language."

    — Till Freitag

    The Transformer Architecture Unleashed

    After the Transformer architecture was introduced in 2017, a race began. Two approaches emerged – and both fundamentally changed the AI world.

    2018: BERT – Google Understands Context

    In October 2018, Google released BERT (Bidirectional Encoder Representations from Transformers). The trick: BERT reads text in both directions simultaneously and thereby understands context better than anything before it.

    An Example

    The sentence: "I went to the bank to deposit my check."

    • Before: Models struggled with whether "bank" meant a financial institution or a river bank
    • BERT: Understands through context ("deposit," "check") that it's about a financial institution
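The intuition above can be sketched in a few lines. This is a toy illustration of why *bidirectional* context helps, not how BERT actually works – real BERT learns such associations from masked-language-model pretraining, and the cue-word lists below are invented for the example:

```python
# Toy word-sense disambiguation: score "bank" using context words on
# BOTH sides of the target word (the bidirectional idea).
# The cue-word lists are invented for this illustration; real BERT
# learns such associations during masked-language-model pretraining.

FINANCE_CUES = {"deposit", "check", "money", "account", "loan"}
RIVER_CUES = {"water", "fish", "shore", "boat", "muddy"}

def disambiguate_bank(sentence: str) -> str:
    words = [w.strip(".,").lower() for w in sentence.split()]
    idx = words.index("bank")
    # Bidirectional: look at words before AND after the target.
    context = set(words[:idx]) | set(words[idx + 1:])
    finance_score = len(context & FINANCE_CUES)
    river_score = len(context & RIVER_CUES)
    return "financial institution" if finance_score >= river_score else "river bank"

print(disambiguate_bank("I went to the bank to deposit my check."))
# → financial institution
```

A strictly left-to-right model that has only read "I went to the bank" never sees the disambiguating words "deposit" and "check" – that is exactly the gap bidirectional encoding closes.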

    Google integrated BERT directly into Search – the biggest algorithm leap in years. Suddenly Google understood what you mean, not just what you type.

    2019: GPT-2 – "Too Dangerous to Release"

    OpenAI released GPT-2 in February 2019 – but only partially. They initially withheld the full model, arguing it was too dangerous for the public. The fear: mass-generated fake content.

    GPT-2 could write astonishingly coherent texts. Entire news articles, stories, even simple programming tasks. 1.5 billion parameters – unimaginably large at the time.

    The Debate Begins

    The GPT-2 controversy marked the beginning of a discussion that continues to this day:

    • Safety vs. Openness – Who decides what's "too dangerous"?
    • Dual Use – Every AI capability can be useful or harmful
    • Developer Responsibility – OpenAI became the center of this debate

    2020: GPT-3 – The Paradigm Shift

    In June 2020, GPT-3 appeared with 175 billion parameters – over 100x larger than GPT-2. And suddenly it became clear: scaling alone produces emergent capabilities.

    GPT-3 could do things that nobody had explicitly trained it to do:

    • Write programming code
    • Translate between languages
    • Solve mathematical problems
    • Compose creative texts in various styles
    • Learn from just a few examples (few-shot learning)
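Few-shot learning means the model picks up a task from examples placed directly in the prompt, with no weight updates at all. A minimal sketch of how such a prompt might be assembled – the format and the example pairs are illustrative, not an official OpenAI convention:

```python
# Few-shot prompting: the "training" happens entirely inside the prompt.
# The model sees a handful of input/output pairs and continues the
# pattern. The prompt format and example pairs are illustrative.

def build_few_shot_prompt(task, examples, query):
    lines = [task, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # the model completes the text from here
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to German.",
    [("cheese", "Käse"), ("house", "Haus")],
    "tree",
)
print(prompt)
```

The string ends mid-pattern after `Output:`, so a model trained purely on next-token prediction is nudged to produce the answer – no fine-tuning, no gradient step, just context.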

    The Scaling Hypothesis

    Model   Parameters   Year   Capabilities
    GPT-1   117M         2018   Simple text completion
    GPT-2   1.5B         2019   Coherent paragraphs
    GPT-3   175B         2020   Code, translation, reasoning

    The message was clear: More parameters = more capabilities. The so-called scaling hypothesis became the driving force of the entire industry.

    GitHub Copilot – AI Becomes a Tool

    At the end of 2020, development of GitHub Copilot began, based on GPT-3 technology (later the code-specialized Codex model). For the first time, a large language model was directly integrated into a product that millions of people use daily.

    Copilot showed: AI is no longer a future concept. It sits in your editor and writes code with you.

    What We Learn from This Era

    The years 2018–2020 brought three fundamental insights:

    1. Language is the key – Whoever masters language can master almost anything
    2. Scaling works – Larger models can do qualitatively new things
    3. AI becomes product – From research to everyday work

    But the truly big things were still to come.


    Continue with Part 3: The ChatGPT Moment – AI Reaches the World (2022–2023)
