The History of AI, Part 1: When Machines Learned to See and Play (2012–2017)

    June 15, 2025 · 3 min read
    Till Freitag

    TL;DR: "The AI revolution didn't start with ChatGPT – it started in 2012, in research labs that hardly anyone knew about."

    — Till Freitag

    The Starting Gun: Deep Learning Becomes Real

    The AI revolution didn't start with ChatGPT. It started quietly – in research labs and at conferences that hardly anyone outside the tech bubble knew about. But between 2012 and 2017, the foundations were laid on which everything is built today.

    2012: AlexNet and the ImageNet Moment

    In September 2012, a neural network called AlexNet won the ImageNet competition, and not by a narrow margin: its lead was so dramatic that it shook the entire computer vision community. The top-5 error rate dropped from about 26% (26.2% for the runner-up) to about 16% (15.3% for AlexNet).

    What was new? AlexNet trained a deep convolutional neural network on GPUs, at a scale that was impractical on CPUs. What previously took weeks now took days. Deep learning was suddenly no longer theory, but practice.

    Why This Mattered

    • Proved that deep neural networks work on large, real-world datasets
    • Established GPUs as training hardware
    • Triggered billions in AI research investment

    2014–2015: GANs and the Creative Machine

    Ian Goodfellow and colleagues introduced Generative Adversarial Networks (GANs) in 2014 – two neural networks playing against each other. One, the generator, produces images; the other, the discriminator, judges whether they look real. The result: machines that appeared creative for the first time.

    The first GAN images were blurry and eerie. But the concept was groundbreaking – and laid the foundation for everything that later came with DALL-E, Midjourney, and Stable Diffusion.
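
    The two-network game described above is usually written as a minimax objective. As a sketch in the notation of the original GAN paper (G is the generator, D the discriminator, x a real image, z random noise fed to the generator):

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

    D is rewarded for telling real images from generated ones, while G is rewarded for making that distinction as hard as possible.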

    2016: AlphaGo Defeats the World Champion

    In March 2016, Google DeepMind's AlphaGo defeated Go world champion Lee Sedol 4–1 in a five-game match. This wasn't an ordinary computer victory over a human. Go was considered too complex for brute-force computation: it has more possible positions than there are atoms in the observable universe.

    AlphaGo used a combination of deep learning and reinforcement learning, together with tree search. In Game 2, the AI made a move (Move 37) that almost no human player would have made, and won the game. It was the moment when it became clear: AI can do more than calculate; it can simulate intuition.

    "After humanity spent thousands of years refining the game of Go, the machine comes along and says: actually, you've been playing it wrong." – Fan Hui, European Go champion

    2017: Attention Is All You Need

    In June 2017, a Google team published the paper "Attention Is All You Need", introducing the Transformer architecture. Few research papers since then have changed the field as much.

    What Makes Transformers Special?

    Before (RNNs/LSTMs)       | Transformer
    Sequential processing     | Parallel processing
    Slow training             | Fast training on GPUs
    Forgets in long texts     | Attention across the entire text
    Limited scaling           | Scales with more data & compute
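
    To make "attention across the entire text" concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer, in plain Python. The function names and the toy input are illustrative assumptions; real implementations use matrix libraries, multiple attention heads, and learned projections for queries, keys, and values.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (illustrative sketch)."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Each query is scored against every key, so every position
        # can look at the entire sequence in one step.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of all value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical toy input: three "tokens" with two dimensions each.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
```

    No query depends on the result of another query, so the loop over positions can run in parallel. That independence is what the "parallel processing" row refers to; an RNN, by contrast, has to process position t before position t+1.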

    The Transformer is the architecture behind GPT, BERT, Claude, Gemini, LLaMA, and virtually every modern language model. Without this paper, there would be no ChatGPT.

    What We Learn from This Era

    The years 2012–2017 were the foundational research phase. Few outside of research suspected what was brewing. But three patterns emerged:

    1. Hardware drives progress – GPUs made deep learning possible in the first place
    2. Architecture innovations change everything – AlexNet, GANs, Transformers
    3. Scaling works – more data + more compute = better results

    This insight – that you can simply "build bigger" – became the guiding idea of the years to come.


    Continue with Part 2: The Language Revolution – When Machines Learned to Read and Write (2018–2020)
