Origins of AI and Neural Networks
Overview
Artificial Intelligence (AI) didn’t appear overnight. It’s the result of decades of theoretical exploration, failed attempts, hard-won breakthroughs, and, more recently, exponential progress. This lesson traces the origins of AI, with a focus on neural networks, the architecture behind much of today’s generative and predictive power.
1. The Birth of the Idea (1940s–1950s)
Alan Turing (1950) – The Turing Test
- In “Computing Machinery and Intelligence,” Turing asked, “Can machines think?”
- He proposed the Imitation Game, now known as the Turing Test, which became a long-standing benchmark for machine intelligence.
McCulloch & Pitts (1943) – First Neural Network Model
- Proposed a simplified model of the brain using binary thresholds.
- Their artificial neuron could compute basic logical operations such as AND, OR, and NOT, as sketched below.
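The McCulloch–Pitts unit sums its binary inputs and fires only if the total reaches a fixed threshold. A minimal sketch of the idea (the weights and thresholds here are illustrative values, not taken from the original paper):

```python
# Minimal McCulloch-Pitts style threshold neuron (illustrative values).
def mcp_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# AND gate: both inputs must be active to reach the threshold of 2.
AND = lambda a, b: mcp_neuron([a, b], weights=[1, 1], threshold=2)
# OR gate: a single active input is enough to reach the threshold of 1.
OR = lambda a, b: mcp_neuron([a, b], weights=[1, 1], threshold=1)

print([AND(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
print([OR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])   # [0, 1, 1, 1]
```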
Donald Hebb (1949) – Hebbian Learning
- Introduced the concept that “neurons that fire together wire together.”
- This became the basis for unsupervised learning rules.
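Hebb’s rule can be written as a one-line weight update: a connection strengthens in proportion to how often the units on either side of it are active together. A minimal sketch with made-up activity data and learning rate:

```python
import numpy as np

# Hebbian update: delta_w = learning_rate * pre_activity * post_activity.
# Units that are repeatedly co-active end up with a stronger connection.
rng = np.random.default_rng(0)
learning_rate = 0.1
w = 0.0  # single weight between a pre- and a post-synaptic unit

for _ in range(100):
    pre = rng.integers(0, 2)          # binary pre-synaptic activity
    post = pre                        # post unit fires whenever pre fires (perfect correlation)
    w += learning_rate * pre * post   # weight grows only when both fire together

print(w)  # grows roughly in proportion to the number of co-activations
```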
2. Early AI Hype and the First Winter (1956–1970s)
Dartmouth Conference (1956) – Birth of AI as a Field
- Organized by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon.
- Set the tone for decades: lofty promises and underwhelming delivery.
Perceptron (1958) – Frank Rosenblatt
- A single-layer neural network for binary classification.
- Briefly funded by the U.S. Navy, hyped as a path to general intelligence.
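Rosenblatt’s learning rule adjusts the weights only when a prediction is wrong, and it converges whenever the classes are linearly separable. A minimal sketch trained on the (linearly separable) OR function, with illustrative hyperparameters:

```python
import numpy as np

# Rosenblatt's perceptron learning rule on the linearly separable OR function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

w = np.zeros(2)
b = 0.0
lr = 0.1

for _ in range(20):                        # a few passes over the data
    for xi, target in zip(X, y):
        pred = int(np.dot(w, xi) + b > 0)  # step activation
        error = target - pred
        w += lr * error * xi               # update only when the prediction is wrong
        b += lr * error

print([int(np.dot(w, xi) + b > 0) for xi in X])  # [0, 1, 1, 1]
```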
AI Winter I (1973–1980)
- Minsky and Papert’s 1969 book “Perceptrons” showed that single-layer models couldn’t solve linearly inseparable problems such as XOR, and funding for neural network research dried up.
3. Neural Networks Reemerge (1980s–1990s)
Backpropagation (1986) – Rumelhart, Hinton, Williams
- A training method that allowed multi-layer networks to learn.
- Solved the XOR problem and revived interest in neural networks.
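A tiny two-layer network trained with backpropagation can learn XOR, the very function that defeated the single-layer perceptron. A minimal NumPy sketch; the hidden size, learning rate, and iteration count are arbitrary choices:

```python
import numpy as np

# Tiny 2-4-1 network trained with backpropagation on XOR (illustrative hyperparameters).
rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the output error back through both layers
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # typically approaches [0, 1, 1, 0]
```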
Hopfield Networks (1982) and Boltzmann Machines (1985)
- Introduced concepts of recurrent neural networks and energy-based models.
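A Hopfield network stores patterns in symmetric weights and recalls them by updating units until the state settles into a low-energy attractor. A minimal sketch that stores one pattern and recovers it from a corrupted copy (pattern and corruption chosen for illustration):

```python
import numpy as np

# Hopfield network: store one bipolar pattern via the outer-product (Hebbian) rule,
# then recover it from a corrupted version by repeated thresholded updates.
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0)                     # no self-connections

state = pattern.copy()
state[:3] *= -1                            # flip a few bits to corrupt the memory
for _ in range(5):                         # synchronous updates until stable
    state = np.where(W @ state >= 0, 1, -1)

print(np.array_equal(state, pattern))      # True: the stored pattern is recalled
```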
AI Winter II (Late 1980s–1990s)
- Neural nets were computationally expensive and underperformed expectations on the hardware of the time.
- Symbolic AI dominated (expert systems, rule-based logic).
4. The Deep Learning Era (2006–Present)
Geoffrey Hinton’s Breakthrough (2006)
- Published, with Simon Osindero and Yee-Whye Teh, a fast layer-by-layer method for training deep belief networks.
- This kicked off the modern deep learning boom.
ImageNet Moment (2012) – AlexNet
- A deep convolutional neural network outperformed all rivals in the ImageNet competition.
- Trained on GPUs. Massive leap in computer vision.
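The building block behind AlexNet is the convolution: a small filter slides across the image and responds wherever the local patch matches it. A minimal sketch with a toy image and a hand-made vertical-edge filter (all values illustrative):

```python
import numpy as np

# What a single convolutional filter does: slide a small kernel over the image
# and record how strongly each patch matches it (a vertical-edge detector here).
image = np.zeros((6, 6)); image[:, 3:] = 1.0     # toy image: dark left half, bright right half
kernel = np.array([[-1, 1], [-1, 1]], dtype=float)

out = np.zeros((5, 5))
for i in range(5):
    for j in range(5):
        out[i, j] = np.sum(image[i:i+2, j:j+2] * kernel)

print(out)  # responds strongly (value 2) along the vertical edge, zero elsewhere
```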
Transformer Architecture (2017) – “Attention Is All You Need”
- Introduced by Vaswani et al. at Google.
- Replaced recurrence with self-attention.
- The core of modern large language models (e.g., GPT, BERT).
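The core operation is scaled dot-product attention: every position scores every other position, and the scores (after a softmax) weight a mix of value vectors. A single-head, unmasked NumPy sketch with made-up dimensions:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — each position attends to every position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # pairwise similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over key positions
    return weights @ V                                    # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                                   # toy sizes
x = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8): one contextualized vector per input position
```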
GPT Series (2018–2023)
- OpenAI’s Generative Pre-trained Transformers:
  - GPT-1 (2018): Proof of concept.
  - GPT-2 (2019): Initially withheld, then released in stages, due to misuse concerns.
  - GPT-3 (2020): 175B parameters; strong few-shot and zero-shot task performance.
  - GPT-4 (2023): Multimodal reasoning, more reliable outputs.
5. Where We Are Now
Neural networks now power:
- Recommendation systems
- Autonomous vehicles
- Voice assistants
- Image & video synthesis (e.g., Stable Diffusion, RunwayML)
- Large language models and assistants (ChatGPT, Claude, Gemini)