3 Historical Roots (1950–2015)
AI safety is not new: the problem was named decades before the methods to address it existed. Three lineages converge into today’s field.
3.1 Control & alignment
Wiener (1960) gave the first clear statement of the control problem: a machine that optimizes a literally specified goal faster than we can follow, one that “we may not know, until too late, when to turn off” (Wiener, 1960). Good (1965) added the intelligence explosion, in which an ultraintelligent machine is the last invention humanity need make “provided the machine is docile enough… to keep under control” (Good, 1965). This idea is the direct ancestor of today’s arguments about recursive self-improvement.
3.2 System-safety engineering
Long before modern ML, Leveson established that safety is a property of the whole system, not of any single component: it comes from end-to-end hazard analysis, not from a safe algorithm in isolation (Leveson, 1995). This is the root of today’s safety cases and defense-in-depth (Dobbe, 2022).
3.3 Philosophy & x-risk
Omohundro’s basic AI drives (Omohundro, 2008) and Bostrom’s instrumental convergence thesis (Bostrom, 2014) argued that capable agents tend to converge on subgoals such as self-preservation and resource acquisition across a wide range of final goals.
Cultural background. Fiction framed these ideas long before the research did: Asimov’s Three Laws of Robotics (1942), constraint-based safety; 2001: A Space Odyssey / HAL 9000 (1968), an agent pursuing its objective to lethal ends; The Terminator / Skynet (1984), loss of control and runaway capability; Ex Machina (2014), containment and deceptive alignment. These are intuition pumps, not engineering, but they shaped how the public frames every problem in this book.
3.4 Handoff to the empirical era
These threads hand off to Concrete Problems in AI Safety (Amodei et al., 2016), which reframed them as tractable empirical ML problems. The modern landscape and its timeline begin there.