nn_timeline

Neural network architectures through time

Author

Surafel M. Lakew

Published May 14, 2026 (updated: May 29, 2026)

Note: The Foundations chapters are available now. Later chapters are being added weekly.

1 Neural network architectures through time

A notes-first walk through the architectures that shaped modern deep learning — from the perceptron to the transformer. Each chapter pairs a science note (motivation, math, intuition) with a reference implementation in the nn_timeline package.

1.1 The timeline

Figure: timeline of architectures, axis from 1960 to today, spanning three eras (Foundations, RNN era, Transformer era). Nodes: 1958 Perceptron; 1986 RNN / BPTT; 1997 LSTM; 2014 Seq2Seq, GRU; 2015 Bahdanau attention, Luong attention; 2017 Transformer; 2018 BERT / GPT. Legend: Foundations, RNN family, Transformer family.

Note: Static MVP, interactivity coming

This sketch is a static placeholder. The roadmap is to make it interactive and dynamic:

  • Hover over or click a node to expand its mechanism, math, and a minimal sketch without leaving the page (better recall, less navigation friction).
  • Sub-branches for parallel innovations (e.g., attention’s split into additive vs. multiplicative; positional encoding variants; LLaMA-style stack).
  • Evolving updates: new architectures are appended automatically as the registry grows; deprecated branches are greyed out rather than removed, so the reason we moved on is preserved.
  • Cross-links between related nodes (e.g., LSTM gates ↔︎ GRU gates ↔︎ Transformer gating in SwiGLU) to make conceptual lineage visible.
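
One way the "registry" behind these roadmap items could be modelled is sketched below. This is only an illustration under stated assumptions: the names Node, TIMELINE, append_node, deprecate, and render are hypothetical and are not taken from the nn_timeline package. The years and labels are copied from the static timeline above, and the entry used in the usage lines is a made-up placeholder, not a real architecture.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """One entry on the timeline (hypothetical structure, not the package's API)."""
    year: int
    name: str
    deprecated: bool = False  # flagged entries are greyed out, never deleted


# Entries copied from the static timeline; all start as active.
TIMELINE = [
    Node(1958, "Perceptron"),
    Node(1986, "RNN / BPTT"),
    Node(1997, "LSTM"),
    Node(2014, "Seq2Seq"),
    Node(2014, "GRU"),
    Node(2015, "Bahdanau attention"),
    Node(2015, "Luong attention"),
    Node(2017, "Transformer"),
    Node(2018, "BERT / GPT"),
]


def append_node(timeline, node):
    """Return a new list with `node` added, kept in chronological order.

    sorted() is stable, so entries sharing a year keep their existing order.
    """
    return sorted([*timeline, node], key=lambda n: n.year)


def deprecate(timeline, name):
    """Return a new list where the named entry is flagged instead of removed."""
    return [replace(n, deprecated=True) if n.name == name else n for n in timeline]


def render(timeline):
    """Produce one text line per entry; deprecated entries are marked, not dropped."""
    return [
        f"{n.year} {n.name}" + (" (deprecated)" if n.deprecated else "")
        for n in timeline
    ]


if __name__ == "__main__":
    # "hypothetical-arch" is a placeholder used only to exercise the helpers.
    grown = append_node(TIMELINE, Node(2016, "hypothetical-arch"))
    for line in render(deprecate(grown, "hypothetical-arch")):
        print(line)
```

Returning new lists rather than mutating TIMELINE keeps earlier states of the timeline available, which fits the stated goal of preserving history instead of erasing it.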

1.2 How to read this book

  • Start with Foundations for the math and intuition.
  • Each later chapter assumes the previous one; cross-references point back when a derivation depends on earlier material.
  • Code references in each note point to the corresponding nn_timeline.layers.* or nn_timeline.archs.* module.

1.3 Project