19  Numerical walkthrough — perceptron updates on AND and XOR

← Back to: The Artificial Neuron

TipTL;DR

Watch \(\mathbf{w}\) and \(b\) move from \((0, 0, 0)\) toward a separating boundary on AND by hand. Then watch the same arithmetic rotate forever on XOR. Same rule; different fate, because the data shape differs — not the algorithm.

19.1 Setup

\[ X = \begin{bmatrix} 0,0 \\ 0,1 \\ 1,0 \\ 1,1 \end{bmatrix}, \quad \eta = 1, \quad \text{init: } \mathbf{w} = (0,0),\ b = 0. \]

Update rule, applied per sample \((\mathbf{x}_i, y_i)\):

\[ \text{if } y_i(\mathbf{w}^\top\mathbf{x}_i + b) \le 0:\quad \mathbf{w} \mathrel{+}= y_i \mathbf{x}_i,\quad b \mathrel{+}= y_i \]

19.2 AND — converges

Labels \(y = (-1, -1, -1, +1)\).

19.2.1 Epoch 1

Sample \(y\) \(\mathbf{w}^\top\mathbf{x}+b\) \(y(\mathbf{w}^\top\mathbf{x}+b)\) Mistake? New \(\mathbf{w}\) New \(b\)
(0,0) −1 0 0 yes (0,0) −1
(0,1) −1 −1 1 no (0,0) −1
(1,0) −1 −1 1 no (0,0) −1
(1,1) +1 −1 −1 yes (1,1) 0

19.2.2 Epoch 2

Sample \(y\) \(\mathbf{w}^\top\mathbf{x}+b\) \(y(\mathbf{w}^\top\mathbf{x}+b)\) Mistake? New \(\mathbf{w}\) New \(b\)
(0,0) −1 0 0 yes (1,1) −1
(0,1) −1 0 0 yes (1,0) −2
(1,0) −1 −1 1 no (1,0) −2
(1,1) +1 −1 −1 yes (2,1) −1

By epoch ~5 the rule settles into a separator that classifies all four corners correctly. The figure below visualizes the boundary at three checkpoints.

Code
import numpy as np, matplotlib.pyplot as plt

X = np.array([[0,0],[0,1],[1,0],[1,1]], dtype=float)
y = np.array([-1,-1,-1,+1])

w = np.zeros(2); b = 0.0
snapshots = {"init (epoch 0)": (w.copy(), b)}
for epoch in range(1, 6):
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:
            w += yi * xi
            b += yi
    if epoch in (1, 5):
        snapshots[f"end of epoch {epoch}"] = (w.copy(), b)

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
for ax, (title, (ww, bb)) in zip(axes, snapshots.items()):
    for xi, yi in zip(X, y):
        c = "#d97706" if yi < 0 else "#2563eb"
        m = "o"      if yi < 0 else "^"
        ax.scatter(*xi, c=c, marker=m, s=120, edgecolor="k", zorder=3)
    if abs(ww[1]) > 1e-9:
        xs = np.array([-0.3, 1.3])
        ys = -(ww[0]*xs + bb) / ww[1]
        ax.plot(xs, ys, "k--", lw=1.5)
    ax.set_xlim(-0.3, 1.3); ax.set_ylim(-0.3, 1.3)
    ax.set_aspect("equal"); ax.grid(alpha=0.3)
    ax.set_title(f"{title}\nw={tuple(ww.astype(int))}, b={int(bb)}", fontsize=10)
plt.tight_layout(); plt.show()

Orange circles are negatives; blue triangles are the single positive. The dashed line \(\mathbf{w}^\top\mathbf{x} + b = 0\) rotates toward the misclassified positive and away from misclassified negatives, until it lands on a side of the unit square that puts (1,1) alone above and the other three below.

19.3 XOR — never converges

Same \(X\), but labels \(y = (-1, +1, +1, -1)\) — the diagonals are positive, the antidiagonals are negative.

19.3.1 Epoch 1

Sample \(y\) \(\mathbf{w}^\top\mathbf{x}+b\) \(y(\mathbf{w}^\top\mathbf{x}+b)\) Mistake? New \(\mathbf{w}\) New \(b\)
(0,0) −1 0 0 yes (0,0) −1
(0,1) +1 −1 −1 yes (0,1) 0
(1,0) +1 0 0 yes (1,1) 1
(1,1) −1 3 −3 yes (0,0) 0

After four mistakes in one epoch, \((\mathbf{w}, b)\) has returned to \((0, 0, 0)\). Subsequent epochs repeat the cycle indefinitely.

The geometry tells the same story: no straight line through the unit square can put (0,0) and (1,1) on one side while putting (0,1) and (1,0) on the other.

19.4 What this table teaches

  • The rule updates only on mistakes. Correctly classified samples leave \(\mathbf{w}, b\) untouched.
  • The boundary rotates predictably: toward misclassified positives, away from misclassified negatives.
  • Convergence depends on the data, not the algorithm. Linearly separable data → finite mistake count (Novikoff bound). Non-separable data → indefinite oscillation, often cyclic.
  • The arithmetic is concrete and small. Every step is an addition or comparison on a 2-dimensional vector. The “mystery” of learning, in this case, is just bookkeeping.

← Back to: The Artificial Neuron