Lesson 00 · Prerequisites
Eigenvectors, eigenvalues and the Jacobian — with the right pictures
The two ideas from algebra that hold up the whole notebook: the "own" directions of a transformation, and its local linear imitation. Each concept comes with a famous metaphor to keep in your pocket.
Eigenvectors and eigenvalues: the rails of the transformation
A matrix \(A\) transforms the entire plane in one stroke: every vector gets moved, rotated, stretched. The question that defines eigenvectors is: is there any direction the transformation respects? That is, a nonzero vector \(\mathbf v\) that is not deflected, but only slides along its own line:
$$A\,\mathbf v \;=\; \lambda\,\mathbf v$$
Two objects, two different questions, and this is the point that often gets muddled: the eigenvector \(\mathbf v\) answers "which direction is not deflected?", while the eigenvalue \(\lambda\) answers "by how much does the map stretch along that direction?".
Three famous pictures
1 · The Earth's axis. The Earth's rotation moves everything: Rome travels, New York travels, the whole ocean travels. The only direction that goes nowhere is the axis: the poles stay put. The axis is the eigenvector of the rotation matrix, with eigenvalue \(\lambda = 1\) (it doesn't even stretch). It's a theorem by Euler: every 3D rotation has an axis, which translates to: every 3D rotation matrix has a real eigenvector with \(\lambda=1\). Not so in 2D: a rotation of the plane (by anything other than 0° or 180°) leaves no direction fixed (see below).
2 · Dough under the rolling pin. Roll the dough out in one direction: it stretches ×2 along the rolling direction and shrinks ×0.7 in the perpendicular one. If you had drawn lots of little arrows on the dough beforehand, almost all of them rotate as you roll (they get dragged toward the stretch direction); the only ones that don't rotate are those along the rolling direction and those perpendicular to it. Those two directions are the eigenvectors; 2 and 0.7 are the eigenvalues.
3 · The sheared Mona Lisa. The most famous figure on Wikipedia about this topic: apply a shear to the painting and everything tilts — except the vector along the base, which stays exactly where it was (\(\lambda=1\)). Fun fact: in a shear there is only one rail (a "defective" matrix: two equal eigenvalues, a single independent eigenvector) — try the preset below.
Eigen in German means "own, characteristic": eigenvectors are the transformation's own directions, the ones that belong to it.
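The dough and the shear can be checked by hand with a few lines of plain Python. This is a minimal sketch, not a general eigen-solver: the helper names (`apply`, `is_rail`, `stretch_along`) and the two example matrices are assumptions made for illustration, and the test "is \(A\mathbf v\) parallel to \(\mathbf v\)?" is done with the 2D cross product.

```python
def apply(A, v):
    """Multiply a 2x2 matrix (given as two rows) by a 2-vector."""
    (a, b), (c, d) = A
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def is_rail(A, v, tol=1e-12):
    """True if A v stays on the line through v, i.e. v is an eigenvector."""
    w = apply(A, v)
    # 2D cross product: zero exactly when v and w lie on the same line.
    return abs(v[0] * w[1] - v[1] * w[0]) <= tol

def stretch_along(A, v):
    """For an eigenvector v, the signed factor lambda with A v = lambda v."""
    w = apply(A, v)
    return (w[0] * v[0] + w[1] * v[1]) / (v[0] ** 2 + v[1] ** 2)

dough = ((2.0, 0.0), (0.0, 0.7))  # stretch x2 along x, shrink x0.7 along y
shear = ((1.0, 1.0), (0.0, 1.0))  # horizontal shear: the base stays put

print(is_rail(dough, (1, 0)), stretch_along(dough, (1, 0)))  # True 2.0
print(is_rail(dough, (0, 1)), stretch_along(dough, (0, 1)))  # True 0.7
print(is_rail(dough, (1, 1)))                                # False: this arrow rotates
print(is_rail(shear, (1, 0)), stretch_along(shear, (1, 0)))  # True 1.0
print(is_rail(shear, (0, 1)))                                # False: the shear has a single rail
```

For anything beyond 2×2 toy cases a library routine such as `numpy.linalg.eig` is the usual choice; the point here is only that "being a rail" is a one-line test.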
The λ bestiary
| Eigenvalue | What it does to its rail | Picture |
|---|---|---|
| \(\lambda > 1\) | stretches | the rolling pin |
| \(0 < \lambda < 1\) | compresses | the spring shortening |
| \(\lambda < 0\) | flips the direction (the line stays put) | the rubber band snapping to the other side |
| \(\lambda = 0\) | annihilates the direction (everything on that rail goes to the origin) | the steamroller, and indeed \(\det A = 0\): not invertible |
| complex \(\alpha \pm \beta i\) | no real rail: everything rotates | the 2D rotation — every direction turns |
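The rows of the bestiary can be reproduced for any 2×2 matrix from its characteristic polynomial \(\lambda^2 - \operatorname{tr}(A)\,\lambda + \det A = 0\). The sketch below is an illustration under that assumption; the function names and the labels returned by `describe` are hypothetical, and the extra label for \(\lambda = 1\) is added only so that the shear and the Earth's axis have somewhere to land.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A)*lambda + det(A) = 0 for a 2x2 matrix."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    # cmath.sqrt copes with a negative discriminant (the rotation case).
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

def describe(lam, tol=1e-9):
    """Map one eigenvalue to its row in the bestiary."""
    if abs(lam.imag) > tol:
        return "complex: no real rail"
    x = lam.real
    if abs(x) <= tol:
        return "annihilates"
    if x < 0:
        return "flips"
    if x > 1 + tol:
        return "stretches"
    if x < 1 - tol:
        return "compresses"
    return "leaves unchanged"

examples = {
    "dough":       ((2.0, 0.0), (0.0, 0.7)),
    "reflection":  ((1.0, 0.0), (0.0, -1.0)),
    "steamroller": ((1.0, 0.0), (0.0, 0.0)),
    "rotation 90": ((0.0, -1.0), (1.0, 0.0)),
}
for name, A in examples.items():
    print(name, [describe(lam) for lam in eigenvalues_2x2(A)])
```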
A bonus to cement the "eigen" family: PCA takes the eigenvectors of the covariance of a swarm of points (the axes of the swarm); PageRank is the dominant eigenvector of the web graph — Google built an empire on it; a crystal glass shatters at its note because it has eigenfrequencies. Every time you hear "eigen-something", the underlying question is always the same: what stays itself under this operation?
The Jacobian: the local flat map
One-line refresher: the derivative is a zoom. A smooth function, magnified enough around a point, becomes indistinguishable from its tangent line. The Jacobian is the exact same idea for \(f:\mathbb R^n \to \mathbb R^m\): zoom in enough on the deformation around a point \(\mathbf p\) and it becomes indistinguishable from a linear map. That linear map is \(J(\mathbf p)\):
$$J_{ij} \;=\; \frac{\partial f_i}{\partial x_j} \qquad\qquad f(\mathbf p + \boldsymbol\delta) \;\approx\; f(\mathbf p) + J(\mathbf p)\,\boldsymbol\delta$$
Row \(i\) lists how strongly output \(i\) reacts to each input: the Jacobian is the sensitivity table of \(f\) at that point. But the pictures that stick are these two:
1 · The neighborhood map. Projecting the (curved) Earth onto a (flat) sheet is a nonlinear map — the perfect world map doesn't exist. Yet the map of your neighborhood works just fine: at small scale the projection is approximated beautifully by a linear map. Every point on the planet has its own "local flat map", and that map is the Jacobian of the projection at that point.
2 · The funhouse mirror. Your full figure is monstrous: short legs, pear-shaped head — a decidedly nonlinear transformation. But a postage stamp of skin, a mole and its surroundings, is simply stretched, rotated and sheared: an affine transformation. The funhouse mirror is a collection of Jacobians, one for each point of your silhouette.
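The "zoom in until it looks linear" claim can be tested numerically. The sketch below estimates \(J(\mathbf p)\) by central differences and compares \(f(\mathbf p + \boldsymbol\delta)\) with \(f(\mathbf p) + J(\mathbf p)\,\boldsymbol\delta\). The map `f` is a made-up funhouse mirror chosen only for illustration, and the step size `h` is an assumption that works for well-scaled inputs.

```python
import math

def f(p):
    """A hypothetical nonlinear 'funhouse mirror' map of the plane."""
    x, y = p
    return (x + 0.3 * math.sin(y), y + 0.2 * x * x)

def jacobian(f, p, h=1e-6):
    """Central-difference estimate of J[i][j] = d f_i / d x_j at p."""
    n = len(p)
    cols = []
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = f(plus), f(minus)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # cols[j][i] holds d f_i / d x_j; transpose so that rows are outputs.
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

p = (1.0, 0.5)
delta = (1e-3, -2e-3)  # a small step: the "postage stamp" around p
J = jacobian(f, p)

exact = f((p[0] + delta[0], p[1] + delta[1]))
base = f(p)
linear = tuple(base[i] + J[i][0] * delta[0] + J[i][1] * delta[1] for i in range(2))
error = max(abs(e - l) for e, l in zip(exact, linear))

print(J)      # close to [[1, 0.3*cos(0.5)], [0.4, 1]]
print(error)  # second order in |delta|, so far smaller than |delta| itself
```

Halving `delta` should cut the error roughly by four, which is the numerical signature of a first-order (linear) approximation.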
The determinant: how much areas inflate
For a linear map the determinant is the area factor (in 3D: the volume factor): take the unit square, apply \(A\), you get a parallelogram; \(|\det A|\) is its area. Rolling pin ×2 in one direction and ×1.5 in the other → every little patch of area ×3. In Fig. 1 it is the parallelogram with the "F" inside, and its value is in the readout.
With the rails picture the fact that \(\det A = \lambda_1 \lambda_2 \cdots\) becomes obvious: you stretch ×\(\lambda_1\) along one rail and ×\(\lambda_2\) along the other, so the area goes ×\(\lambda_1\lambda_2\). (The identity holds in general, counting repeated and complex eigenvalues; the rails just make it visible.) And the two pathological cases have a precise physical meaning:
| Case | Picture | Meaning |
|---|---|---|
| \(\det = 0\) | the steamroller: it squashes the plane onto a line (one eigenvalue is 0) | not invertible — you can't un-squash a crêpe: two points that ended up in the same place can no longer be told apart |
| \(\det < 0\) | the glove turned inside out: the right hand in the mirror becomes a left one | the area is still there (\(|\det|\)) but the orientation has flipped — in Fig. 1 the "F" mirrors itself with the "Reflection" preset |
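The area-factor reading of the determinant, including the two cases in the table, can be checked by pushing the unit square through a matrix and measuring the result with the shoelace formula. This is a small sketch with hypothetical helper names; the signed area is positive when the vertices still run counter-clockwise and negative when the orientation has flipped.

```python
import math

def apply(A, v):
    """Multiply a 2x2 matrix (given as two rows) by a 2-vector."""
    (a, b), (c, d) = A
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

def signed_area(poly):
    """Shoelace formula: positive for counter-clockwise vertex order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counter-clockwise, area 1

def image_area(A):
    """Signed area of the parallelogram that A makes out of the unit square."""
    return signed_area([apply(A, v) for v in UNIT_SQUARE])

rolling = ((2.0, 0.0), (0.0, 1.5))      # x2 and x1.5: area x3
steamroller = ((1.0, 0.0), (0.0, 0.0))  # squashes the plane onto a line
mirror = ((-1.0, 0.0), (0.0, 1.0))      # reflection: orientation flips
generic = ((2.0, 1.0), (1.0, 3.0))      # symmetric, so two real rails

for A in (rolling, steamroller, mirror, generic):
    print(det2(A), image_area(A))       # the two numbers agree

# Product of the eigenvalues of `generic`, from lambda^2 - tr*lambda + det = 0.
tr, det = 5.0, det2(generic)
disc = math.sqrt(tr * tr - 4 * det)
print(((tr + disc) / 2) * ((tr - disc) / 2))  # matches det2(generic) up to rounding
```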
The determinant of the Jacobian: local area factor
If the Jacobian is the local linear map, its determinant is the local area factor — a different number at every point, telling you how much the map inflates or compresses the little areas there. Three famous anchors:
1 · Mercator's Greenland. On the Mercator world map Greenland looks as big as Africa; in reality it is about 14 times smaller. The reason is that the \(\det J\) of the Mercator projection blows up approaching the poles. The Tissot ellipses (Tissot's indicatrices) printed in atlases are exactly Fig. 2 applied to the world map: equal little circles on the sphere, ever more inflated shapes on the chart. On Mercator, which preserves angles, they stay circles and only grow; on most other projections they become true ellipses.
2 · The pizza slice. In the change to polar coordinates, \(dx\,dy = r\,dr\,d\theta\): that \(r\) is \(|\det J|\). The picture: a "little square" \(\Delta r \times \Delta\theta\) near the edge of the pizza is much bigger than the same little square near the center, because the area depends on where you are, i.e. on \(r\). Every time you change variables in an integral, \(|\det J|\) is there acting as the exchange rate between areas.
3 · Honey on the dough (for the ML part of your brain). Spread a film of honey on the dough and then roll it out: where the dough widens, the film thins exactly by the area factor, because mass is conserved. It's the change of variables for probability densities: with \(\mathbf y = f(\mathbf x)\) and \(J\) the Jacobian of \(f\) at \(\mathbf x\), \(p_Y(\mathbf y) = p_X(\mathbf x)\,/\,|\det J|\). This is the heart of normalizing flows, and the reason a linearly transformed Gaussian stays normalized with the right covariance.
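Two of these anchors are easy to verify in plain Python. The sketch below first checks that the polar-coordinates Jacobian has determinant \(r\) and that summing \(r\,dr\,d\theta\) recovers the area of a disk, then checks the honey rule for a linear map \(\mathbf y = A\mathbf x\) of a standard 2D Gaussian, where \(J = A\) everywhere and the result should be the Gaussian with covariance \(AA^{\top}\). The matrix `A`, the test point `y` and all function names are assumptions picked for illustration.

```python
import math

def polar_det_jacobian(r, theta):
    """det of d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    dx_dr, dx_dt = math.cos(theta), -r * math.sin(theta)
    dy_dr, dy_dt = math.sin(theta), r * math.cos(theta)
    return dx_dr * dy_dt - dx_dt * dy_dr  # simplifies to r

def disk_area(R, n=1000):
    """Midpoint Riemann sum of r dr dtheta over the disk of radius R."""
    dr = R / n
    # The integrand does not depend on theta, so that integral is just 2*pi.
    return sum((i + 0.5) * dr * dr for i in range(n)) * 2 * math.pi

def std_normal_2d(x):
    return math.exp(-0.5 * (x[0] ** 2 + x[1] ** 2)) / (2 * math.pi)

def pushforward_density(A, y):
    """p_Y(y) = p_X(x) / |det J| for y = A x, x standard normal (J = A)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    # x = A^{-1} y, using the closed-form 2x2 inverse.
    x = ((d * y[0] - b * y[1]) / det, (-c * y[0] + a * y[1]) / det)
    return std_normal_2d(x) / abs(det)

def gaussian_density(cov, y):
    """Zero-mean 2D Gaussian density with symmetric covariance cov."""
    (p, q), (_, s) = cov
    det = p * s - q * q
    quad = (s * y[0] ** 2 - 2 * q * y[0] * y[1] + p * y[1] ** 2) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

print(polar_det_jacobian(2.5, 0.8))  # approximately 2.5, i.e. r
print(disk_area(1.0), math.pi)       # the two agree

A = ((2.0, 0.5), (0.0, 0.7))         # hypothetical "rolling" of the dough
(a, b), (c, d) = A
cov = ((a * a + b * b, a * c + b * d), (a * c + b * d, c * c + d * d))  # A A^T
y = (0.9, -0.4)
print(pushforward_density(A, y), gaussian_density(cov, y))  # the two agree
```

The second comparison is the honey picture in numbers: dividing by \(|\det A|\) is exactly what keeps the stretched density integrating to 1.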
Fridge-door summary
| Object | Question it answers | Picture | Where you'll see it again |
|---|---|---|---|
| eigenvector | which direction isn't deflected? | the Earth's axis; the rail | ellipsoid axes = columns of \(R\) |
| eigenvalue | by how much does it stretch along that rail? | the rolling pin ×2 | variances = squared scales \(s^2\) |
| Jacobian | how does it stretch space here? | the neighborhood map; the funhouse mirror | EWA projection, \(\Sigma' = JW\Sigma W^{\!\top}\!J^{\!\top}\) (lesson 05) |
| det (of the Jacobian) | how much do areas inflate here? | Mercator's Greenland; the pizza slice | splat area, changes of variable, anti-aliasing |