Matrices · Chapter 3 of 3
Eigenvectors — Directions That Don't Rotate
Some vectors only stretch — never rotate. Those directions encode the geometry of every matrix.
Principal Component Analysis — the algorithm behind face recognition, genomic analysis, and data visualisation — finds the eigenvectors of a dataset's covariance matrix. Each principal component is an eigenvector, and its eigenvalue is the variance of the data along that direction.
1. Recognise the eigenvector equation Av = λv and identify what each symbol represents: A is the matrix, v is the eigenvector, and λ is the eigenvalue.
2. Explain geometrically what an eigenvector is: a direction that stays on the same line through the origin after the transformation, only changing in length.
3. Interpret the sign and magnitude of an eigenvalue to predict how the transformation acts on its eigenvector: positive stretches or shrinks, negative flips, zero collapses to zero.
4. Explain how PCA uses eigenvectors: the principal components are the eigenvectors of the data's covariance matrix, sorted by eigenvalue from largest to smallest.
Directions That Don't Rotate
Apply a matrix to most vectors and something changes: the vector rotates to a new direction, not just a new length. But a few special directions survive the transformation with their direction intact — they only stretch or shrink, never rotating. These are eigenvectors.
The word “eigen” is German for “own” or “characteristic.” An eigenvector is a matrix’s own direction — the direction the transformation acts along most simply.
The eigenvector equation
If v is an eigenvector of matrix A with eigenvalue λ, then:
Av = λv
Reading this left to right: multiply the matrix A by the vector v, and you get back the same vector v scaled by the scalar λ. The vector stays on its original line through the origin: it keeps its direction when λ is positive and reverses when λ is negative. Its length is multiplied by |λ|.
A is the matrix, v is the eigenvector (a direction preserved by the transformation), and λ is the eigenvalue (the scalar stretch factor). To see this concretely, take A = [[2, 1], [1, 2]] and test three directions. Applying A to (1, 1):
A(1, 1) = (2·1 + 1·1, 1·1 + 2·1) = (3, 3)
The output (3, 3) is exactly 3 times the input (1, 1) — same direction, three times longer. So (1, 1) is an eigenvector with λ = 3. Testing (1, 0): the output is (2, 1), which points in a different direction — not an eigenvector. Testing (1, −1): A(1, −1) = (1, −1), so λ = 1 and the vector is completely unchanged.
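The three tests above can be repeated numerically. The sketch below is only an illustration: `scale_factor` is a hypothetical helper (not part of NumPy) that returns λ when A·v lies on the same line as v and `None` otherwise, and it assumes 2D vectors.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

def scale_factor(A, v, tol=1e-12):
    """Hypothetical helper: return λ if A @ v is a multiple of v, else None (2D only)."""
    v = np.asarray(v, dtype=float)
    w = A @ v
    # In 2D, w is parallel to v exactly when v[0]*w[1] - v[1]*w[0] is zero.
    if abs(v[0] * w[1] - v[1] * w[0]) > tol:
        return None
    # For a parallel pair, (w·v) / (v·v) recovers the scale factor λ.
    return float(w @ v / (v @ v))

results = {d: scale_factor(A, d) for d in [(1, 1), (1, 0), (1, -1)]}
print(results)  # {(1, 1): 3.0, (1, 0): None, (1, -1): 1.0}
```

The direction (1, 0) returns `None` because its image (2, 1) has left the horizontal line.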
What eigenvalues mean
The eigenvalue λ determines the effect on the eigenvector’s length and direction:
| λ value | Effect on eigenvector | Example |
|---|---|---|
| λ > 1 | Stretches in that direction | λ = 2 doubles the length |
| 0 < λ < 1 | Shrinks in that direction | λ = 0.5 halves the length |
| λ = 1 | Eigenvector is fixed, no change | Identity matrix: every nonzero vector is an eigenvector |
| λ = −1 | Flips direction, same length | Reflection through origin along that axis |
| λ = 0 | Collapses to zero — direction is destroyed | Singular matrix: that direction is lost |
Why this matters in machine learning: eigenvalues tell you which directions carry the most information and which carry the least. In Principal Component Analysis, the main application we will develop at the end of this chapter, you keep the eigenvectors with the largest eigenvalues and discard the rest, which gives you the most faithful lower-dimensional version of your data. The same logic reappears when analysing the landscape of a loss function during training. There the relevant matrix is the Hessian, the matrix of second derivatives of the loss: directions with a large eigenvalue curve sharply (take small steps there to avoid overshooting), and directions with an eigenvalue near zero are almost flat (the model can drift along them without much penalty). In both cases the eigenvalue is a measurement of how much “pull” the transformation has along a given direction.
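The remark about step sizes can be made concrete with a toy example. The sketch below assumes a hypothetical quadratic loss L(w) = ½ wᵀHw whose matrix of second derivatives H is diagonal, so its eigenvalues (10 and 0.1) sit on the diagonal; the function name and the learning rates are illustrative choices, not taken from the lesson. For this loss each gradient step multiplies the component along an eigenvector by (1 − lr·λ), so the sharp direction diverges once lr exceeds 2/10.

```python
import numpy as np

# Hypothetical quadratic loss L(w) = 0.5 * w @ H @ w with a diagonal H,
# so the coordinate axes are the eigenvectors and the diagonal holds the eigenvalues.
H = np.diag([10.0, 0.1])  # sharp direction (λ = 10), nearly flat direction (λ = 0.1)

def gradient_descent(H, w0, lr, steps):
    """Plain gradient descent on the quadratic loss; its gradient is H @ w."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * (H @ w)
    return w

w0 = [1.0, 1.0]
w_small = gradient_descent(H, w0, lr=0.15, steps=50)  # lr below 2 / 10
w_large = gradient_descent(H, w0, lr=0.25, steps=50)  # lr above 2 / 10

print(w_small)  # sharp component is essentially zero, flat component has barely moved
print(w_large)  # sharp component has blown up
```

With the smaller step the sharp direction converges quickly while the flat direction drifts slowly towards zero; with the larger step the sharp direction overshoots further on every iteration.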
A diagonal matrix’s eigenvectors are exactly the standard basis vectors. For A = [[3, 0], [0, 1]], multiplying A by (1, 0) gives (3, 0) — same direction, scaled by 3. Multiplying A by (0, 1) gives (0, 1) — unchanged. The diagonal entries 3 and 1 are the eigenvalues. Reading eigenvectors off a diagonal matrix requires no computation at all.
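A quick numerical check of the diagonal case, as a minimal sketch using the same matrix as the paragraph above. The extra test on (1, 1) is an added illustration that an off-axis direction is not preserved by this matrix.

```python
import numpy as np

D = np.diag([3.0, 1.0])        # the diagonal matrix [[3, 0], [0, 1]]
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

out1 = D @ e1                  # (3, 0): same direction as e1, scaled by 3
out2 = D @ e2                  # (0, 1): unchanged
out_diag = D @ np.array([1.0, 1.0])  # (3, 1): no longer along (1, 1)

print(out1, out2, out_diag)

# The eigenvalues are the diagonal entries (sorted here so the comparison
# does not depend on the order in which NumPy returns them).
eigenvalues = np.sort(np.linalg.eigvals(D))
print(eigenvalues)
```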
Not every matrix has real eigenvectors
A pure rotation matrix — one that turns every vector by the same angle θ (with θ ≠ 0° and θ ≠ 180°) — has no real eigenvectors. Every real vector changes direction under such a rotation; there is no stable direction in the plane. The eigenvalues of a rotation matrix are complex numbers that encode the rotation angle — they correspond to circular motion in the complex plane, not to any real direction in 2D space.
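The complex eigenvalues of a rotation can be inspected directly. The sketch below picks θ = 45° as an example angle; for a 2D rotation by θ the eigenvalues are cos θ ± i·sin θ, so their modulus is 1 and their argument is ±θ.

```python
import numpy as np

theta = np.deg2rad(45)  # example angle; any θ other than 0° or 180° behaves the same way
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rotation_eigenvalues = np.linalg.eigvals(R)
moduli = np.abs(rotation_eigenvalues)                     # length of each complex eigenvalue
angles = np.abs(np.angle(rotation_eigenvalues, deg=True)) # argument in degrees, sign dropped

print(rotation_eigenvalues)  # a complex-conjugate pair
print(moduli)                # both 1: a rotation does not stretch
print(angles)                # both 45: the rotation angle
```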
Symmetric matrices (where the entry at row i, column j equals the entry at row j, column i) are special: they always have real eigenvalues, and eigenvectors belonging to different eigenvalues are always perpendicular to each other. The covariance matrices used in PCA, tables that encode how each pair of features in a dataset varies together, are symmetric by construction, and this is exactly why PCA always yields real, perpendicular principal components.
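Both properties of symmetric matrices can be checked on a random example. This is a sketch under assumptions: the 4×4 size and the seed are arbitrary, and the matrix is made symmetric by averaging a random matrix with its transpose.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
M = rng.standard_normal((4, 4))
S = (M + M.T) / 2                   # averaging with the transpose makes S symmetric

# eigh is NumPy's routine for symmetric (Hermitian) matrices.
sym_eigenvalues, sym_eigenvectors = np.linalg.eigh(S)

# Real eigenvalues: the returned array has a real dtype.
print(sym_eigenvalues.dtype)

# Perpendicular unit eigenvectors: VᵀV is the identity matrix.
gram = sym_eigenvectors.T @ sym_eigenvectors
print(np.allclose(gram, np.eye(4)))

# Each column v of the eigenvector matrix satisfies S v = λ v.
print(np.allclose(S @ sym_eigenvectors, sym_eigenvectors * sym_eigenvalues))
```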
PCA: eigenvectors of a dataset
Karl Pearson introduced principal component analysis in 1901 as a method for fitting lines and planes to point clouds in multi-dimensional space. The same mathematical structure now operates on datasets with hundreds of thousands of features — from face-recognition systems that represent faces as high-dimensional vectors and compress them with PCA, to genomic studies where PCA of genetic variation data separates population clusters across continents.
Principal Component Analysis finds the eigenvectors of a dataset’s covariance matrix — the matrix that encodes how much each pair of features varies together. Each eigenvector is a direction in feature space. The corresponding eigenvalue is the variance of the data projected onto that direction: the larger the eigenvalue, the more information that direction carries.
Compressing a 1000-feature dataset to 10 dimensions via PCA retains the 10 eigenvectors with the largest eigenvalues and discards the rest. Among all linear projections onto 10 dimensions, this one gives the best approximation of the mean-centred data in the least-squares sense. In practice, PCA is usually computed via singular value decomposition of the centred data matrix rather than by explicitly forming the covariance matrix, which avoids the loss of numerical precision that forming the product of the data matrix with its transpose can cause.
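The two routes to PCA described above can be compared side by side. The following is a sketch on synthetic data (the shapes, seed, and variable names are assumptions for illustration). It relies on the fact that for a centred data matrix with n rows, the covariance eigenvalues equal the squared singular values divided by n − 1, and the right singular vectors match the covariance eigenvectors up to sign.

```python
import numpy as np

rng = np.random.default_rng(0)                       # arbitrary seed
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # synthetic correlated data
Xc = X - X.mean(axis=0)                              # centre each feature

# Route 1: eigenvectors of the covariance matrix, sorted largest eigenvalue first.
cov = np.cov(Xc, rowvar=False)
evals, evecs = np.linalg.eigh(cov)                   # eigh returns ascending order
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Route 2: SVD of the centred data; rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_variances = s**2 / (len(Xc) - 1)

print(np.allclose(evals, svd_variances))             # same variances
alignment = np.abs(evecs.T @ Vt.T)                   # |cosine| between matching directions
print(np.allclose(alignment, np.eye(5), atol=1e-6))  # same directions up to sign

# Compress to k dimensions by projecting onto the top k directions.
k = 2
Z = Xc @ Vt[:k].T
print(Z.shape)  # (200, 2)
```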
```python
import numpy as np

# Eigen-decomposition of the example matrix
A = np.array([[2, 1], [1, 2]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # [3. 1.] (eig does not guarantee any particular order in general)
print(eigenvectors)  # columns are unit eigenvectors: ±(1/√2)[1, 1] and ±(1/√2)[1, −1]

# PCA via NumPy: eigenvectors of the covariance matrix
# Sample 100 points whose true covariance is [[4, 2], [2, 2]]
data = np.random.randn(100, 2) @ np.linalg.cholesky([[4, 2], [2, 2]]).T
cov = np.cov(data.T)  # np.cov expects one row per feature
eigenvalues_pca, principal_components = np.linalg.eigh(cov)
# eigh is for symmetric matrices: it always returns real eigenvalues,
# in ascending order, so the last column is the first principal component
```

Eigenvectors are not always the x and y axes. They coincide with the axes only for diagonal matrices. For a symmetric matrix like [[2, 1], [1, 2]], the eigenvectors point along the diagonal directions (1, 1) and (1, −1), normalised, not along the coordinate axes. The eigenvectors depend entirely on the specific entries of the matrix.
What is an eigenvector?
An eigenvector of a matrix A is a nonzero vector v such that Av = λv: applying A to v gives back a vector on the same line, scaled by λ. Geometrically, the eigenvector stays on its line through the origin: the transformation only stretches, shrinks, or flips it, never rotating it to a different direction. An n×n matrix has at most n linearly independent eigenvectors.
What does the eigenvalue tell you?
The eigenvalue λ measures how much the eigenvector stretches. λ = 2 doubles the vector's length in that direction. λ = 0.5 halves it. λ = −1 flips the direction without changing length. λ = 0 collapses the eigenvector to zero — the matrix destroys all information in that direction, making it singular.
Does every matrix have eigenvectors?
Every 2×2 matrix has eigenvalues, but they may not be real numbers. A pure rotation matrix (other than 0° or 180°) has complex eigenvalues — no real eigenvector exists because every real vector changes direction under rotation. Symmetric matrices are the important special case: they always have real eigenvalues and their eigenvectors are perpendicular to each other.
Look at the matrix [[2, 1], [1, 2]]. Pick a direction, any direction, and predict whether a vector pointing that way will stay pointing that way after the matrix is applied. Try the horizontal direction (1, 0), the diagonal (1, 1), and the anti-diagonal (1, −1). Which of those three do you think is an eigenvector? Make a prediction before you interact with the playground.
Now predict the eigenvalues. For each direction you identified as an eigenvector, how much does it stretch or compress under [[2, 1], [1, 2]]? Does it double in length, halve, stay the same, or flip? Try to give a numeric guess for λ₁ and λ₂ before checking.
What Eigenvectors Reveal
A vector v is an eigenvector of matrix A when applying A leaves v on the same line through the origin. Formally, Av = λv for some scalar λ. The transformation scales v by λ but never rotates it to a new direction. Geometrically, the eigenvector is a fixed direction of the transformation: an axis along which the matrix acts purely as a stretch.
The sign of λ determines whether the vector flips. If λ = 3, the eigenvector triples in length pointing the same way. If λ = −2, it doubles in length and points the opposite way. If λ = 0, the eigenvector is crushed to zero — the matrix destroys that direction entirely, making it singular.
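The link between a zero eigenvalue and a singular matrix can be seen on a small example. The matrix below is a hypothetical choice, not one used elsewhere in this lesson: its second row is twice its first, so the direction (2, −1) is sent to the zero vector.

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # second row is twice the first
v0 = np.array([2.0, -1.0])

collapsed = B @ v0                # (1*2 + 2*(-1), 2*2 + 4*(-1)) = (0, 0), so λ = 0
determinant = np.linalg.det(B)    # 1*4 - 2*2 = 0, up to floating-point rounding
rank = np.linalg.matrix_rank(B)   # 1: only one direction survives

print(collapsed, determinant, rank)
```

Because every vector along (2, −1) is mapped to zero, no inverse can recover it, which is what singular means.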
A rotation matrix [[cos θ, −sin θ], [sin θ, cos θ]] for θ ≠ 0° and θ ≠ 180° has no real eigenvectors. Every vector in the plane rotates by θ; none remains on its original line. The eigenvalues are complex numbers encoding the rotation angle — they correspond to no real direction in 2D space. This is why the vector field of a rotation matrix shows all arrows turning uniformly with no stable axis.
Symmetric matrices, where A[i][j] = A[j][i] for all i, j, are the most important special case. The spectral theorem guarantees that they always have real eigenvalues and that their eigenvectors can always be chosen mutually perpendicular. A 2×2 symmetric matrix [[a, b], [b, d]] always has two real eigenvalues (counted with multiplicity, since they coincide when a = d and b = 0) and two perpendicular real eigenvectors, regardless of the specific entries.
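For the 2×2 case the claim about real eigenvalues can be verified by hand. The short derivation below uses the characteristic equation det(A − λI) = 0, a tool this lesson has not introduced, so treat it as a supplementary sketch. The quantity under the square root is a sum of squares, so it can never be negative.

```latex
\[
\det\begin{pmatrix} a - \lambda & b \\ b & d - \lambda \end{pmatrix}
  = \lambda^2 - (a + d)\lambda + (ad - b^2) = 0
\]
\[
\Delta = (a + d)^2 - 4(ad - b^2) = (a - d)^2 + 4b^2 \ge 0
\]
\[
\lambda_{1,2} = \frac{(a + d) \pm \sqrt{(a - d)^2 + 4b^2}}{2}
\]
```

For A = [[2, 1], [1, 2]] this gives λ = (4 ± 2) / 2, that is 3 and 1, matching the values found earlier.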
The covariance matrix of a dataset is always symmetric. Its eigenvectors are the principal components: the directions in feature space along which the data has the most spread. The eigenvalues measure the variance along each eigenvector. Sorting eigenvectors by eigenvalue from largest to smallest and projecting the data onto the top k gives the best possible k-dimensional representation in the least-squares sense. This is why PCA is the standard choice for linear data compression: no other linear projection onto k dimensions has a lower least-squares reconstruction error, and the optimal directions come from solving an eigenvalue problem.
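The least-squares optimality can be illustrated by comparing PCA against a random projection. This is a sketch on synthetic data: the shapes, seed, feature scales, and the helper `reconstruction_error` are assumptions made for the example. It also checks that the PCA error equals (n − 1) times the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)                                   # arbitrary seed
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])                # spread along each hidden axis
rotation = np.linalg.qr(rng.standard_normal((6, 6)))[0]          # random orthogonal matrix
X = (rng.standard_normal((500, 6)) * scales) @ rotation          # synthetic dataset
Xc = X - X.mean(axis=0)

k = 2
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))          # ascending eigenvalues
pca_basis = evecs[:, ::-1][:, :k]                                # top-k eigenvectors
random_basis = np.linalg.qr(rng.standard_normal((6, k)))[0]      # random orthonormal k-frame

def reconstruction_error(Xc, basis):
    """Hypothetical helper: squared error after projecting onto orthonormal columns."""
    projected = Xc @ basis @ basis.T
    return float(np.sum((Xc - projected) ** 2))

pca_error = reconstruction_error(Xc, pca_basis)
random_error = reconstruction_error(Xc, random_basis)
discarded = (len(Xc) - 1) * evals[:-k].sum()                     # variance left behind by PCA

print(pca_error <= random_error)          # True
print(np.isclose(pca_error, discarded))   # True
```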
An eigenvector v satisfies Av = λv. Geometrically, it stays on the same line through the origin after transformation — only its length changes. The eigenvalue λ is the stretch factor: positive means same direction, negative means flipped, zero means collapsed.
A pure rotation matrix in 2D (by any angle other than 0° or 180°) has no real eigenvectors. Every real vector changes direction under such a rotation; there is no stable axis in the plane. This is fundamentally different from scaling or shearing, both of which preserve at least one direction.
Symmetric matrices always have real eigenvalues and perpendicular eigenvectors. The covariance matrices used in PCA are always symmetric — which is why PCA always yields real, orthogonal principal components.
The first principal component of a dataset is the eigenvector of the covariance matrix with the largest eigenvalue — the direction along which the data varies most. Projecting onto the top k eigenvectors gives the best k-dimensional summary of the data.
Check Your Understanding
Four questions on eigenvectors, eigenvalues, and PCA. Select an answer, then reveal to see the explanation.
A matrix A has an eigenvector v with eigenvalue λ = −2. Which statement correctly describes what happens when A is applied to v?
True or false: a pure 2D rotation matrix (rotating by 45°) has two real eigenvectors.
In Principal Component Analysis (PCA), what are the principal components?
In the eigenvector equation Av = λv, what does the eigenvalue λ represent?