Vectors · Chapter 1 of 3

What Is a Vector?

From directed arrows to lists of numbers — and why that matters in ML

When Netflix recommends a film, your viewing history is represented as a vector of hundreds of numbers. The algorithm looks for films whose vectors point in nearly the same direction as yours, in a literal geometric sense. This chapter shows you what that means.

Learning Objectives

1 Explain what makes a vector different from a plain number, in both geometric and algebraic terms.
2 Identify the components, magnitude, and direction of a vector given its arrow or coordinate notation.
3 Recognise at least two concrete ways ML systems use vectors to represent real-world data.

¶ Narrative

Vectors as Arrows and Coordinates

The arrow that carries two ideas at once

Draw an arrow on a piece of paper. That arrow has two properties: it is a certain length, and it points in a particular direction. Change either property and you have a different arrow.

That arrow is a vector.

A vector is any quantity that has both magnitude (size) and direction. A number with only magnitude — like temperature or price — is a scalar.

In a neural network, every layer’s output is a vector of numbers — activation values passed forward to the next layer. The same mathematical object describes a GPS displacement, a pixel in an image, and a word in a language model.

In two dimensions, a vector from the origin to the point (4, 2) looks like the diagram below. The horizontal axis registers the x-component; the vertical axis registers the y-component.

The vector (4, 2) as an arrow. The dashed lines show its components projected onto each axis.

The components are the shadows the arrow casts onto each axis. A vector with a large positive x-component reaches far to the right; a vector whose y-component is large relative to its x-component points steeply upward.

From arrows to coordinate notation

Writing vectors as arrows is fine in 2D, but impractical in 768 dimensions (a common size for language model embeddings). So we write them as ordered lists of numbers — one number per dimension. A 2D vector is written as v = (4, 2); a 768D embedding is a list of 768 numbers, one per axis direction.

The order matters: (4, 2) and (2, 4) are different vectors — they point in different directions. Each position in the list corresponds to a specific axis.
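
As a minimal sketch of this notation in code (assuming NumPy, which this chapter's own snippet also uses), the example below stores the two vectors as arrays and checks that order matters. The variable names are illustrative, and the all-zero array is only a placeholder for a 768-dimensional embedding, not real model output.

```python
import numpy as np

v = np.array([4, 2])
w = np.array([2, 4])

# Same numbers in a different order: these are different vectors.
same = np.array_equal(v, w)      # False

# Each position is tied to one axis: index 0 is x, index 1 is y.
v_x, v_y = v[0], v[1]

# A high-dimensional vector has the same structure, only longer.
# Placeholder values: 768 zeros stand in for a real embedding.
embedding = np.zeros(768)

print(same, v_x, v_y, embedding.shape)
```

Nothing about the data structure changes between 2D and 768D; only the length of the list does.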

Magnitude: the length of the arrow

The magnitude (or norm) of a vector is its length. For a 2D vector, it is the hypotenuse of the right triangle formed by the components, so it follows directly from the Pythagorean theorem.

Magnitude via Pythagoras. The vector (3, 4) has components 3 and 4; its magnitude is √(9 + 16) = 5.

The magnitude of a 2D vector v = (vₓ, vᵧ) is written ‖v‖ and equals ‖v‖ = √(vₓ² + vᵧ²).

This formula extends to any number of dimensions by summing all squared components under the square root — the geometry of distance works the same way in 768D as in 2D.

python

import numpy as np

v = np.array([3, 4])
norm = np.linalg.norm(v)       # Euclidean length: sqrt(3**2 + 4**2) = 5.0
v_unit = v / norm              # same direction, length 1: array([0.6, 0.8])
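
To make the "sum all squared components" rule concrete, here is a small sketch that computes the magnitude by hand for any number of dimensions and compares it with `np.linalg.norm`. The helper name `magnitude` and the 4D example vector are illustrative choices, not part of any library.

```python
import math
import numpy as np

def magnitude(components):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in components))

v2 = [3, 4]            # 9 + 16 = 25
v4 = [1, 2, 2, 4]      # 1 + 4 + 4 + 16 = 25

len_2d = magnitude(v2)     # 5.0
len_4d = magnitude(v4)     # 5.0

# The hand-written formula agrees with NumPy's built-in norm.
agrees = math.isclose(len_4d, np.linalg.norm(v4))
print(len_2d, len_4d, agrees)
```

The function never mentions the number of dimensions, which is the point: the same rule covers 2D, 4D, and 768D.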

Direction without size: unit vectors

Sometimes you care only about which way a vector points, not how long it is. A unit vector is a vector with magnitude exactly 1 — pure direction, no size.

To produce a unit vector, divide by the magnitude: û = v / ‖v‖. This operation is called normalisation.

Common Mistake

Normalisation sets magnitude to 1 — it does not make all components equal to 1. The vector (3, 4) normalised becomes (0.6, 0.8), not (1, 1).

Normalisation rescales any non-zero vector to magnitude 1 while preserving its direction exactly: longer vectors shrink, shorter ones stretch.
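
A hedged sketch of normalisation as a reusable function follows. The name `normalise` is hypothetical, and raising an error for the zero vector is one possible design choice (the zero vector has no direction, so dividing by its magnitude is undefined).

```python
import numpy as np

def normalise(v):
    """Return the unit vector pointing the same way as v."""
    v = np.asarray(v, dtype=float)
    length = np.linalg.norm(v)
    if length == 0:
        # Assumption of this sketch: refuse rather than return NaNs.
        raise ValueError("the zero vector has no direction to normalise")
    return v / length

u_long = normalise([3, 4])        # approximately [0.6, 0.8]
u_short = normalise([0.3, 0.4])   # approximately [0.6, 0.8] as well

print(u_long, u_short)
print(np.linalg.norm(u_long), np.linalg.norm(u_short))
```

Both inputs point the same way, so they normalise to the same unit vector: the long one is scaled down and the short one is scaled up.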

Why ML systems love vectors

Machine learning algorithms need to do arithmetic on data. Vectors make that possible.

Real World

Recommendation systems represent every user and every film as a vector of learned numbers. The more closely two vectors point in the same direction (measured by the dot product of their unit vectors, known as cosine similarity), the more similar those items are. This is why “users who watched X also watched Y” works: X and Y have similar vectors.

Language models map every word (more precisely, every token) to a high-dimensional vector, called an embedding, such that semantically similar words have geometrically similar vectors. In word-embedding models such as word2vec, the vector for “king” minus “man” plus “woman” famously lands close to “queen”: vector arithmetic capturing meaning.
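
The idea of vectors "pointing the same way" can be sketched with cosine similarity, the dot product of two vectors divided by the product of their magnitudes. The three vectors below are invented toy values chosen for easy arithmetic; they are not taken from any real recommender or language model, and `cosine_similarity` is a hypothetical helper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product of a and b after removing the effect of their lengths."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy, hand-made vectors (illustrative assumptions only).
user = [5.0, 1.0, 0.0]
film_a = [10.0, 2.0, 0.0]   # same direction as user, twice as long
film_b = [0.0, 1.0, 5.0]    # mostly along a different axis

sim_a = cosine_similarity(user, film_a)   # 1.0 up to rounding
sim_b = cosine_similarity(user, film_b)   # 1/26, about 0.038

print(round(sim_a, 3), round(sim_b, 3))
```

Note that `film_a` scores as a perfect match even though it is twice as long as `user`: cosine similarity looks only at direction, whereas a raw dot product would also reward length.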

Scalars vs vectors in the real world — and the high-dimensional version used in language models.

The pattern is always the same: take something complex (a word, a film, a user’s history), map it to a list of numbers, and suddenly the entire toolkit of linear algebra becomes available.

In this section

Is a vector just a list of numbers?

In code, yes — but that misses the geometry. A vector is a directed displacement: it has both a magnitude (how far) and a direction (which way). The list of numbers is a coordinate description relative to a chosen basis; the underlying object is independent of those coordinates.

What's the difference between a vector and a scalar?

A scalar is a single number with magnitude only: temperature, price, probability. A vector carries magnitude and direction. Scalars can scale vectors: multiplying by a positive scalar changes a vector's length without changing its direction, and multiplying by a negative scalar also reverses it. But a scalar cannot replace a vector when direction matters.
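
As a small illustration of a scalar acting on a vector, the sketch below multiplies (3, 4) by 2 and by -1 and compares lengths and directions. The helper `direction` is a hypothetical name for "divide by the magnitude"; a negative scalar is included to show that it flips the arrow as well as rescaling it.

```python
import numpy as np

def direction(x):
    """Unit vector of x (assumes x is non-zero)."""
    return x / np.linalg.norm(x)

v = np.array([3.0, 4.0])
doubled = 2 * v       # (6, 8)
flipped = -1 * v      # (-3, -4)

# A positive scalar changes the length but not the direction.
same_direction = np.allclose(direction(doubled), direction(v))

# A negative scalar reverses the direction.
opposite_direction = np.allclose(direction(flipped), -direction(v))

print(np.linalg.norm(v), np.linalg.norm(doubled), np.linalg.norm(flipped))
print(same_direction, opposite_direction)
```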

Why do ML models represent data as vectors?

Vectors let you do arithmetic on structure: measure similarity (dot product), transform data (matrix multiplication), and optimise objectives (gradient descent). Converting text, images, or user histories to vectors means every ML tool becomes available.

Key Terms

Vector Scalar Magnitude Unit Vector

◎ Intuition

You have a vector v = (3, 4) with magnitude 5. You double both components to get (6, 8). What happens to the direction? What happens to the magnitude? Now suppose you increase only vₓ, from 3 to 6, while keeping vᵧ = 4 fixed. Does the direction stay the same? Does the magnitude change by the same amount as before? Finally, imagine two arrows drawn from the origin: one to (2, 3) and one to (4, 6). Do they point in the same direction? Could one be a scalar multiple of the other?

↺ Reflection

What You Just Saw

A vector in 2D is completely described by its two components. Changing one component changes both the direction and the magnitude of the arrow — the other component is unaffected. Magnitude combines both components via Pythagoras: ‖(3, 4)‖ = √(9 + 16) = 5. Moving from (3, 4) to (3, 8) changes the magnitude from 5 to √(9 + 64) = √73 ≈ 8.5 — not proportional to the component change.
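
The numbers in this paragraph can be checked with a short sketch that reports both the length and the angle of a 2D vector. The helper `describe` is a hypothetical name, and measuring the angle counter-clockwise from the positive x-axis with `np.arctan2` is a convention assumed here.

```python
import numpy as np

def describe(v):
    """Return (length, angle in degrees from the positive x-axis) of a 2D vector."""
    v = np.asarray(v, dtype=float)
    length = float(np.linalg.norm(v))
    angle = float(np.degrees(np.arctan2(v[1], v[0])))
    return length, angle

base_len, base_angle = describe([3, 4])    # 5.0, about 53.1 degrees
tall_len, tall_angle = describe([3, 8])    # about 8.544, about 69.4 degrees

print(f"(3, 4): length {base_len:.3f}, angle {base_angle:.1f}")
print(f"(3, 8): length {tall_len:.3f}, angle {tall_angle:.1f}")
```

Doubling only the y-component changes both outputs: the arrow gets steeper, and it gets longer by a factor of about 1.7 rather than 2.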

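The unit vector of any non-zero vector v is û = v / ‖v‖: the same direction, rescaled so that ‖û‖ = 1. The unit vector captures direction without magnitude. L2 normalisation, often applied to embeddings before they are compared, does exactly this: the direction is preserved, but the magnitude is discarded. Normalisation layers inside neural networks (such as layer normalisation) are related but not identical: they also subtract the mean and typically apply a learned scale and shift.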

Real-world vectors have hundreds or thousands of components, but the geometry is the same. “Similar” items in an ML system correspond to vectors whose directions are close: the angle between them is small, so the dot product of their unit vectors is close to 1. The intuition that works in 2D (direction as meaning, magnitude as scale) applies unchanged in 768D, 4096D, or beyond.

Key Points

A vector is fully described by its components — two numbers in 2D, n numbers in n dimensions.

Magnitude is the Euclidean length of the arrow: √(vₓ² + vᵧ²). It combines both components via Pythagoras, so changing only one component still changes the magnitude.

Normalising (dividing by magnitude) keeps direction while collapsing magnitude to 1 — the core of many ML similarity measures.

✓ Checkpoint

Check Your Understanding

Four questions on vectors, magnitude, and direction. Answers are revealed when you expand each question — there is no score.

You multiply every component of a vector by 3. What changes?

The vector (1, 0) and the vector (5, 0) point in the same direction.

What is the magnitude of the vector (3, 4)?

A language model stores the word "cat" as a vector of 768 numbers. What does this vector represent?