In this post, I want to bridge the gap between abstract vector spaces (which are the mathematical foundation of linear algebra) and matrix multiplication (which is the linear algebra most of us are familiar with). To do this, we will restrict ourselves to a specific example of a vector space – the Euclidean space. Unlike the typical 101 course in linear algebra, I will avoid talking about solving systems of equations in this post. While solving systems of equations served as the historical precedent1 for mathematicians to begin work on linear algebra, it is today an application, and not the foundation of linear algebra.
For this post, I expect that the reader has come across concepts like linear independence and orthogonal vectors before, and can consult Wikipedia for anything that looks new to them.
The Recipe for $\mathbb{R}^n$
We write $\mathbb{R}^n$ as a short-hand for $\mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R}$ ($n$ times), the set of sequences (of length $n$) of real numbers. For notational convenience, we also use ‘$\mathbb{R}^n$’ to denote the $n$-dimensional Euclidean space, which is not just a set of objects, but a set of objects that has a particular structure. In order to arrive at this structure, we need to introduce the following mathematical ingredients, in order:
1. Scalars: Defined as the elements of a set (technically, a field $\mathbb{F}$) which has two binary operations called addition and multiplication. We choose $\mathbb{R}$ (the real numbers) as the set of scalars.
2. Vectors: For some positive integer $n$, we define the set of vectors as $\mathbb{R}^n$. The vectors have the vector addition and scalar multiplication operations, $u + v$ and $\alpha v$ for $u, v \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. These operations satisfy certain axioms which ensure that the addition and multiplication operations behave like they ought to.
3. Basis: We need to pick a basis for $\mathbb{R}^n$, which is a set of vectors $B = \{b_1, b_2, \dots, b_n\}$, where $b_i \in \mathbb{R}^n$, such that every vector $v \in \mathbb{R}^n$ can be uniquely expressed as a linear combination of the basis vectors. This means that there is a unique sequence of real numbers $(v_1, v_2, \dots, v_n)$ satisfying $v = v_1 b_1 + v_2 b_2 + \dots + v_n b_n$.
4. Inner Product: For vectors $u$ and $v$, $\langle u, v \rangle$ is called the inner product of $u$ and $v$; it maps each pair of vectors to a scalar. The usual inner product that we define for $\mathbb{R}^n$ is sometimes called the dot product. An inner product imparts geometry to its vector space, because we can use it to define the ‘length’ of a vector as $\lVert v \rVert = \sqrt{\langle v, v \rangle}$, and ‘angles’ between vectors as $\cos \theta = \frac{\langle u, v \rangle}{\lVert u \rVert \, \lVert v \rVert}$.
5. Orthonormal Basis: If the basis is such that $\langle b_i, b_j \rangle = 1$ when $i = j$ and $\langle b_i, b_j \rangle = 0$ otherwise, we call it an orthonormal basis. Because of how we defined $\lVert \cdot \rVert$, $\langle b_i, b_i \rangle = 1$ implies that $\lVert b_i \rVert = 1$. (All five ingredients are sketched concretely in the code right after this list.)
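If it helps to make these ingredients concrete, here is a minimal NumPy sketch for $\mathbb{R}^3$; the particular vectors and the particular basis chosen below are arbitrary, purely for illustration.

```python
import numpy as np

# Scalars are real numbers; vectors are length-n sequences of reals.
n = 3
u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

# Vector addition and scalar multiplication.
w = 2.0 * u + v

# A basis for R^3: here, the columns of an invertible matrix B.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
# The coefficients of w in this basis are unique: solve B @ coeffs = w.
coeffs = np.linalg.solve(B, w)
assert np.allclose(coeffs[0] * B[:, 0] + coeffs[1] * B[:, 1] + coeffs[2] * B[:, 2], w)

# The inner product (here, the dot product) imparts geometry: lengths and angles.
length_v = np.sqrt(v @ v)
cos_angle = (u @ v) / (np.sqrt(u @ u) * np.sqrt(v @ v))

# An orthonormal basis satisfies <b_i, b_j> = 1 if i == j, and 0 otherwise.
E = np.eye(n)                           # the standard basis vectors, as columns
assert np.allclose(E.T @ E, np.eye(n))  # orthonormality check
```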
We have introduced ingredients 3, 4, and 5 in a very specific order. Let’s see why that is so.
The Standard Basis
Mathematicians avoid picking the basis explicitly. Often, they start their analysis with the following (implied) disclaimer:
“We have chosen some basis, $B = \{b_1, b_2, \dots, b_n\}$, but the specific choice of basis does not matter for what we’re about to show.”
Basically, don’t worry too much about which basis we chose, just know that we have chosen one. Once a basis has been chosen, each vector $v$ can be uniquely expressed by a sequence of coefficients, $(v_1, v_2, \dots, v_n)$, such that $v = v_1 b_1 + v_2 b_2 + \dots + v_n b_n$. Thus, the vector can be expressed unambiguously using the following, more familiar notation:

$$\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$
Note that this notation involves both a vector $v$ and a basis $B$. Choosing a different basis $B' = \{b_1', b_2', \dots, b_n'\}$ changes the coefficients of the vector to $(v_1', v_2', \dots, v_n')$, but it does not change the vector itself. For bases $B$ and $B'$, we have

$$v = \sum_{i=1}^{n} v_i\, b_i = \sum_{i=1}^{n} v_i'\, b_i'$$
At a glance, this assertion might appear to contradict the following observation:

$$\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \neq \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix}$$
This is purely because of the ‘square-bracket’ notation. Before we write vectors in their ‘square-bracket’ form, we must not only choose a basis, but also fix a basis. Let’s fix a basis $E = \{e_1, e_2, \dots, e_n\}$ for $\mathbb{R}^n$, which we call the standard basis. Now, for $v = v_1 e_1 + v_2 e_2 + \dots + v_n e_n$, the ‘square-bracket’ notation

$$\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

refers unambiguously to the vector $v = v_1 e_1 + v_2 e_2 + \dots + v_n e_n$. Therefore, observe that

$$\begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} \;\text{refers to}\; \sum_{i=1}^{n} v_i'\, e_i \;\neq\; \sum_{i=1}^{n} v_i'\, b_i' = v,$$

so once the basis is fixed, the two columns in the observation above denote two different vectors, and there is no contradiction.
Thus, there is a distinction between the vector $v$ itself and its representation in the standard basis $E$; the ‘square-bracket’ notation gives us the latter, and it is our job to infer the former. Observe that the standard basis vectors can themselves be represented in the ‘square-bracket’ notation, as

$$e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$
Notice that we can do our usual linear algebra stuff without actually specifying the contents of $E$ (i.e., without saying which particular vectors $e_1, \dots, e_n$ are), as long as we fix $E$ and don’t change it thereafter. Nothing about the orthogonality of $E$ has been said yet, because we need an inner product to even define what orthogonality means.
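To make the basis-dependence of the ‘square-bracket’ representation concrete, here is a small NumPy check; the second basis below is an arbitrary choice for illustration.

```python
import numpy as np

# A vector, written in the (fixed) standard basis.
v_std = np.array([2.0, 3.0])

# Another basis for R^2: the columns of B (an arbitrary invertible matrix).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Coefficients of the *same* vector v in the basis B: solve B @ v_B = v_std.
v_B = np.linalg.solve(B, v_std)

print(v_std)  # [2. 3.]   representation in the standard basis
print(v_B)    # [-1. 3.]  representation in B: different numbers ...

# ... yet both coefficient sequences describe the same underlying vector.
assert np.allclose(v_B[0] * B[:, 0] + v_B[1] * B[:, 1], v_std)
```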
The Dot Product
We can now define an inner product in terms of the standard basis $E$. For vectors $u = \sum_{i} u_i e_i$ and $v = \sum_{i} v_i e_i$, we define $\langle u, v \rangle = \sum_{i=1}^{n} u_i v_i$, which we call the dot product. In the matrix multiplication or “square-bracket” notation, we write this as

$$u \cdot v = u^\top v = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \sum_{i=1}^{n} u_i v_i$$
Note that we are defining the inner product this way. Importantly, we are defining it in a way that makes the basis vectors, $e_1, \dots, e_n$, orthonormal. If we had instead defined the inner product as $\langle u, v \rangle = \sum_{i=1}^{n} u_i' v_i'$, where $(u_i')$ and $(v_i')$ are the coefficients of $u$ and $v$ in some other basis $B'$, then the basis $B'$ becomes orthonormal (under this new definition of orthonormality). Thus, any basis can be ‘made orthonormal’ by redefining the inner product appropriately.
The ‘row vector’ corresponding to $u$ is usually called the transpose of $u$, and is denoted as $u^\top$. Strictly speaking, it is a linear map from $\mathbb{R}^n$ to $\mathbb{R}$ (see dual space if you’re curious about what’s going on here).
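Here is a short NumPy sketch of both points above: the dot product written as $u^\top v$, and the fact that an arbitrarily chosen (non-orthogonal) basis becomes orthonormal once we redefine the inner product through its coefficients.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# The dot product, written a few equivalent ways.
assert np.isclose(np.dot(u, v), u @ v)
assert np.isclose(u @ v, np.sum(u * v))

# A non-orthogonal basis B' (its columns are the basis vectors), chosen arbitrarily.
Bp = np.array([[2.0, 1.0, 0.0],
               [0.0, 1.0, 1.0],
               [0.0, 0.0, 3.0]])

# Redefine the inner product via coefficients in B':
# <u, v>_new = (coefficients of u in B') . (coefficients of v in B')
def inner_new(a, b):
    return np.linalg.solve(Bp, a) @ np.linalg.solve(Bp, b)

# Under the usual dot product, the columns of B' are not orthonormal ...
assert not np.allclose(Bp.T @ Bp, np.eye(3))

# ... but under the redefined inner product, they are.
gram = np.array([[inner_new(Bp[:, i], Bp[:, j]) for j in range(3)] for i in range(3)])
assert np.allclose(gram, np.eye(3))
```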
Linear Algebra
Let $V$ and $W$ be vector spaces. They could be Euclidean spaces, but they could also be subspaces of Euclidean spaces (recall that a flat plane passing through the origin is a subspace of $\mathbb{R}^3$), or something else entirely. A linear map or a linear transformation is a map $T : V \to W$ which transforms each vector in $V$ to a vector in $W$ in a linear manner. This means that for $u, v \in V$ and $\alpha \in \mathbb{R}$,

$$T(u + v) = T(u) + T(v)$$

and

$$T(\alpha v) = \alpha\, T(v)$$
Notably, we have $T(0) = 0$. The word ‘linear’ comes from the special case of the linear map $f(x) = ax$ for real numbers $x$; the plot of this function is a straight line passing through the origin. This is also where the ‘linear’ in linear algebra comes from: it is the study of linear maps between vector spaces.
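The two defining properties are easy to spot-check numerically. The helper below (its name and the random-testing strategy are my own choices, purely illustrative) verifies additivity and homogeneity for a few candidate maps.

```python
import numpy as np

rng = np.random.default_rng(0)

def looks_linear(f, n, trials=100):
    """Numerically spot-check additivity and homogeneity of f on R^n."""
    for _ in range(trials):
        u, v = rng.normal(size=n), rng.normal(size=n)
        a = rng.normal()
        if not np.allclose(f(u + v), f(u) + f(v)):
            return False                      # additivity fails
        if not np.allclose(f(a * v), a * f(v)):
            return False                      # homogeneity fails
    return True

A = rng.normal(size=(2, 3))
print(looks_linear(lambda x: A @ x, 3))        # True:  matrix multiplication is linear
print(looks_linear(lambda x: A @ x + 1.0, 3))  # False: the offset breaks T(0) = 0
print(looks_linear(lambda x: x ** 2, 3))       # False: squaring is not linear
```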
Now here’s where abstract linear algebra starts developing into the ‘matrix multiplication’ version of linear algebra:
Any linear map $T$ between two finite-dimensional vector spaces $V$ and $W$ can be represented as a matrix.
To see this, let’s start by choosing bases for $V$ and $W$, denoted as $\{b_1, b_2, \dots, b_n\}$ and $\{c_1, c_2, \dots, c_m\}$, where $n$ and $m$ are the dimensions of $V$ and $W$. For simplicity, we will assume that the scalars in $V$ and $W$ are real numbers (as opposed to, say, one of them being a complex vector space).
Observe that $T(b_j) \in W$ for each $j$: each vector in the basis of $V$ is mapped (linearly) to a corresponding vector in $W$. This means that we can express each of the mapped basis vectors as a linear combination:

$$T(b_j) = A_{1j}\, c_1 + A_{2j}\, c_2 + \dots + A_{mj}\, c_m = \sum_{i=1}^{m} A_{ij}\, c_i, \qquad j = 1, 2, \dots, n,$$

where the coefficients $A_{ij}$ are unique. Now consider the action of $T$ on an arbitrary vector $v \in V$ that is not a basis vector. We first write $v$ as the linear combination

$$v = x_1 b_1 + x_2 b_2 + \dots + x_n b_n = \sum_{j=1}^{n} x_j\, b_j$$
Due to the properties of a linear transformation (i.e., its linearity), we have the following algebra:

$$T(v) = T\!\left(\sum_{j=1}^{n} x_j\, b_j\right) = \sum_{j=1}^{n} x_j\, T(b_j)$$
Thus, the action of $T$ on the vector $v$ indirectly depends on the action of $T$ on the basis vectors. We have already seen where $T$ takes the basis vectors of $V$, so let’s plug that in:

$$T(v) = \sum_{j=1}^{n} x_j \sum_{i=1}^{m} A_{ij}\, c_i = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} A_{ij}\, x_j \right) c_i = \sum_{i=1}^{m} y_i\, c_i$$
where $y_i = \sum_{j=1}^{n} A_{ij}\, x_j$ is the coefficient of $T(v)$ corresponding to the basis vector $c_i$. From here on, it’s only a matter of noticing that we can represent this entire relationship using the “matrix-multiplication” operation:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
which we can write as “$y = Ax$”. There is a subtlety here: on the left-hand side of this equation, we assume the ‘standard basis’ to be $\{c_1, \dots, c_m\}$, whereas for the vector $x$ on the right we were using the standard basis $\{b_1, \dots, b_n\}$. Thus, we need to fix both bases (one for $V$ and one for $W$) before the linear transformation can be written, unambiguously, as a matrix multiplication. If the dimensions of $V$ and $W$ are the same, we may pick the same basis on either side.
Observe that we never used the inner product while talking about linear transformations, and thus, we make no claim about whether the bases we used above are orthonormal. They are simply linearly independent, as all bases are. In case the basis $\{c_1, \dots, c_m\}$ is orthonormal, this just means that we can find the coefficients very easily: $A_{ij} = \langle T(b_j), c_i \rangle$ and $y_i = \langle T(v), c_i \rangle$.
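To tie the derivation back to code, here is a small NumPy sketch (the particular map $T$ and the use of the standard bases are my own choices for illustration): the $j$-th column of the matrix is the representation of $T(b_j)$ in the codomain’s basis, and the coefficients are read off via dot products because that basis is orthonormal.

```python
import numpy as np

# A linear map T : R^3 -> R^2, defined without reference to any matrix.
def T(v):
    x, y, z = v
    return np.array([x + 2.0 * y, 3.0 * z - y])

n, m = 3, 2
basis_V = np.eye(n)   # orthonormal basis of the domain (the standard one)
basis_W = np.eye(m)   # orthonormal basis of the codomain

# Column j of A holds the coefficients of T(b_j) in the basis of W.
# Since basis_W is orthonormal, A_ij = <T(b_j), c_i>.
A = np.column_stack([
    np.array([basis_W[:, i] @ T(basis_V[:, j]) for i in range(m)])
    for j in range(n)
])
print(A)
# [[ 1.  2.  0.]
#  [ 0. -1.  3.]]

# The matrix reproduces the map on an arbitrary vector: y = A x.
v = np.array([1.0, -2.0, 0.5])
assert np.allclose(A @ v, T(v))
```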
Orthonormal Transformations
Let’s now study $\mathbb{R}^n$ as an inner product space, which is the vector space combined with the usual inner product – the dot product.
We say that a matrix $Q$ is orthonormal (more commonly called an orthogonal matrix) if $Q^\top Q = I$. This is closely related to how we say that a set of basis vectors is orthonormal: Suppose $\{e_1, \dots, e_n\}$ is an orthonormal basis, then so is the basis $\{Q e_1, \dots, Q e_n\}$, because

$$\langle Q e_i, Q e_j \rangle = (Q e_i)^\top (Q e_j) = e_i^\top Q^\top Q\, e_j = e_i^\top e_j = \langle e_i, e_j \rangle$$
Let the underlying linear transformation corresponding to $Q$ be denoted as $T$, with $\{e_1, \dots, e_n\}$ being an orthonormal basis for $\mathbb{R}^n$. $Q$ is the representation of $T$ in the matrix multiplication form, with respect to the basis $\{e_1, \dots, e_n\}$. Recall the algebra we did earlier:

$$T(v) = T\!\left(\sum_{i=1}^{n} v_i\, e_i\right) = \sum_{i=1}^{n} v_i\, T(e_i) = \sum_{i=1}^{n} v_i\, (Q e_i),$$

where we know that the set $\{Q e_1, \dots, Q e_n\}$ is orthonormal. Thus, $v$ and $T(v)$ have the same representation (given by the numbers $v_1, \dots, v_n$) under $\{e_1, \dots, e_n\}$ and $\{Q e_1, \dots, Q e_n\}$ respectively. This is why we can call $Q$ a “change of basis” – it keeps the vector’s representation the same, but changes the (orthonormal) basis that we are representing it in. Even if the vector’s representation is the same in either basis, the vector itself is changing under $T$:

$$T(v) = \sum_{i=1}^{n} v_i\, (Q e_i) \;\neq\; \sum_{i=1}^{n} v_i\, e_i = v \quad \text{(in general)}$$
Alternatively, we can re-express the transformed vector $T(v)$ in the original basis $\{e_1, \dots, e_n\}$, in which case $Q$ is interpreted as purely a transformation of the vector’s components while keeping the basis fixed. This duality in how we can view a ‘change of basis’ has been explored more in this article.
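The ‘same representation, different basis’ picture can be verified numerically; the rotation matrix below is just one convenient example of an orthonormal matrix.

```python
import numpy as np

# An orthonormal matrix in R^2 (a rotation by 0.3 radians).
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(Q.T @ Q, np.eye(2))

E = np.eye(2)    # the standard (orthonormal) basis, as columns
QE = Q @ E       # the transformed basis {Q e_1, Q e_2}: still orthonormal
assert np.allclose(QE.T @ QE, np.eye(2))

v = np.array([2.0, 1.0])
Tv = Q @ v       # the transformed vector

# Same representation in the respective bases ...
coords_v_in_E = np.linalg.solve(E, v)
coords_Tv_in_QE = np.linalg.solve(QE, Tv)
assert np.allclose(coords_v_in_E, coords_Tv_in_QE)

# ... yet the vector itself has changed.
assert not np.allclose(Tv, v)
```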

Preserving Structure and Dimension
Any transformation on a mathematical space that preserves its structure (i.e., the relationships of its objects to each other) turns out to be quite special. Linear transformations preserve the structure of a vector space, because any three vectors which have the relationship $u + v = w$ are still related to each other after the transformation: $T(u) + T(v) = T(w)$.3
Structure-preserving transformations which are also invertible are called isomorphisms. We can show that the inverse $T^{-1}$, if it exists, must also be a linear transformation. Thus, $T^{-1}$ can be represented as a matrix. Invertible linear transformations are the isomorphisms of vector spaces. Invertible matrices are “square” because a linear transformation can only be invertible if its domain and codomain have the same dimension.4
Similarly, orthonormal matrices represent the structure-preserving transformations of inner-product spaces: a set of vectors that is orthonormal before the transformation remains orthonormal after the transformation, where orthonormality is defined via the dot product. They are also the isomorphisms of inner-product spaces, because the inverse of an orthonormal matrix always exists: it is $Q^{-1} = Q^\top$.
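Both claims are easy to confirm numerically; the random matrices below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# An invertible ("square") matrix and its inverse compose to the identity,
# so the underlying linear map is an isomorphism of vector spaces.
A = rng.normal(size=(3, 3))            # almost surely invertible
A_inv = np.linalg.inv(A)
assert np.allclose(A_inv @ A, np.eye(3))

# An orthonormal matrix additionally preserves the dot product
# (hence lengths and angles), and its inverse is simply its transpose.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # Q has orthonormal columns
u, v = rng.normal(size=3), rng.normal(size=3)
assert np.allclose((Q @ u) @ (Q @ v), u @ v)   # inner products are preserved
assert np.allclose(np.linalg.inv(Q), Q.T)      # Q^{-1} = Q^T
```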
Mathematicians almost always (or perhaps, always) study mathematical objects “up to isomorphism”. This means that we are not studying any particular mathematical object, but rather we are simultaneously studying all of the mathematical objects that are isomorphic to each other. This is why we do not need to specify which basis we are using as the standard basis: it simply does not matter, as long as we fix this basis and stay consistent. This is analogous to how we may need to fix the origin when studying ‘displacement’ and ‘speed’ in physics. Choosing a different origin does not change the physical phenomenon; it only changes our description of it.
1. See this for the historical context of matrix multiplication, which differs in presentation from (but is essentially the same as) modern mathematics’ treatment of it.
2. The words every and unique can be compared to the concepts of surjectivity (also called onto) and injectivity (also called one-one), respectively. A function between two sets is invertible if and only if it is both surjective and injective. The ‘sets’ here are the vectors and their representations.
3. There is an abuse (or rather, a reuse) of notation here; note that the vector addition in $V$ may be different from the vector addition in $W$, though we denote both as ‘$+$’ for convenience. We also use ‘$+$’ to denote the scalar addition operation.
4. An invertible function between sets must be injective and surjective. If the dimension of $W$ is greater than that of $V$, then $T$ cannot be surjective. If the dimension of $V$ is greater, then $T$ cannot be injective.