Hey everyone! This is part 5 of my linear algebra series. Check out the previous posts in the series if you need to catch up.
To recap, we’ve seen that a matrix is an array of numbers which can represent both a system of linear equations and a linear transformation of vectors in space.
Our goal is to understand how the geometry of these vector transformations relates to solutions of those systems of equations. We want to build a toolbox of ways to describe what a linear transformation (AKA a matrix) is doing, and we’ll also see an important theorem!
1. The Determinant
2. The Kernel and Image
3. The Rank-Nullity Theorem
1. The Determinant
When I was in school, I learned about determinants in the most boring way. For a 2x2 matrix, A, the determinant is calculated as:
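$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$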
Which you just had to get used to by lots of practice. Here are some examples:
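$$\det \begin{pmatrix} 3 & 1 \\ 2 & 4 \end{pmatrix} = (3)(4) - (1)(2) = 10, \qquad \det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = (1)(4) - (2)(3) = -2$$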
But I never really understood why we were calculating this number.
The determinant (to me) makes more sense when we define it geometrically, thinking about matrices as linear transformations:
The determinant is the signed area of the image of the unit square.
For example, look at the linear transformation associated with the matrix A: the horizontal stretch we’ve seen before, which doubles all the x-coordinates. See how the 1x1 unit square gets stretched into a 2x1 rectangle? That rectangle is the image of the unit square, since it’s the result of the linear transformation. Its area doubles, which is why the determinant of this linear transformation is 2. We can check it numerically too, using the formula from above:
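Since this stretch doubles x-coordinates and leaves y-coordinates alone, the matrix has to be

$$A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \det(A) = (2)(1) - (0)(0) = 2$$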
But the definition of the determinant is the signed area, so this number can be negative. Let’s see another example, with matrix B:
This time, the 1x1 unit square gets mapped to a parallelogram with base 2 and height 3, which has area 6. However, the transformation flipped the orientation of the square’s edges, so the determinant of this transformation is -6. And we can verify it numerically:
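One matrix that matches this picture (an illustrative choice, with bottom-right entry d = 0 so that we can tweak it in the next example) is

$$B = \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}, \qquad \det(B) = (1)(0) - (2)(3) = -6$$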
Finally, let’s see what a determinant of 0 looks like, which should correspond to a transformation that changes the unit square into a figure with 0 area.
We’ll take the matrix from the previous example and tweak d, the bottom right entry of the matrix:
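Changing d from 0 to 6 in the illustrative B above gives

$$C = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}, \qquad \det(C) = (1)(6) - (2)(3) = 0$$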
A determinant of 0 happens when the unit square gets squished down to 0 area, which happens when the entire space gets squished down a dimension. In this case, the R2 plane got squished down to the line y=3x, because the images of the two basis vectors became parallel.
What do you notice about the matrix C now that its determinant is 0?
The columns of the matrix are scalar multiples of each other (the right column is double the left column)
The rows of the matrix are scalar multiples of each other (the bottom row is triple the top row)
Both of these translate to the columns and rows being linearly dependent instead of linearly independent.
A square matrix has a determinant of 0 precisely when its columns (equivalently, its rows) are not linearly independent. This corresponds to a system of equations with no unique solution: the two equations describe parallel lines on the plane, which either don’t intersect at all (no solutions) or are the same line (infinitely many solutions).
There’s an even more extreme way for a matrix to have a determinant of 0: it can simply map everything to 0, which is what the zero matrix does:
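$$\det \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = (0)(0) - (0)(0) = 0$$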
A square matrix with linearly dependent columns or rows will always have a determinant of zero, but there’s clearly a difference between flattening space down to a line and flattening space into a single point. The former reduces the 2-dimensional vector space to a 1-dimensional line, while the latter reduces it to a 0-dimensional point! We need to introduce new terminology to be able to talk about this difference, since the language of determinants and linear independence isn’t enough.
2. Kernel and Image
Let A be a matrix which represents a linear transformation T between two vector spaces, V and W.
Then the kernel of T is defined by:
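$$\ker(T) = \{ \mathbf{v} \in V : T(\mathbf{v}) = \mathbf{0} \}$$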
While the image of T is defined by:
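$$\operatorname{im}(T) = \{ T(\mathbf{v}) : \mathbf{v} \in V \}$$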
You may be familiar with calculating the domain, range, and zeros of a function.
The kernel is just the set of all zeros of a linear transformation: it’s the set of all vectors that A maps to 0. And the image is the range, the set of all vectors that A can map to. Another way to think about the image is as the span of the columns of A. Consider these examples:
The zero matrix maps everything to zero, so the kernel is all of R2, which is a 2-dimensional vector space. The image is simply the zero vector, which is a 0-dimensional vector space.
This one might need a little more explanation. Recall that C is the matrix that squished space down to a straight line. That straight line had the equation y=3x, and we can verify that under this transformation, any vector (x,y) gets transformed into a vector of the form (a,3a), so the image of this transformation is the vector space spanned by the vector (1,3).
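Concretely, with the matrix C from before:

$$C \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + 2y \\ 3x + 6y \end{pmatrix} = \begin{pmatrix} a \\ 3a \end{pmatrix}, \quad \text{where } a = x + 2y$$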
The output (a, 3a) is the zero vector exactly when a = x + 2y equals 0, which happens when y = -x/2. The only vectors which get mapped to 0 under this transformation are the ones in the span of (1, -1/2), hence this is the kernel of the transformation.
In this case, both the kernel and the image are 1-dimensional vector spaces.
3. The Rank-Nullity Theorem
Okay, now it’s time for a theorem!
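The Rank-Nullity Theorem: if T: V → W is a linear transformation and V is finite-dimensional, then

$$\dim(V) = \dim(\ker(T)) + \dim(\operatorname{im}(T))$$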
In other words, the dimension of V is equal to the dimension of the kernel plus the dimension of the image!
This is the rank-nullity theorem because
Rank = dimension of the image, and
Nullity = dimension of the kernel
We’ve already seen a bunch of examples of this in the 2D plane, so let’s look at some examples in 3D.
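First, take any 3x3 matrix with linearly independent columns, for instance this illustrative stand-in (its determinant works out to -2, which is nonzero):

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$$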
The columns of this matrix span the entire 3D vector space, so the image is all of R3, and the kernel is merely the zero vector.
Dimension of kernel: 0
Dimension of image: 3
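Next, consider a matrix like this one (again an illustrative stand-in):

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}$$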
The third column of this matrix is just the sum of the first two columns, so the columns are not linearly independent. Only the first two columns are linearly independent, so the image is the span of those two vectors, which is a plane, and the kernel is a line.
Dimension of kernel: 1
Dimension of image: 2
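Finally, take a matrix where every column is a multiple of the first, such as:

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix}$$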
Now all three of the columns/rows are scalar multiples of each other, so we only have 1 linearly independent column vector here, thus the image is merely a line, while the kernel is a plane.
Dimension of kernel: 2
Dimension of image: 1
The kernel and the image trade off against each other: their dimensions always add up to the dimension of the space we started with, so making one of them smaller directly makes the other one larger. That’s what the rank-nullity theorem is telling us!
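If you want to check this numerically, numpy can compute the rank for you. Here’s a small sketch using the stand-in matrices from the three examples above (np.linalg.matrix_rank gives the dimension of the image, and rank-nullity then gives the dimension of the kernel for free):

```python
import numpy as np

# Stand-in matrices for the three 3D examples above
matrices = [
    np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]]),  # linearly independent columns
    np.array([[1, 0, 1], [0, 1, 1], [1, 1, 2]]),  # third column = col 1 + col 2
    np.array([[1, 2, 3], [2, 4, 6], [3, 6, 9]]),  # every column a multiple of the first
]

for M in matrices:
    rank = np.linalg.matrix_rank(M)  # dimension of the image
    nullity = M.shape[1] - rank      # rank-nullity: nullity = (number of columns) - rank
    print(f"rank = {rank}, nullity = {nullity}, rank + nullity = {rank + nullity}")
```

Running this prints rank + nullity = 3 for all three matrices, which is exactly the dimension of R3.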
Now, we have more ways to describe matrices and linear transformations. There are still lots more definitions and theorems we could discuss here, but I won’t write about all of them!
My goal is to just summarize the core ideas of what you might see in a linear algebra class. There’s only one more topic I want to cover: eigenvectors. After that, we’ll tie everything together with the grand theorem of linear algebra, The Invertible Matrix Theorem. You’ll see that this grand theorem will connect all the ideas we’ve seen so far: bases, determinants, systems of equations, kernels, and images!
Until next time, thank you for reading!