Inverting a matrix is not an easy task if you have not yet been introduced to Gaussian Elimination. The inverse of a matrix is used in a large number of algorithms, one of the simplest being Linear Regression.

There are two steps to inverting a matrix:

  1. Checking if the matrix is invertible by finding the Determinant
  2. Inverting the matrix with Gauss-Jordan elimination, a variant of Gaussian Elimination

Table Of Contents

  1. Checking If A Matrix Is Invertible
  2. Inverting A Matrix
  3. References

Checking If A Matrix Is Invertible

The concept of rank in linear algebra relates to dimensionality. All matrices are of some size $m \times n$, where $m$ is the number of rows and $n$ the number of columns.

Matrix Rank

  1. Rank $1$: When the columns of a matrix span a line, the matrix has rank $1$.
  2. Rank $2$: When the columns of a matrix span a plane, the matrix has rank $2$.
  3. A matrix is said to have Full Rank if and only if no column can be written as a linear combination of the other columns, because such a combination would reduce the rank.

Any given matrix of size $m \times n$ can have at most rank $\min(m, n)$. But a square matrix with $n=3$ might only be of rank $2$, because some transformation might squish the column vectors onto a plane – and the rank could even be decreased to rank $1$, if a transformation squishes the vectors onto a straight line.
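NumPy can compute the rank of a matrix directly, which is handy for checking intuitions like these. A quick sketch with three illustrative matrices (these example matrices are mine, not from the text above):

```python
import numpy as np

full = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # columns span all of 3D space
plane = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 0]])  # columns lie in a plane
line = np.array([[1, 2, 3], [2, 4, 6], [3, 6, 9]])   # columns lie on a line

print(np.linalg.matrix_rank(full))   # 3
print(np.linalg.matrix_rank(plane))  # 2
print(np.linalg.matrix_rank(line))   # 1
```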

Let's examine what this means.

Linear Combinations

We define three vectors $\vec{v}$, $\vec{u}$, and $\vec{w}$, and we combine them as column vectors in a matrix $A$.

$$ \vec{v} = \begin{bmatrix} 2\\ 0\\ 1 \end{bmatrix} ,\quad \vec{u} = \begin{bmatrix} 1\\ 2\\ 1 \end{bmatrix} ,\quad \vec{w} = \begin{bmatrix} 1\\ -2\\ 0 \end{bmatrix} ,\quad A = \begin{bmatrix} 2 &1 &1 \\ 0 &2 &-2 \\ 1 &1 &0 \end{bmatrix} $$

If we perform vector addition on $\vec{u}$ and $\vec{w}$, then the result is the vector $\vec{v}$. This means that $\vec{v}$ is a linear combination of $\vec{u}$ and $\vec{w}$, i.e. the matrix $A$ does not have full rank.

$$ \vec{u} + \vec{w} = \begin{bmatrix} 1+1\\ 2+(-2)\\ 1+0 \end{bmatrix} = \begin{bmatrix} 2\\ 0\\ 1 \end{bmatrix} $$

Similarly, if you subtract $\vec{u}$ from $\vec{v}$, the result is $\vec{w}$, so $\vec{w}$ is likewise a linear combination of the other two vectors.
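We can verify both linear combinations, and the resulting rank deficiency, with a few lines of NumPy:

```python
import numpy as np

v = np.array([2, 0, 1])
u = np.array([1, 2, 1])
w = np.array([1, -2, 0])

print(np.array_equal(u + w, v))  # True: v is a linear combination of u and w
print(np.array_equal(v - u, w))  # True: w = v - u

A = np.column_stack((v, u, w))   # the matrix A from above
print(np.linalg.matrix_rank(A))  # 2, so A is not full rank
```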

The geometric interpretation of linear combinations is that we lose information if we apply the transformation associated with the matrix $A$: the transformed vectors no longer span the full three dimensions of space.

The above matrix $A$ is of rank $2$, and as we will see later on, we can compute the rank of a matrix by trying to reduce the matrix to reduced row echelon form. But for now, we can determine if a matrix can be inverted by figuring out the determinant.

Computing The Determinant

When the determinant of a matrix is zero, the matrix is not of full rank, meaning that we cannot invert it. The determinant is just one of many equivalent properties – collected in what is known as the invertible matrix theorem – that you can use to check if a matrix is invertible.

For a $3 \times 3$ matrix, the formula is the following:

$$ \begin{align} det(A) &= det\left(\begin{bmatrix} a&b&c\\ d&e&f\\ g&h&i \end{bmatrix}\right)\\ &=aei + bfg + cdh - ceg - bdi - afh \end{align} $$

So let's plug in the numbers and find the solution.

$$ \begin{align} det(A) &= det\left(\begin{bmatrix} 2 &1 &1 \\ 0 &2 &-2 \\ 1 &1 &0 \end{bmatrix}\right)\\ &=2\cdot2\cdot0 + 1\cdot(-2)\cdot1 + 1\cdot0\cdot1 - 1\cdot2\cdot1 - 1\cdot0\cdot0 - 2\cdot(-2)\cdot1\\ &=0 \end{align} $$

There we have it: the matrix $A$ is non-invertible, since its determinant is zero. In other words, we cannot find the inverse of $A$, but if we instead had another matrix $B$ where $det(B) \ne 0$, we would be able to find its inverse.
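As a sanity check, we can implement the $3 \times 3$ determinant formula directly and compare it against NumPy's built-in `np.linalg.det` (the helper name `det3` is just for illustration):

```python
import numpy as np

def det3(M):
    """Determinant of a 3x3 matrix, using the formula above."""
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a*e*i + b*f*g + c*d*h - c*e*g - b*d*i - a*f*h

A = np.array([[2, 1, 1], [0, 2, -2], [1, 1, 0]])
print(det3(A))                          # 0, so A is not invertible
print(np.isclose(np.linalg.det(A), 0))  # True, NumPy agrees
```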

Note that computing the determinant is only possible for square matrices, i.e. the matrices where the number of rows $m$ is equal to the number of columns $n$.

Inverting A Matrix

Once we have checked that a matrix is invertible, we can go ahead and use Gaussian Elimination to invert it. Specifically, the method is called Gauss-Jordan elimination, a variant of the Gaussian Elimination algorithm.

Initializing The Algorithm

The following matrix $B$ is invertible since the determinant of the matrix is non-zero. We will use the matrix as an example.

$$ B = \begin{bmatrix} 1 &5 &-1 \\ 2 &2 &-2 \\ -1 &4 &3 \end{bmatrix} $$

The way we initialize the algorithm is by writing the matrix $B$ on the left side, separated by a vertical line from the identity matrix, denoted $I$, of the same size. The identity matrix is a matrix with ones through the diagonal from top left towards the bottom right (the main diagonal) and zeros in the rest of the entries.

$$ [B|I] =\left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 2 &2 &-2 &0 &1 &0\\ -1 &4 &3 &0 &0 &1 \end{array} \right] $$

When performing row operations, we apply them to both the matrix $B$ and $I$. Once $B$ has been converted to reduced row echelon form, the right-hand side, which started as the identity matrix $I$, will hold the inverse of $B$.
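In NumPy, building the augmented matrix $[B|I]$ is a one-liner with `np.hstack`:

```python
import numpy as np

B = np.array([[1, 5, -1], [2, 2, -2], [-1, 4, 3]], dtype=float)
aug = np.hstack((B, np.eye(3)))  # the augmented matrix [B|I]
print(aug)
```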

The goal of Gauss-Jordan elimination is to convert a matrix to reduced row echelon form.

Let's explore what this means for a minute.

Reduced Row Echelon Form

For a matrix to be in reduced row echelon form, it must satisfy the following conditions:

  1. In every non-zero row, all entries are $0$'s up until the first occurrence of the number $1$, called the leading $1$.
  2. The leading $1$ in a row is always to the right of the leading $1$ in the row above.
  3. All rows that only consist of $0$'s are placed below rows that do not.
  4. The column containing the leading $1$ of a row has $0$'s in every other row. In other words, a leading $1$ is the only non-zero entry in its column.

These conditions imply that every entry below and to the left of a leading $1$ is $0$.

Here are examples of matrices that ARE in reduced row echelon form:

$$ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} ,\quad B = \begin{bmatrix} 1 & -2 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix} ,\quad C = \begin{bmatrix} 1 & 0 & -2 & 0 & 3 \\ 0 & 1 & 3 & 0 & 5\\ 0 & 0 & 0 & 1 & 6 \end{bmatrix} $$

Here are examples of matrices that ARE NOT in reduced row echelon form:

$$ A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} ,\quad B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix} ,\quad C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} ,\quad D = \begin{bmatrix} 1 & 7 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} $$

Note that a matrix in reduced row echelon form is also in row echelon form.
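The four conditions can be checked mechanically. Below is a small sketch of such a checker in plain Python (the helper name `is_rref` is mine), applied to two of the example matrices above:

```python
def is_rref(M):
    """Check the reduced row echelon form conditions listed above."""
    last_pivot = -1
    seen_zero_row = False
    for row in M:
        nonzero = [j for j, x in enumerate(row) if x != 0]
        if not nonzero:
            seen_zero_row = True          # condition 3: zero rows at the bottom
            continue
        if seen_zero_row:
            return False                  # a non-zero row below a zero row
        j = nonzero[0]
        if row[j] != 1:
            return False                  # condition 1: leading entry must be 1
        if j <= last_pivot:
            return False                  # condition 2: leading 1's move right
        if any(other[j] != 0 for other in M if other is not row):
            return False                  # condition 4: pivot column is clear
        last_pivot = j
    return True

# Examples from above: A is in RREF, D is not (a 7 sits above a leading 1)
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D = [[1, 7, 0], [0, 1, -2], [0, 0, 1]]
print(is_rref(A), is_rref(D))  # True False
```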

Notation And Operations

The following are the operations that we can perform during this elimination process, to convert the matrix to reduced row echelon form.

Generally, the accepted notation for performing row operations looks like the following:

$$R_x \rightarrow R_y $$

This means that the row $R_x$ becomes equal to the row $R_y$. But it is rarely that simple, because the right-hand side can combine three basic row operations, which can quickly become confusing.

  1. Row Addition: We can add one row to another row,
    e.g. $R_3 \rightarrow R_2 + R_3$.
    Note that this also means we can do subtraction,
    e.g. $R_3 \rightarrow -R_2 + R_3$.
  2. Row Multiplication: We can multiply a row by some non-zero number, and combine this with row addition,
    e.g. $R_3 \rightarrow 2R_2 - \frac{1}{4}R_1 + R_3$.
    Note that multiplying by a fraction also lets us perform division.
  3. Row Interchanging: We can swap rows in a matrix, so they take each other's place, e.g. $R_3 \rightarrow R_2$ and $R_2 \rightarrow R_3$.

You can even think of very simple row multiplications, such as dividing a row by $2$ or multiplying a row by $3$.

When performing these row operations by hand, it is important to keep your work clean and organized, or you will quickly lose sight of what you are doing. All the arithmetic can be hard to keep track of.
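In code, the three row operations map directly onto NumPy array assignments. A short sketch, using the matrix $B$ from earlier as scratch material:

```python
import numpy as np

M = np.array([[1, 5, -1], [2, 2, -2], [-1, 4, 3]], dtype=float)

# Row addition: R2 -> -2*R1 + R2
M[1] = -2 * M[0] + M[1]

# Row multiplication: R2 -> (1/-8)*R2
M[1] = M[1] / -8

# Row interchanging: swap R1 and R3
M[[0, 2]] = M[[2, 0]]

print(M)
```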

Complete Example

We finally have it all together: we understand what reduced row echelon form is, and we are familiar with the operations we can perform to convert a matrix to that form.

Now, it is time to look at an example and apply what we just learned. That is, apply the Gauss-Jordan algorithm to invert a matrix. Remember that the goal is to convert the matrix to reduced row echelon form, which means the matrix must satisfy certain conditions listed in the Reduced Row Echelon Form section.

$$ [B|I] =\left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 2 &2 &-2 &0 &1 &0\\ -1 &4 &3 &0 &0 &1 \end{array} \right] $$

The goal of the first few operations is to convert the first column so that there are only $0$'s below the first $1$.

  1. Firstly, we add $-2R_1$ to $R_2$
  2. Secondly, we add $R_1$ to $R_3$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 2 &2 &-2 &0 &1 &0\\ -1 &4 &3 &0 &0 &1 \end{array} \right] R_2 \rightarrow -2R_1 + R_2 \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &-8 &0 &-2 &1 &0\\ -1 &4 &3 &0 &0 &1 \end{array} \right] $$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &-8 &0 &-2 &1 &0\\ -1 &4 &3 &0 &0 &1 \end{array} \right] R_3 \rightarrow R_1 + R_3 \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &-8 &0 &-2 &1 &0\\ 0 &9 &2 &1 &0 &1 \end{array} \right] $$

For the second row, we can see that there is no leading $1$, so we have to change that. We also need there to be only $0$'s below and to the left of the first $1$ in the second row.

  1. Firstly, we divide $R_2$ by $-8$
  2. Secondly, we add $-9R_2$ to $R_3$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &-8 &0 &-2 &1 &0\\ 0 &9 &2 &1 &0 &1 \end{array} \right] R_2 \rightarrow \frac{1}{-8}R_2 \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &9 &2 &1 &0 &1 \end{array} \right] $$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &9 &2 &1 &0 &1 \end{array} \right] R_3 \rightarrow -9R_2 + R_3 \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &0 &2 &-1.25 &1.125 &1 \end{array} \right] $$

Next, we take care of the third row, where the leading entry is a $2$ rather than a $1$. We fix this by dividing $R_3$ by $2$.

$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &0 &2 &-1.25 &1.125 &1 \end{array} \right] R_3 \rightarrow \frac{1}{2}R_3 \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &0 &1 &-0.625 & 0.5625 &0.5 \end{array} \right] $$

Let me remind you of rule 4 from the Reduced Row Echelon Form section.

The column containing the leading $1$ of a row has $0$'s in every other row. In other words, a leading $1$ is the only non-zero entry in its column.

This means we need to get rid of the $5$ and $-1$ in the first row. In the following steps, we also convert some of the entries to fractions instead of decimals to better fit them on the screen.

  1. Firstly, we add $R_3$ to $R_1$
  2. Secondly, we add $-5R_2$ to $R_1$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &-1 &1 &0 &0\\ 0 &1 &0 &0.25 &-0.125 &0\\ 0 &0 &1 &-0.625 & 0.5625 &0.5 \end{array} \right] R_1 \rightarrow R_3 + R_1 \left[ \begin{array}{ccc|ccc} 1 &5 &0 &\frac{3}{8} &\frac{9}{16} &\frac{1}{2}\\ 0 &1 &0 &\frac{1}{4} &-\frac{1}{8} &0\\ 0 &0 &1 &-\frac{5}{8} & \frac{9}{16} &\frac{1}{2} \end{array} \right] $$
$$ \left[ \begin{array}{ccc|ccc} 1 &5 &0 &\frac{3}{8} &\frac{9}{16} &\frac{1}{2}\\ 0 &1 &0 &\frac{1}{4} &-\frac{1}{8} &0\\ 0 &0 &1 &-\frac{5}{8} & \frac{9}{16} &\frac{1}{2} \end{array} \right] R_1 \rightarrow -5R_2 + R_1 \left[ \begin{array}{ccc|ccc} 1 &0 &0 &-\frac{7}{8} &1\frac{3}{16} &\frac{1}{2}\\ 0 &1 &0 &\frac{1}{4} &-\frac{1}{8} &0\\ 0 &0 &1 &-\frac{5}{8} & \frac{9}{16} &\frac{1}{2} \end{array} \right] $$

And that's it, we have converted matrix $B$ to reduced row echelon form, which means the identity matrix $I$ has been converted to the inverse of $B$.

$$ B^{-1} = \begin{bmatrix} -\frac{7}{8} &1\frac{3}{16} &\frac{1}{2}\\ \frac{1}{4} &-\frac{1}{8} &0\\ -\frac{5}{8} & \frac{9}{16} &\frac{1}{2} \end{bmatrix} = \begin{bmatrix} -0.875 & 1.1875 &0.5\\ 0.25 &-0.125 &0\\ -0.625 & 0.5625 &0.5 \end{bmatrix} $$
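The whole Gauss-Jordan procedure can also be automated. Below is a minimal sketch (the function name `invert` is mine) that assumes the input matrix has already been checked to have a non-zero determinant; it additionally swaps in the largest available pivot at each step, which keeps the arithmetic numerically stable:

```python
import numpy as np

def invert(B):
    """Invert a square matrix with Gauss-Jordan elimination,
    mirroring the row operations performed above.
    Assumes det(B) != 0 has already been verified."""
    n = B.shape[0]
    aug = np.hstack((B.astype(float), np.eye(n)))  # the augmented matrix [B|I]
    for col in range(n):
        # Row interchange: bring the largest pivot into position
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        aug[[col, pivot]] = aug[[pivot, col]]
        # Row multiplication: scale so the leading entry becomes 1
        aug[col] /= aug[col, col]
        # Row addition: clear every other entry in this column
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]  # the right half now holds the inverse

B = np.array([[1, 5, -1], [2, 2, -2], [-1, 4, 3]])
print(invert(B))
```

For the matrix $B$ this returns the same inverse we derived by hand.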

With very basic programming skills, you can use Python with the NumPy package to perform the inverse operation quite easily.

```python
import numpy as np

B = np.array([[1, 5, -1], [2, 2, -2], [-1, 4, 3]])

inv = np.linalg.inv(B)
print(inv)
```

This yields the same result we got from our row operations, and it can be a good tool to check your results when doing elimination by hand.

```
array([[-0.875 ,  1.1875,  0.5   ],
       [ 0.25  , -0.125 ,  0.    ],
       [-0.625 ,  0.5625,  0.5   ]])
```

References

A good video to understand the general concepts of rank, determinants, and inverse matrices.

Another example of how to find the inverse of a matrix, by Khan Academy.