What is a vector?
What is vector addition?
What are scalars?

This is the first post in a series on the fundamentals of Linear Algebra. The questions above are what I will focus on illustrating. At the end of the series, there will be a recap of how the Linear Algebra concepts you learned here relate to Machine Learning, followed by a smaller project that uses some of the most important of them.


First, how could we define a vector? In mathematics, we can think of a vector as an arrow in a coordinate system. Such a vector has vector coordinates written in a column, where the top number corresponds to x and the bottom number to y. The notation for the vector below would be $\vec{v}=\begin{bmatrix}2\\1\end{bmatrix}$.

We could also read it as "vector v goes 2 along the x-axis and 1 along the y-axis". It is almost always assumed that you draw from the origin, which is (0,0).

Basic vector drawn from origin $(0,0)$ to the point $(2,1)$, that forms the vector $\vec{v}=\begin{bmatrix}2\\1\end{bmatrix}$

We can transfer the above directly to Computer Science, where you could use the data type List, which contains n values indexed from $0$ to $n-1$. If we had a list of vector coordinates, written in this notation and order, $(2,4)$, $(2, -4)$ and $(4,0)$, we could use the list to plot vectors on a coordinate system. Notice that the first number indicates x and the second indicates y. This could continue as we add dimensions. In this Python code, we cheated a fair bit, as we are not directly using the mathematical syntax.

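```python
import numpy as np
import matplotlib.pyplot as plt

# Each row is [startX, startY, dx, dy]: a start point plus the vector's x and y coordinates.
soa = np.array([[0, 0, 2, 4], [2, 4, 2, -4], [0, 0, 4, 0]])
X, Y, U, V = zip(*soa)
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
# quiver does not fit the axes to the arrow tips, so set the limits by hand.
plt.xlim(-1, 5)
plt.ylim(-5, 5)
plt.show()
```

Quick explanation: We create an outer array with three inner arrays, and fill each inner array with values corresponding to [startX, startY, dx, dy]. startX and startY give the point where the arrow starts, and dx and dy are the vector's x and y coordinates, so the arrow ends at (startX + dx, startY + dy). We unpack the columns into X, Y, U and V and pass them to plt.quiver, which plots the arrows and yields the result below: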

The outcome of the Python script
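Since `plt.quiver` takes the x and y components of each arrow rather than its end point, a small helper can turn a start point and an end point into the four numbers used per arrow in the script. This is only a minimal sketch: the function name `to_quiver_row` is hypothetical and not part of Matplotlib.

```python
def to_quiver_row(start, end):
    """Return [startX, startY, dx, dy] for an arrow drawn from start to end."""
    start_x, start_y = start
    end_x, end_y = end
    # The components are the differences between the end and start coordinates.
    return [start_x, start_y, end_x - start_x, end_y - start_y]


# The arrow from (2, 4) to (4, 0) has components (2, -4).
row = to_quiver_row((2, 4), (4, 0))
print(row)  # [2, 4, 2, -4]
```

The printed row matches the second inner array in the script, which is the arrow that starts at the head of the first vector.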

The next incredibly important thing is called Vector Addition. Suppose we have two vectors $\vec{u}=\begin{bmatrix}2\\1\end{bmatrix}$ and $\vec{v}=\begin{bmatrix}2\\-2\end{bmatrix}$.

How would we add these two vectors? It's quite simple. Both vectors have an x and a y value, corresponding to the first and second number in the vector. Writing $\vec{u}$'s values as $x_1, y_1$ and $\vec{v}$'s as $x_2, y_2$, we would add them like this:

$$ \vec{u} + \vec{v} = \begin{bmatrix} x_1 + x_2\\ y_1 + y_2 \end{bmatrix} $$

More generally, the same rule works for vectors with any number of dimensions n, going way further than just 2-dimensional:

$$ \vec{u} + \vec{v} = (u_1+v_1,u_2+v_2, ..., u_n+v_n) $$
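The formula can be sketched in plain Python with lists, in the spirit of the List data type mentioned earlier. The function name `add_vectors` is an assumption made for illustration, not something from a library.

```python
def add_vectors(u, v):
    """Add two vectors given as lists of equal length, coordinate by coordinate."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    return [u_i + v_i for u_i, v_i in zip(u, v)]


print(add_vectors([2, 1], [2, -2]))              # [4, -1]
print(add_vectors([1, 2, 3, 4], [4, 3, 2, 1]))   # [5, 5, 5, 5]
```

The same function handles 2 or 1000 dimensions, since it only pairs up coordinates at the same index.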

We can also add more than two vectors at once. If we place n vectors of m dimensions side by side as the columns of an $m \times n$ matrix, where $v_{ij}$ is the i-th coordinate of the j-th vector, their sum is found by adding along each row:

$$ \vec{v}_1 + \vec{v}_2 + \cdots + \vec{v}_n = \begin{bmatrix} v_{11} + v_{12} + \cdots + v_{1n}\\ v_{21} + v_{22} + \cdots + v_{2n}\\ \vdots\\ v_{m1} + v_{m2} + \cdots + v_{mn} \end{bmatrix} $$
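One way to read this matrix is that each column holds one vector and each row collects one coordinate, so adding along a row sums that coordinate over all vectors. A minimal sketch with nested lists, using the three coordinate pairs from the list example; the variable names are chosen here for illustration.

```python
# Three 2-dimensional vectors, (2, 4), (2, -4) and (4, 0), stored as the
# columns of a 2 x 3 matrix. Row i holds the i-th coordinate of every vector.
matrix = [
    [2, 2, 4],    # x coordinates
    [4, -4, 0],   # y coordinates
]

# Summing each row adds the vectors coordinate by coordinate.
total = [sum(row) for row in matrix]
print(total)  # [8, 0]
```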

It would certainly be impossible to plot a 1000-dimensional vector, but it is entirely possible to work with data sets that have that many dimensions. If we were to stay in this 2-dimensional space, we could just continue to plot points and draw vectors from one point to another.

Here is an illustrative example with the two vectors $\vec{u}=\begin{bmatrix}2\\1\end{bmatrix}$ and $\vec{v}=\begin{bmatrix}2\\-2\end{bmatrix}$.

Notice how we have placed the two vectors now. We have a tail and head, where tail is the start of the vector and head the end of the vector. So we have taken $\vec{v}$'s tail and placed it at $\vec{u}$'s head.

What do you think the sum $\vec{u} + \vec{v}$ is? The sum is the vector from $\vec{u}$'s tail to $\vec{v}$'s head. How could we get its exact coordinates without even having this image? We could just add them up using vector addition; from this we get a new vector which we will call $\vec{w}$:

$$ \vec{w} = \vec{u} + \vec{v} = (u_1+v_1,u_2+v_2) = (2+2,1+(-2)) = (4, -1) $$
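The tail-to-head picture can be checked with a few lines of Python. This is a small sketch that walks from the origin along $\vec{u}$ and then along $\vec{v}$; the variable names are assumptions made for illustration.

```python
u = [2, 1]
v = [2, -2]

origin = [0, 0]
# Walking along u from the origin lands on u's head.
head_of_u = [origin[0] + u[0], origin[1] + u[1]]
# v's tail sits at u's head, so walking along v from there lands on v's head.
head_of_v = [head_of_u[0] + v[0], head_of_u[1] + v[1]]

print(head_of_u)  # [2, 1]
print(head_of_v)  # [4, -1]
```

The final position is the same $(4, -1)$ that coordinate-wise addition gives for $\vec{w}$.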

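One important rule to remember is that this picture only works when the second vector's tail is placed at the first vector's head, so that the second vector extends, in some direction, from where the first one ends. The two vectors must also have the same number of dimensions.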

Now, let's introduce scalars. A scalar is any real number. We multiply a vector by a scalar by multiplying each of its vector coordinates by that number. Essentially we are scaling some vector by some multiplier:

$$ \vec{v} = 3 \begin{bmatrix} 2\\ 1 \end{bmatrix} = \begin{bmatrix} 3(2)\\ 3(1) \end{bmatrix} = \begin{bmatrix} 6\\ 3 \end{bmatrix} $$

We used the real number 3, but it could also be $2/3$ or $-3$. This means that a vector can reach any point on the line it points along, if we scale it by the right number; a negative scalar flips the vector to point the opposite way. It is one of the most used operations in Linear Algebra, and in the next post, I will explain how we can extend the meaning of scalars.
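Scaling can be sketched in Python as one multiplication per coordinate. The function name `scale` is hypothetical, and $0.5$ stands in for a fractional scalar here so the printed values stay exact.

```python
def scale(c, v):
    """Multiply every coordinate of the vector v by the scalar c."""
    return [c * v_i for v_i in v]


v = [2, 1]
print(scale(3, v))    # [6, 3]
print(scale(-3, v))   # [-6, -3]
print(scale(0.5, v))  # [1.0, 0.5]
```

Note how the scalar $-3$ produces a vector of the same length as the scalar $3$, but pointing in the opposite direction.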


Summary (answering the questions at the top):

  1. What is a vector?
    An arrow in space, defined by vector coordinates. A single vector can be plotted in a coordinate system, or be written as a column matrix, which could look like this:
    $\begin{bmatrix}3\\2\end{bmatrix}$, where 3 and 2 are the vector coordinates x and y in a coordinate system.
  2. What is vector addition?
    Vector addition is the operation of adding vectors together, or more precisely, adding the matching vector coordinates of each vector that you want to perform this operation on. The definition of vector addition for two vectors in n-dimensional space is as follows:
    $$\vec{u} + \vec{v} = (u_1+v_1,u_2+v_2, ..., u_n+v_n)$$
    This operation works with any number of vectors and any number of dimensions, as long as all the vectors have the same number of dimensions.
  3. What are scalars?
    A scalar is any real number that we can multiply into a vector by multiplying each of its vector coordinates. The operation can easily be performed in a matrix:
    $$\vec{v}=3\begin{bmatrix}2\\1\end{bmatrix} = \begin{bmatrix}3(2)\\3(1)\end{bmatrix} = \begin{bmatrix}6\\3\end{bmatrix}$$