6. Linear Transformations#

You will already be familiar with the use of functions in mathematics to study the mapping from one set of inputs to another set of outputs. The notation used to denote functions is of the form \(f: X \to Y\) where \(f\) is the name of the function, \(X\) is the set of inputs known as the domain and \(Y\) is the set of possible outputs known as the codomain. The mapping which defines the relationship between the domain and codomain is written \(y = f(x)\) where \(x\) is a member of the domain and \(y\) is a member of the codomain.

In linear algebra we study the linear mapping of a set of vectors to another set of vectors so we define \(T: V \to W\) where \(V\) and \(W\) are vector spaces and the mapping from an input vector \(\vec{u} \in V\) to an output vector \(\vec{w} \in W\) is given by \(\vec{w} = T(\vec{u})\).

Linear transformations have lots of uses in mathematics and computing. A good example is in the field of computer graphics and computer games where they are fundamental to the manipulation and visualisation of three-dimensional objects.

We begin with the formal definition of a linear transformation.

Definition 6.1 (Linear transformation)

If \(V\) and \(W\) are two vector spaces over the same field \(F\), then a linear transformation (or linear mapping) is a mapping \(T: V \to W\) such that for any two vectors \(\vec{u}, \vec{v} \in V\) and any scalar \(\alpha \in F\) the following conditions hold

  • addition operation: \(T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v})\);

  • scalar multiplication: \(T(\alpha \vec{u}) = \alpha T(\vec{u})\).

The result of applying a linear transformation to an object is known as the image.

For example, let \(V = \mathbb{R}^2\) and \(W = \mathbb{R}^3\) then \(T : V \to W\) defined by \(T : (x, y)^\mathsf{T} \mapsto (x, y, 0)^\mathsf{T}\) is a linear transformation. Let \(\vec{u} = (u_1, u_2)^\mathsf{T}, \vec{v} = (v_1, v_2)^\mathsf{T} \in \mathbb{R}^2\) and \(\alpha \in \mathbb{R}\) then

\[\begin{split} \begin{align*} T (\vec{u} + \vec{v}) &= T \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ 0 \end{pmatrix}, \\ T (\vec{u}) + T(\vec{v}) &= \begin{pmatrix} u_1 \\ u_2 \\ 0 \end{pmatrix} + \begin{pmatrix} v_1 \\ v_2 \\ 0 \end{pmatrix} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ 0 \end{pmatrix}, \end{align*} \end{split}\]

so \(T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v})\) and the addition condition is satisfied. Similarly

\[\begin{split} \begin{align*} T(\alpha \vec{u}) &= T \begin{pmatrix} \alpha u_1 \\ \alpha u_2 \end{pmatrix} = \begin{pmatrix} \alpha u_1 \\ \alpha u_2 \\ 0 \end{pmatrix}, \\ \alpha T(\vec{u}) &= \alpha \begin{pmatrix} u_1 \\ u_2 \\ 0 \end{pmatrix} = \begin{pmatrix} \alpha u_1 \\ \alpha u_2 \\ 0 \end{pmatrix}, \end{align*} \end{split}\]

so \(T(\alpha \vec{u}) = \alpha T(\vec{u})\) and the scalar multiplication condition is satisfied. Combined with the addition condition, this shows that \(T\) is a linear transformation. We can combine the addition and scalar multiplication conditions to give a single condition.
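The two linearity conditions can also be sanity-checked numerically. A minimal NumPy sketch (the helper `T` below is our encoding of the embedding \((x, y)^\mathsf{T} \mapsto (x, y, 0)^\mathsf{T}\) from this example):

```python
import numpy as np

# The embedding T : R^2 -> R^3, T(x, y) = (x, y, 0)
def T(v):
    return np.array([v[0], v[1], 0.0])

rng = np.random.default_rng(42)
u, v = rng.random(2), rng.random(2)
alpha = rng.random()

# Check both linearity conditions on random inputs
print(np.allclose(T(u + v), T(u) + T(v)))       # addition: True
print(np.allclose(T(alpha * u), alpha * T(u)))  # scalar multiplication: True
```

Random spot checks like this cannot prove linearity (only the algebraic argument above does), but they are a quick way to catch a transformation that is *not* linear.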

Definition 6.2 (Linear transformation condition)

A transformation \(T : V \to W\) is a linear transformation if the following condition is satisfied for any \(\vec{u}, \vec{v} \in V\) and \(\alpha \in F\)

(6.1)#\[ T(\vec{u} + \alpha \vec{v}) = T(\vec{u}) + \alpha T(\vec{v}). \]

Example 6.1

Determine which of the following transformations are linear transformations

(i)   \(T: \mathbb{R}^3 \to \mathbb{R}^2\) defined by \(T: (x, y, z)^\mathsf{T} \mapsto (x, y)^\mathsf{T}\)

Solution (click to show)

Let \(\vec{u} = (u_1, u_2, u_3)^\mathsf{T}, \vec{v} = (v_1, v_2, v_3)^\mathsf{T} \in \mathbb{R}^3\) and \(\alpha \in \mathbb{R}\) then

\[\begin{split} \begin{align*} T(\vec{u} + \alpha \vec{v}) &= T\left( \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \alpha \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \right) = T \begin{pmatrix} u_1 + \alpha v_1 \\ u_2 + \alpha v_2 \\ u_3 + \alpha v_3 \end{pmatrix} = \begin{pmatrix} u_1 + \alpha v_1 \\ u_2 + \alpha v_2 \end{pmatrix}, \\ T(\vec{u}) + \alpha T(\vec{v}) &= T \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \alpha T \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \alpha \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} u_1 + \alpha v_1 \\ u_2 + \alpha v_2 \end{pmatrix}. \end{align*} \end{split}\]

Since \(T(\vec{u} + \alpha \vec{v}) = T(\vec{u}) + \alpha T(\vec{v})\), the transformation \(T: (x, y, z)^\mathsf{T} \mapsto (x, y)^\mathsf{T}\) is a linear transformation.

(ii)   \(T: \mathbb{R}^3 \to \mathbb{R}^2\) defined by \(T: (x, y, z)^\mathsf{T} \mapsto (x + 3, y)^\mathsf{T}\)

Solution (click to show)

Let \(\vec{u} = (u_1, u_2, u_3)^\mathsf{T}, \vec{v} = (v_1, v_2, v_3)^\mathsf{T} \in \mathbb{R}^3\) and \(\alpha \in \mathbb{R}\) then

\[\begin{split} \begin{align*} T(\vec{u} + \alpha \vec{v}) &= T \begin{pmatrix} u_1 + \alpha v_1 \\ u_2 + \alpha v_2 \\ u_3 + \alpha v_3 \end{pmatrix} = \begin{pmatrix} u_1 + \alpha v_1 + 3 \\ u_2 + \alpha v_2 \end{pmatrix}, \\ T(\vec{u}) + \alpha T(\vec{v}) &= T \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \alpha T \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} u_1 + 3 \\ u_2 \end{pmatrix} + \alpha \begin{pmatrix} v_1 + 3 \\ v_2 \end{pmatrix} = \begin{pmatrix} u_1 + \alpha v_1 + 3 + 3\alpha \\ u_2 + \alpha v_2 \end{pmatrix}. \end{align*} \end{split}\]

Since \(T(\vec{u} + \alpha \vec{v}) \neq T(\vec{u}) + \alpha T(\vec{v})\), the transformation \(T: (x, y, z)^\mathsf{T} \mapsto (x + 3, y)^\mathsf{T}\) is not a linear transformation.

Note that we could have shown this by a counterexample, e.g., let \(\vec{u} = ( 1, 0 , 0 )^\mathsf{T}, \vec{v} = (2, 0, 0)^\mathsf{T} \in \mathbb{R}^3\) then

\[\begin{split} \begin{align*} T(\vec{u} + \vec{v}) &= T \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 6 \\ 0 \end{pmatrix}, \\ T(\vec{u}) + T(\vec{v}) &= \begin{pmatrix} 4 \\ 0 \end{pmatrix} + \begin{pmatrix} 5 \\ 0 \end{pmatrix} = \begin{pmatrix} 9 \\ 0 \end{pmatrix}. \end{align*} \end{split}\]
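This counterexample is easy to check numerically (a sketch in NumPy; the helper `T` is our encoding of the map from part (ii)):

```python
import numpy as np

def T(v):
    # the shift map from part (ii): not linear because of the "+ 3"
    return np.array([v[0] + 3.0, v[1]])

u = np.array([1.0, 0.0, 0.0])
v = np.array([2.0, 0.0, 0.0])

print(T(u + v))     # [6. 0.]
print(T(u) + T(v))  # [9. 0.]  -- the two disagree, so T is not linear
```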

(iii)   \(T: P(\mathbb{R}) \to P(\mathbb{R})\) defined by \(T: p \mapsto p \dfrac{\mathrm{d}p}{\mathrm{d}x}\)

Solution (click to show)

Let \(u = x \in P(\mathbb{R})\) then

\[\begin{split} \begin{align*} T(2u) &= T(2x) = 2x(2) = 4x, \\ 2T(u) &= 2T(x) = 2(x)(1) = 2x, \end{align*} \end{split}\]

therefore \(T(2u) \neq 2T(u)\) and \(T: p \mapsto p \dfrac{\mathrm{d}p}{\mathrm{d}x}\) is not a linear transformation.
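The same counterexample can be reproduced symbolically with SymPy (a sketch; the helper `T` is our encoding of \(p \mapsto p \, \mathrm{d}p/\mathrm{d}x\)):

```python
import sympy as sp

x = sp.Symbol('x')

def T(p):
    # the map p |-> p * dp/dx from part (iii)
    return sp.expand(p * sp.diff(p, x))

u = x
print(T(2 * u))  # 4*x
print(2 * T(u))  # 2*x  -- so T(2u) != 2 T(u)
```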

6.1. Transformation matrices#

For convenience we tend to use matrices to represent linear transformations. Let \(T: V \to W\) be a linear transformation where \(V\) is an \(n\)-dimensional vector space. If \(\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n\}\) is a basis for \(V\) then for a vector \(\vec{u} \in V\)

\[ \vec{u} = u_1 \vec{v}_1 + u_2 \vec{v}_2 + \cdots + u_n \vec{v}_n, \]

and by the definition of a linear transformation we can apply a linear transformation \(T\) to the vectors \(\vec{u}\) and \(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n\)

\[ T(\vec{u}) = u_1 T(\vec{v}_1) + u_2 T(\vec{v}_2) + \cdots + u_n T(\vec{v}_n), \]

so \(T(\vec{u})\) depends on the vectors \(T(\vec{v}_1), T(\vec{v}_2), \ldots, T(\vec{v}_n)\). We can write this as the matrix equation

\[\begin{split} \begin{align*} T(\vec{u}) &= \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ T(\vec{v}_1) & T(\vec{v}_2) & \cdots & T(\vec{v}_n) \\ \downarrow & \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = A \vec{u}. \end{align*} \end{split}\]

In other words we can apply a linear transformation simply by multiplying \(\vec{u}\) by a matrix \(A\).

Definition 6.3 (Transformation matrix)

Let \(T : V \to W\) be a linear transformation and \(A\) be a matrix such that

\[\begin{split} A = \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ T(\vec{v}_1) & T(\vec{v}_2) & \cdots & T(\vec{v}_n) \\ \downarrow & \downarrow & & \downarrow \end{pmatrix} \end{split}\]

then

\[ T(\vec{u}) = A\vec{u}. \]

\(A\) is said to be the matrix representation of the linear transformation \(T\) (also known as the transformation matrix).
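Definition 6.3 translates directly into code: apply \(T\) to each standard basis vector and stack the images as columns. A minimal NumPy sketch (the helper `T` is the embedding \((x, y)^\mathsf{T} \mapsto (x, y, 0)^\mathsf{T}\) from earlier, chosen for illustration):

```python
import numpy as np

# The embedding T : R^2 -> R^3, T(x, y) = (x, y, 0)
def T(v):
    return np.array([v[0], v[1], 0.0])

# Columns of A are the images of the standard basis vectors
n = 2
A = np.column_stack([T(e) for e in np.eye(n)])

# A u now reproduces T(u) for any u
u = np.array([2.0, 5.0])
print(np.allclose(A @ u, T(u)))  # True
```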

Example 6.2

A linear transformation \(T:\mathbb{R}^2 \to \mathbb{R}^2\) is defined by \(T: (x, y)^\mathsf{T} \mapsto (3x + y, x + 2y)^\mathsf{T}\). Calculate the transformation matrix and use it to calculate \(T(1,1)^\mathsf{T}\).

Solution (click to show)

Since we are mapping from \(\mathbb{R}^2\) the transformation matrix is

\[ A = \begin{pmatrix} T(\vec{e}_1) & T(\vec{e}_2) \end{pmatrix} \]

Applying the transformation to the standard basis vectors

\[\begin{split} \begin{align*} T(\vec{e}_1) = T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3(1) + 0 \\ 1 + 2(0) \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \\ T(\vec{e}_2) = T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3(0) + 1 \\ 0 + 2(1) \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \end{align*} \end{split}\]

so the transformation matrix is

\[\begin{split} A = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}. \end{split}\]

Applying the transformation matrix to \((1, 1)^\mathsf{T}\)

\[\begin{split} T\begin{pmatrix} 1 \\ 1 \end{pmatrix} = A \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 3 \end{pmatrix}. \end{split}\]
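The hand calculation above can be reproduced in NumPy (a sketch; the function `T` is our encoding of the rule from Example 6.2):

```python
import numpy as np

def T(v):
    # T : (x, y) |-> (3x + y, x + 2y)
    x, y = v
    return np.array([3 * x + y, x + 2 * y])

# Transformation matrix: images of the standard basis as columns
A = np.column_stack([T(e) for e in np.eye(2)])
print(A)                         # [[3. 1.]
                                 #  [1. 2.]]
print(A @ np.array([1.0, 1.0]))  # [4. 3.]
```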

The effect of the linear transformation from Example 6.2 is illustrated in Fig. 6.1. Note that the transformation \(T\) can be thought of as changing the basis of the vector space. The unit square with respect to the basis \(\{\vec{e}_1, \vec{e}_2\}\) has been transformed into a parallelogram with respect to the basis \(\{ T(\vec{e}_1), T(\vec{e}_2)\}\).

../_images/6_linear_transformation.svg

Fig. 6.1 The effect of applying a linear transformation \(T: (x,y)^\mathsf{T} \mapsto (3x + y, x + 2y)^\mathsf{T}\) to the vector \((1,1)^\mathsf{T}\).#

6.2. Finding the transformation matrix from a set of images#

The calculation of the transformation matrix in Example 6.2 was straightforward as we knew what the transformation was. This will not always be the case: we may know the output of the transformation (known as the image) but not the transformation itself. Consider a linear transformation \(T: \mathbb{R}^n \to \mathbb{R}^m\) applied to vectors \(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\). If \(A\) is the transformation matrix for \(T\) then

\[\begin{split} \begin{align*} A \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \\ \downarrow & \downarrow & & \downarrow \end{pmatrix} = \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ T(\vec{u}_1) & T(\vec{u}_2) & \cdots & T(\vec{u}_n) \\ \downarrow & \downarrow & & \downarrow \end{pmatrix} \end{align*} \end{split}\]

therefore

\[\begin{split} \begin{align*} A &= \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ T(\vec{u}_1) & T(\vec{u}_2) & \cdots & T(\vec{u}_n) \\ \downarrow & \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} \uparrow & \uparrow & & \uparrow \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \\ \downarrow & \downarrow & & \downarrow \end{pmatrix}^{-1} \end{align*} \end{split}\]

Theorem 6.1 (Determining the linear transformation given the inputs and image vectors)

Given a linear transformation \(T: \mathbb{R}^n \to \mathbb{R}^m\) applied to a set of \(n\) linearly independent vectors \(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\) with known image vectors \(T(\vec{u}_1), T(\vec{u}_2), \ldots, T(\vec{u}_n)\), the transformation matrix \(A\) for \(T\) is

(6.2)#\[ A = \begin{pmatrix} T(\vec{u}_1) & T(\vec{u}_2) & \cdots & T(\vec{u}_n) \end{pmatrix} \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix}^{-1}. \]

Example 6.3

Determine the transformation matrix \(A\) for the linear transformation \(T\) such that

\[\begin{split} \begin{align*} T\begin{pmatrix} 1 \\ 1 \end{pmatrix} &= \begin{pmatrix} 4 \\ 3 \end{pmatrix}, & T\begin{pmatrix} 1 \\ 2 \end{pmatrix} &= \begin{pmatrix} 5 \\ 5 \end{pmatrix}. \end{align*} \end{split}\]
Solution (click to show)

The inverse of \((\vec{u}_1, \vec{u}_2)\) is

\[\begin{split} \begin{align*} \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}^{-1} &= \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} \end{align*} \end{split}\]

Right-multiplying the matrix of image vectors by this inverse gives

\[\begin{split} \begin{align*} A &= \begin{pmatrix} 4 & 5 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}. \end{align*} \end{split}\]

This is the transformation matrix from Example 6.2.
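Equation (6.2) can be evaluated numerically. A NumPy sketch for Example 6.3 (using `np.linalg.solve` rather than forming the inverse explicitly, which is generally better conditioned):

```python
import numpy as np

# Columns of U are the input vectors, columns of W their images
U = np.array([[1.0, 1.0],
              [1.0, 2.0]])
W = np.array([[4.0, 5.0],
              [3.0, 5.0]])

# A = W U^{-1}; solving U^T A^T = W^T avoids computing the inverse
A = np.linalg.solve(U.T, W.T).T
print(A)  # [[3. 1.]
          #  [1. 2.]]
```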

6.3. Inverse linear transformation#

Definition 6.4 (Inverse linear transformation)

Let \(T: V \to W\) be a linear transformation with a nonsingular transformation matrix \(A\). Then \(T\) has an inverse transformation, denoted by \(T^{-1}: W \to V\), which reverses the effects of \(T\). If \(\vec{u} \in V\) and \(\vec{v} \in W\) then

\[\begin{split} \begin{align*} \vec{v} &= A \vec{u} \\ \therefore \vec{u} & = A^{-1}\vec{v}, \end{align*} \end{split}\]

where \(A^{-1}\) is the transformation matrix for \(T^{-1}\).

Example 6.4

Determine the inverse of the transformation \(T: \mathbb{R}^2 \to \mathbb{R}^2\) defined by \(T: (x, y)^\mathsf{T} \mapsto (3 x + y, x + 2 y)^\mathsf{T}\) and calculate \(T^{-1}(4,3)^\mathsf{T}\).

Solution (click to show)

We saw in Example 6.2 that the transformation matrix for \(T\) is

\[\begin{split} A = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}, \end{split}\]

which has the inverse

\[\begin{split} A^{-1} = \frac{1}{5} \begin{pmatrix} 2 & -1 \\ -1 & 3 \end{pmatrix}. \end{split}\]

Determining the inverse transformation

\[\begin{split} \begin{align*} T^{-1}\begin{pmatrix} x \\ y \end{pmatrix} &= A^{-1} \cdot \begin{pmatrix} x \\ y \end{pmatrix} = \frac{1}{5} \begin{pmatrix} 2 & -1 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \\ &= \begin{pmatrix} \frac{2}{5}x - \frac{1}{5}y \\ -\frac{1}{5}x + \frac{3}{5}y \end{pmatrix}. \end{align*} \end{split}\]

Calculating \(T^{-1}\begin{pmatrix} 4 \\ 3 \end{pmatrix}\)

\[\begin{split} \begin{align*} A^{-1} \begin{pmatrix} 4 \\ 3 \end{pmatrix} &= \frac{1}{5} \begin{pmatrix} 2 & -1 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} 4 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \end{align*} \end{split}\]
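The same computation in NumPy (a sketch; solving the linear system \(A\vec{u} = \vec{v}\) is preferable to computing \(A^{-1}\) explicitly):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])  # transformation matrix from Example 6.2

# Apply the inverse map T^{-1} to (4, 3)
v = np.array([4.0, 3.0])
u = np.linalg.solve(A, v)
print(u)                      # [1. 1.]
print(np.allclose(A @ u, v))  # True: T maps u back onto v
```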