6.5. QR decomposition#
QR decomposition is another decomposition method that factorises a matrix into the product of two matrices \(Q\) and \(R\) such that \(A = QR\), where \(Q\) is an orthogonal matrix and \(R\) is an upper triangular matrix.
Definition 6.2 (Orthogonal vectors)
A set of vectors \(\lbrace \mathbf{v}_1 ,\mathbf{v}_2 ,\mathbf{v}_3 ,\dots \rbrace\) is said to be orthogonal if \(\mathbf{v}_i \cdot \mathbf{v}_j =0\) for \(i\not= j\). Furthermore, the set is said to be orthonormal if the \(\mathbf{v}_i\) are all unit vectors.
Definition 6.3 (Orthogonal matrix)
An orthogonal matrix is a matrix whose columns form a set of orthonormal vectors. A matrix \(A\) is orthogonal if \(A^\mathsf{T} A = I\).
For example consider the matrix
This is an orthogonal matrix since
Furthermore it is an orthonormal matrix since the magnitudes of the columns of \(A\) are both 1.
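As a quick check of these definitions, the short NumPy sketch below (an illustrative example, not part of the text) builds a \(2 \times 2\) rotation matrix, whose columns are orthonormal, and verifies that \(A^\mathsf{T}A = I\).

import numpy as np

# A rotation matrix has orthonormal columns, so it is orthogonal
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.dot(A[:,0], A[:,1]))           # the dot product of the columns is 0
print(np.linalg.norm(A, axis=0))        # each column has magnitude 1
print(np.allclose(A.T @ A, np.eye(2)))  # A^T A = I, so A is orthogonal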
6.5.1. QR decomposition using the Gram-Schmidt process#
Consider the \(3 \times 3\) matrix \(A\) represented as the concatenation of the column vectors \(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\)
where \(\mathbf{a}_j = (a_{1j}, a_{2j}, a_{3j})\). To calculate the orthogonal \(Q\) matrix we need a set of vectors \(\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}\) that span the same space as \(\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}\) but are orthogonal to each other. One way to do this is using the Gram-Schmidt process. Consider the diagram shown in Fig. 6.1 that shows the first two non-orthogonal vectors \(\mathbf{a}_1\) and \(\mathbf{a}_2\).
Fig. 6.1 The Gram-Schmidt process#
We begin by letting \(\mathbf{u}_1 = \mathbf{a}_1\) and seek to find a vector \(\mathbf{u}_2\) that is orthogonal to \(\mathbf{u}_1\) and in the same span as \(\mathbf{a}_2\). To do this we subtract the vector projection of \(\mathbf{a}_2\) onto \(\mathbf{u}_1\), i.e.,
We can simplify things by normalising \(\mathbf{u}_1\) such that \(\mathbf{q}_1 = \dfrac{\mathbf{u}_1}{\|\mathbf{u}_1\|}\) so we have
which we also normalise to give \(\mathbf{q}_2 = \dfrac{\mathbf{u}_2}{\|\mathbf{u}_2\|}\). For the next vector \(\mathbf{a}_3\) we want the vector \(\mathbf{u}_3\) to be orthogonal to both \(\mathbf{q}_1\) and \(\mathbf{q}_2\) so we subtract both the vector projections of \(\mathbf{a}_3\) onto \(\mathbf{q}_1\) and \(\mathbf{q}_2\)
which is normalised to give \(\mathbf{q}_3\). Using \(\mathbf{u}_i = (\mathbf{q}_i \cdot \mathbf{a}_i) \mathbf{q}_i\) and rearranging (6.3) and (6.4) to make \(\mathbf{a}_i\) the subject gives
The columns of the \(Q\) matrix are formed using the vectors \(\mathbf{q}_i\)
and the elements of the \(R\) matrix are formed from the dot products of the \(\mathbf{q}_i\) and \(\mathbf{a}_j\) vectors such that
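As a concrete illustration (not part of the original text), the short sketch below applies these formulas to the columns of a random \(3 \times 3\) matrix and confirms that the resulting vectors \(\mathbf{q}_1\), \(\mathbf{q}_2\) and \(\mathbf{q}_3\) are orthonormal.

import numpy as np

rng = np.random.default_rng(1)
a1, a2, a3 = rng.random((3, 3)).T                    # columns of a random 3x3 matrix

q1 = a1 / np.linalg.norm(a1)                         # q1 = u1 / |u1|
u2 = a2 - np.dot(q1, a2) * q1                        # subtract the projection onto q1
q2 = u2 / np.linalg.norm(u2)
u3 = a3 - np.dot(q1, a3) * q1 - np.dot(q2, a3) * q2  # subtract both projections
q3 = u3 / np.linalg.norm(u3)

Q = np.column_stack((q1, q2, q3))
print(np.allclose(Q.T @ Q, np.eye(3)))               # True: the q_i are orthonormal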
Positive diagonal elements of \(R\)#
It has become standard convention to impose a condition that the diagonal elements of \(R\) must be positive so that we get a unique factorisation. If \(A = QR\) then multiplying a column of \(Q\) by \(-1\) and the corresponding row of \(R\) by \(-1\) does not change the product \(QR\). So if any of the diagonal elements of \(R\) are negative we multiply the corresponding column of \(Q\) and row of \(R\) by \(-1\).
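To see why this gives a unique factorisation, the snippet below (a small illustrative check) flips the sign of one column of \(Q\) and the corresponding row of \(R\) and confirms that the product \(QR\) is unchanged.

import numpy as np

rng = np.random.default_rng(2)
A = rng.random((3, 3))
Q, R = np.linalg.qr(A)               # any QR factorisation of A

Q2, R2 = Q.copy(), R.copy()
Q2[:,1] *= -1                        # flip the sign of a column of Q ...
R2[1,:] *= -1                        # ... and the corresponding row of R

print(np.allclose(Q @ R, Q2 @ R2))   # True: the product QR is unchanged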
Algorithm 6.7 (QR decomposition using the Gram-Schmidt process)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times n\) orthogonal matrix \(Q\) and an \(n \times n\) upper triangular matrix \(R\) such that \(A = QR\).
For \(j = 1, \ldots, n\) do
For \(i = 1, \ldots, j - 1\) do
\(r_{ij} \gets \mathbf{q}_i \cdot \mathbf{a}_j\)
\(\mathbf{u}_j \gets \mathbf{a}_j - \displaystyle\sum_{i=1}^{j-1} r_{ij} \mathbf{q}_i\)
\(r_{jj} \gets \| \mathbf{u}_j \|\)
\(\mathbf{q}_j \gets \dfrac{\mathbf{u}_j}{r_{jj}}\)
\(Q \gets (\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n)\)
For \(i = 1, \ldots, n\) do
If \(r_{ii} < 0\) then
\(R(i,: ) \gets -R(i,:)\)
\(Q(:,i) \gets -Q(:,i)\)
Return \(Q\) and \(R\)
Example 6.8
Calculate the QR decomposition of the following matrix using the Gram-Schmidt process
Solution
The diagonal elements of \(R\) are all positive so we don’t need to make any adjustments. Thus the \(Q\) and \(R\) matrices are given by
We can check that this is correct by verifying that \(A = QR\) and that \(Q^\mathsf{T}Q = I\).
Code#
The code below defines a function called qr_gramschmidt() which calculates the QR decomposition of a matrix A using the Gram-Schmidt process.
import numpy as np

def qr_gramschmidt(A):
    n = A.shape[1]
    Q, R = np.copy(A).astype(float), np.zeros((n, n))
    for j in range(n):
        # Subtract the projections of a_j onto the previous q_i vectors
        for i in range(j):
            R[i,j] = np.dot(Q[:,i], A[:,j])
            Q[:,j] = Q[:,j] - R[i,j] * Q[:,i]
        # Normalise to give q_j
        R[j,j] = np.linalg.norm(Q[:,j])
        Q[:,j] = Q[:,j] / R[j,j]
    # Ensure the diagonal elements of R are positive
    for i in range(n):
        if R[i,i] < 0:
            R[i,:] *= -1
            Q[:,i] *= -1
    return Q, R
function [Q, R] = qr_gramschmidt(A)

[~, n] = size(A);
Q = A;
R = zeros(n, n);
for j = 1 : n
    % Subtract the projections of a_j onto the previous q_i vectors
    for i = 1 : j - 1
        R(i,j) = dot(Q(:,i), A(:,j));
        Q(:,j) = Q(:,j) - R(i,j) * Q(:,i);
    end
    % Normalise to give q_j
    R(j,j) = norm(Q(:,j));
    Q(:,j) = Q(:,j) / R(j,j);
end

% Ensure the diagonal elements of R are positive
for i = 1 : n
    if R(i,i) < 0
        R(i,:) = -R(i,:);
        Q(:,i) = -Q(:,i);
    end
end

end
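As a quick sanity check (not part of the original text), the Python function above can be applied to a small test matrix and the results verified against the defining properties \(A = QR\) and \(Q^\mathsf{T}Q = I\).

import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])           # an arbitrary full-rank test matrix

Q, R = qr_gramschmidt(A)

print(np.allclose(Q @ R, A))              # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))    # Q is orthogonal
print(np.allclose(R, np.triu(R)))         # R is upper triangular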
6.5.2. QR decomposition using Householder reflections#
Another method for calculating the QR decomposition of a matrix is to use Householder reflections.
Definition 6.4 (Householder matrix)
The Householder matrix is an orthogonal reflection matrix of the form \(H = I - 2\dfrac{\mathbf{v}\mathbf{v}^\mathsf{T}}{\mathbf{v}^\mathsf{T}\mathbf{v}}\), where \(\mathbf{v}\) is a non-zero column vector. The matrix \(H\) is symmetric and orthogonal.
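As a small check of these properties (with an arbitrary choice of \(\mathbf{v}\), not taken from the text), the sketch below constructs a Householder matrix and confirms that it is symmetric and orthogonal, and therefore its own inverse.

import numpy as np

v = np.array([[1.0], [2.0], [2.0]])        # any non-zero column vector
H = np.eye(3) - 2 * (v @ v.T) / (v.T @ v)  # the Householder matrix

print(np.allclose(H, H.T))                 # H is symmetric
print(np.allclose(H.T @ H, np.eye(3)))     # H is orthogonal
print(np.allclose(H @ H, np.eye(3)))       # so H is its own inverse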
Fig. 6.2 The vector \(\mathbf{x}\) is reflected about the dashed line so that it is parallel to the basis vector \(\mathbf{e}_1\).#
Consider the diagram in Fig. 6.2 where the vector \(\mathbf{x}\) is reflected about the hyperplane represented by the dashed line so that it is parallel to the first basis vector of the standard basis \(\mathbf{e}_1 = (1, 0, \ldots, 0)^\mathsf{T}\). The Householder matrix that achieves this reflection is written in terms of the vector \(\mathbf{v}\), which is orthogonal to the dashed line and calculated using \(\mathbf{v} = \mathbf{x} - \|\mathbf{x}\|\mathbf{e}_1\).
If \(\mathbf{x}\) is close to being parallel to \(\mathbf{e}_1\) then \(\mathbf{v}\) will be close to \(\mathbf{0}\) and the calculation of (6.5) will involve division by a number close to zero. To overcome this we use a Householder transformation to reflect \(\mathbf{x}\) so that it is parallel to \(-\mathbf{e}_1\) instead of \(+\mathbf{e}_1\).
Fig. 6.3 The vector \(\mathbf{x}\) is reflected about the dashed line so that it is parallel to the basis vector \(-\mathbf{e}_1\).#
To do this we calculate \(\mathbf{v}\) by adding \(\| \mathbf{x} \| \mathbf{e}_1\) to \(\mathbf{x}\) when \(x_1\) is positive and subtracting it when \(x_1\) is negative, i.e., \(\mathbf{v} = \mathbf{x} + \operatorname{sign}(x_1) \|\mathbf{x}\| \mathbf{e}_1\),
where \(\operatorname{sign}(x)\) returns \(+1\) if \(x \geq 0\) and \(-1\) if \(x < 0\). To compute the QR decomposition of a matrix \(A\), we apply a Householder reflection to the first column of \(A\) so that it is parallel to \(\mathbf{e}_1\). This means that \(r_{11}\) is the only non-zero element in the first column of \(R\).
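The short sketch below (illustrative, using the vector \(\mathbf{x} = (1, 2, 2)^\mathsf{T}\) that appears in Example 6.9) computes the Householder vector with this sign choice and confirms that the reflection maps \(\mathbf{x}\) onto a multiple of \(\mathbf{e}_1\).

import numpy as np

x = np.array([1.0, 2.0, 2.0])                    # here |x| = 3
e1 = np.eye(3)[:,0]

v = x + np.sign(x[0]) * np.linalg.norm(x) * e1   # Householder vector, stable sign choice
v = v.reshape(-1, 1)
H = np.eye(3) - 2 * (v @ v.T) / (v.T @ v)

print(H @ x)                                     # [-3.  0.  0.]: parallel to e1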
We then look to do the same for the sub-matrix of \(R\) formed by omitting its first row and column. We do this by calculating an \((m-1) \times (m-1)\) Householder matrix \(H'\) using the second column of \(R\) starting from the diagonal element, i.e., \(\mathbf{x} = (r_{22}, r_{32}, \ldots, r_{m2})^\mathsf{T}\). This is used to form the \(m \times m\) Householder matrix \(H_2\) by embedding \(H'\) in the bottom-right corner of the identity matrix, so that the first row and column of \(H_2\) are those of \(I_m\).
The Householder matrix \(H_2\) is then applied to \(R\) so that the elements of the second column below the diagonal are zeroed out.
We repeat this process until column \(j = m - 1\) when \(R\) will be an upper triangular matrix. If \(A\) has \(n\) columns then
rearranging gives
The Householder matrices are symmetric and orthogonal so \(H^{-1} = H\) and since we want \(A = QR\) then
Algorithm 6.8 (QR decomposition using Householder reflections)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times m\) orthogonal matrix \(Q\) and an \(m \times n\) upper triangular matrix \(R\) such that \(A = QR\).
\(Q \gets I_m\)
\(R \gets A\)
For \(j = 1, \ldots, m - 1\) do
\(\mathbf{x} \gets\) column \(j\) of \(R\) starting at the \(j\)-th row
\(\mathbf{v} \gets \mathbf{x} + \operatorname{sign} (x_1) \|\mathbf{x}\|\mathbf{e}_1\)
\(H' \gets I_{m-j+1} - 2 \dfrac{\mathbf{v} \mathbf{v} ^\mathsf{T}}{\mathbf{v}^\mathsf{T} \mathbf{v}}\)
\(H \gets I_m\) with \(H(j:m, j:m) \gets H'\), so that the first \(j-1\) rows and columns of \(H\) are those of \(I_m\)
\(R \gets H R\)
\(Q \gets Q H\)
For \(i = 1, \ldots, n\) do
If \(R(i,i) < 0\) then
\(R(i,: ) \gets -R(i,:)\)
\(Q(:,i) \gets -Q(:,i)\)
Return \(Q\) and \(R\).
Example 6.9
Calculate the QR decomposition of the following matrix using Householder reflections
Solution
Set \(Q = I\) and \(R = A\). We want to zero out the elements below \(R(1,1)\) so \(\mathbf{x} = (1, 2, 2)^\mathsf{T}\) and \(\|\mathbf{x}\| = 3\).
The second diagonal element of \(R\) is negative, so we multiply row 2 of \(R\) and column 2 of \(Q\) by \(-1\)
We can check whether this is correct by verifying that \(A = QR\) and \(Q^\mathsf{T} Q = I\).
Code#
The code below defines a function called qr_householder() that calculates the QR decomposition of an \(m \times n\) matrix \(A\) using Householder reflections. This has been used to calculate the QR decomposition of the matrix from Example 6.9.
import numpy as np

def qr_householder(A):
    m, n = A.shape
    Q, R = np.eye(m), np.copy(A).astype(float)
    for i in range(m - 1):
        # Householder vector for the i-th column of R (from the diagonal down)
        x = R[i:,i]
        s = 1.0 if x[0] >= 0 else -1.0          # sign(x_1) with sign(0) = +1
        v = x + s * np.linalg.norm(x) * np.eye(m - i)[:,0]
        v = v.reshape(-1, 1)
        # Embed the Householder matrix in the bottom-right corner of the identity matrix
        H = np.eye(m)
        H[i:,i:] -= 2 * np.dot(v, v.T) / np.dot(v.T, v)
        R = np.dot(H, R)
        Q = np.dot(Q, H)
    # Ensure the diagonal elements of R are positive
    for i in range(n):
        if R[i,i] < 0:
            R[i,:] *= -1
            Q[:,i] *= -1
    return Q, R
function [Q, R] = qr_householder(A)

[m, n] = size(A);
Q = eye(m);
R = A;
for i = 1 : m - 1
    % Householder vector for the i-th column of R (from the diagonal down)
    x = R(i:end,i);
    if sign(x(1)) >= 0
        v = x + norm(x) * [1 ; zeros(m-i,1)];
    else
        v = x - norm(x) * [1 ; zeros(m-i,1)];
    end
    % Embed the Householder matrix in the bottom-right corner of the identity matrix
    H = eye(m);
    H(i:end,i:end) = H(i:end,i:end) - 2 * v * v' / dot(v, v);
    R = H * R;
    Q = Q * H;
end

% Ensure the diagonal elements of R are positive
for i = 1 : n
    if R(i,i) < 0
        R(i,:) = -R(i,:);
        Q(:,i) = -Q(:,i);
    end
end

end
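As with the Gram-Schmidt version, a quick check (not from the original text) is to apply the Python qr_householder() function to a small matrix and verify the defining properties.

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 1.0],
              [2.0, 0.0, 2.0]])           # an arbitrary test matrix

Q, R = qr_householder(A)

print(np.allclose(Q @ R, A))              # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))    # Q is orthogonal
print(np.allclose(R, np.triu(R)))         # R is upper triangular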
6.5.3. Loss of orthogonality#
Like any numerical technique, the QR decomposition methods shown here are prone to computational rounding errors. If \(Q\) is perfectly orthogonal then \(Q^\mathsf{T}Q = I\), but in practice there is bound to be some small error. If we let this error be \(E\) then \(Q^\mathsf{T}Q = I + E\). Rearranging gives \(E = Q^\mathsf{T}Q - I\), and we measure its size using \(\|E\|_F = \sqrt{\sum_i \sum_j e_{ij}^2}\),
which is known as the Frobenius norm and is a measure of the loss of orthogonality.
Both QR decomposition methods shown here use an iterative approach where the columns of the matrix \(A\) are orthogonalised in turn based upon the previous iterations, so an error in column \(i\) will have a knock-on effect on all subsequent columns. To analyse the methods, the QR decomposition of a random \(200 \times 200\) matrix \(A\) was computed using the Gram-Schmidt process and Householder reflections. The Frobenius norm was calculated for the first \(n\) columns of \(Q\) where \(n = 1, \ldots, 200\), giving the loss of orthogonality for those iterations, and is plotted below.
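A sketch of how this experiment could be reproduced (assuming the qr_gramschmidt() and qr_householder() functions defined above) is shown below; for each method it computes the Frobenius norm \(\|Q^\mathsf{T}Q - I\|_F\) over the first \(n\) columns of \(Q\).

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 200))

Q_gs, _ = qr_gramschmidt(A)
Q_hh, _ = qr_householder(A)

for name, Q in (("Gram-Schmidt", Q_gs), ("Householder", Q_hh)):
    # Loss of orthogonality measured over the first n columns of Q
    loss = [np.linalg.norm(Q[:,:n].T @ Q[:,:n] - np.eye(n)) for n in range(1, 201)]
    print(f"{name}: loss of orthogonality over all 200 columns = {loss[-1]:.2e}")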
Here we can see that computing QR decomposition using Householder reflections is less prone to orthogonality errors than using the Gram-Schmidt process.