6.5. QR decomposition#
QR decomposition is another decomposition method that factorises a matrix into the product of two matrices \(Q\) and \(R\) such that

\[ A = QR, \]
where \(Q\) is an orthogonal matrix and \(R\) is an upper triangular matrix.
Definition 6.2 (Orthogonal vectors)
A set of vectors \(\lbrace \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \dots \rbrace\) is said to be orthogonal if \(\mathbf{v}_i \cdot \mathbf{v}_j = 0\) for \(i \neq j\). Furthermore, the set is said to be orthonormal if, in addition, the \(\mathbf{v}_i\) are all unit vectors.
Definition 6.3 (Orthogonal matrix)
An orthogonal matrix is a matrix whose columns form a set of orthonormal vectors. Equivalently, \(A\) is an orthogonal matrix if

\[ A^\mathsf{T} A = I, \]

so that \(A^{-1} = A^\mathsf{T}\).
For example, consider the matrix

\[ A = \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. \]

This is an orthogonal matrix since

\[ A^\mathsf{T} A = \dfrac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I. \]

Furthermore, its columns are orthonormal since they are orthogonal to each other and the magnitude of each column of \(A\) is 1.
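As a quick numerical check, the minimal NumPy sketch below verifies both defining properties for this matrix:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2)

print(np.allclose(A.T @ A, np.eye(2)))  # True: A^T A = I
print(np.linalg.norm(A, axis=0))        # [1. 1.]: each column is a unit vector
```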
6.5.1. QR decomposition using the Gram-Schmidt process#
Consider the \(3 \times 3\) matrix \(A\) represented as the concatenation of the column vectors \(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\)

\[ A = (\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3), \]
where \(\mathbf{a}_j = (a_{1j}, a_{2j}, a_{3j})\). To calculate the orthogonal \(Q\) matrix we need a set of vectors \(\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}\) that span the same space as \(\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}\) but are orthogonal to each other. One way to do this is using the Gram-Schmidt process. Consider the diagram shown in Fig. 6.1 that shows the first two non-orthogonal vectors \(\mathbf{a}_1\) and \(\mathbf{a}_2\).
Fig. 6.1 The Gram-Schmidt process#
We begin by letting \(\mathbf{u}_1 = \mathbf{a}_1\) and seek to find a vector \(\mathbf{u}_2\) that is orthogonal to \(\mathbf{u}_1\) and in the same span as \(\mathbf{a}_2\). To do this we subtract the vector projection of \(\mathbf{a}_2\) onto \(\mathbf{u}_1\), i.e.,

\[ \mathbf{u}_2 = \mathbf{a}_2 - \operatorname{proj}_{\mathbf{u}_1} \mathbf{a}_2 = \mathbf{a}_2 - \dfrac{\mathbf{a}_2 \cdot \mathbf{u}_1}{\mathbf{u}_1 \cdot \mathbf{u}_1} \mathbf{u}_1. \]
We can simplify things by normalising \(\mathbf{u}_1\) such that \(\mathbf{q}_1 = \dfrac{\mathbf{u}_1}{\|\mathbf{u}_1\|}\) so we have

\[ \mathbf{u}_2 = \mathbf{a}_2 - (\mathbf{a}_2 \cdot \mathbf{q}_1) \mathbf{q}_1, \]
which we also normalise to give \(\mathbf{q}_2 = \dfrac{\mathbf{u}_2}{\|\mathbf{u}_2\|}\). For the next vector \(\mathbf{a}_3\) we want the vector \(\mathbf{u}_3\) to be orthogonal to both \(\mathbf{q}_1\) and \(\mathbf{q}_2\) so we subtract both of the vector projections of \(\mathbf{a}_3\) onto \(\mathbf{q}_1\) and \(\mathbf{q}_2\)

\[ \mathbf{u}_3 = \mathbf{a}_3 - (\mathbf{a}_3 \cdot \mathbf{q}_1) \mathbf{q}_1 - (\mathbf{a}_3 \cdot \mathbf{q}_2) \mathbf{q}_2, \]
which is normalised to give \(\mathbf{q}_3\). Continuing in this way we see that for the \(i\)-th orthogonal vector we have

\[ \mathbf{u}_i = \mathbf{a}_i - \sum_{j=1}^{i-1} (\mathbf{a}_i \cdot \mathbf{q}_j) \mathbf{q}_j. \]
The columns of the \(Q\) matrix are formed using the vectors \(\mathbf{q}_i\)

\[ Q = (\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n). \]
The \(R\) matrix is found by rearranging \(A = QR\). Since \(Q\) is an orthogonal matrix, \(Q^\mathsf{T} = Q^{-1}\) so

\[ R = Q^\mathsf{T} A. \]

Note that \(R\) is upper triangular: the \((i, j)\) element of \(Q^\mathsf{T} A\) is \(\mathbf{q}_i \cdot \mathbf{a}_j\), which is zero when \(i > j\) because \(\mathbf{a}_j\) lies in the span of \(\mathbf{q}_1, \ldots, \mathbf{q}_j\) and \(\mathbf{q}_i\) is orthogonal to each of these.
Positive diagonal elements of \(R\)#
When we compute the QR decomposition of a matrix we do not get a unique factorisation. Consider \(A = QR\), then since \(D^2 = I\)

\[ A = QR = Q D^2 R = (QD)(DR) \]
for any diagonal matrix \(D\) with entries \(\pm 1\), i.e., multiplying a column of \(Q\) by -1 and the corresponding row of \(R\) by -1 doesn’t change the product \(QR\). It has become standard convention to impose a condition that the diagonal elements of \(R\) must be positive. This is achieved by computing \(D\) where the diagonal elements are \(D_{ii} = \operatorname{sign}(R_{ii})\) and calculating \(R = DR\) and \(Q = QD\).
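The short sketch below illustrates this non-uniqueness numerically, using NumPy's built-in np.linalg.qr and an arbitrary \(2 \times 2\) test matrix chosen purely for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Q, R = np.linalg.qr(A)     # NumPy's built-in QR, used here for illustration

D = np.diag([-1.0, 1.0])   # flip the signs of column 1 of Q and row 1 of R
Q2, R2 = Q @ D, D @ R

print(np.allclose(Q @ R, Q2 @ R2))  # True: both factorisations reproduce A
```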
Algorithm 6.7 (QR decomposition using the Gram-Schmidt process)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times n\) orthogonal matrix \(Q\) and an \(n \times n\) upper triangular matrix \(R\) such that \(A = QR\).
For \(i = 1, \ldots, n\) do
\(\mathbf{u}_i \gets \mathbf{a}_i - \displaystyle\sum_{j=1}^{i-1} (\mathbf{a}_i \cdot \mathbf{q}_j) \mathbf{q}_j\)
\(\mathbf{q}_i \gets \dfrac{\mathbf{u}_i}{\| \mathbf{u}_i \|}\)
\(Q \gets (\mathbf{q}_1, \ldots, \mathbf{q}_n)\)
\(R \gets Q^{\mathsf{T}}A\)
\(D \gets I_n\)
\(D_{ii} \gets \operatorname{sign}(R_{ii})\) for \(i = 1 \ldots n\)
\(R \gets DR\)
\(Q \gets QD\)
Return \(Q\) and \(R\).
Example 6.8
Calculate the QR decomposition of the following matrix using the Gram-Schmidt process
Solution
Therefore
All diagonal elements of \(R\) are positive so we don’t need to calculate \(D\).
Code#
The code below defines a function called qr_gramschmidt()
which calculates the QR decomposition of a matrix A
using the Gram-Schmidt process.
```python
import numpy as np

def qr_gramschmidt(A):
    m, n = A.shape
    Q = np.copy(A).astype(float)
    for i in range(n):
        # Subtract the projections of column i onto the previous orthonormal columns
        for j in range(i):
            Q[:,i] -= np.dot(A[:,i], Q[:,j]) * Q[:,j]
        # Normalise to get a unit vector
        Q[:,i] /= np.linalg.norm(Q[:,i])
    R = np.dot(Q.T, A)
    # Enforce positive diagonal elements of R
    D = np.diag(np.sign(np.diag(R)))
    R = np.dot(D, R)
    Q = np.dot(Q, D)
    return Q, R
```
```matlab
function [Q, R] = qr_gramschmidt(A)
    n = size(A, 2);
    Q = A;
    for i = 1 : n
        % Subtract the projections of column i onto the previous orthonormal columns
        for j = 1 : i - 1
            Q(:,i) = Q(:,i) - dot(A(:,i), Q(:,j)) * Q(:,j);
        end
        % Normalise to get a unit vector
        Q(:,i) = Q(:,i) / norm(Q(:,i));
    end
    R = Q' * A;
    % Enforce positive diagonal elements of R
    D = diag(sign(diag(R)));
    R = D * R;
    Q = Q * D;
end
```
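As a quick check of the Python version, we can apply qr_gramschmidt() to a small test matrix (an arbitrary choice with linearly independent columns) and confirm the factorisation:

```python
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q, R = qr_gramschmidt(A)

print(np.allclose(A, Q @ R))            # True: the product QR recovers A
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: the columns of Q are orthonormal
print(np.diag(R))                       # the diagonal elements of R are positive
```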
6.5.2. QR decomposition using Householder reflections#
Another method we can use to calculate the QR decomposition of a matrix is by the use of Householder reflections.
Definition 6.4 (Householder matrix)
The Householder matrix is an orthogonal reflection matrix of the form

\[ H = I - 2 \dfrac{\mathbf{v} \mathbf{v}^\mathsf{T}}{\mathbf{v} \cdot \mathbf{v}}, \]
where \(\mathbf{v}\) is a non-zero vector. The matrix \(H\) is symmetric and orthogonal.
Fig. 6.2 The vector \(\mathbf{x}\) is reflected about the dashed line so that it is parallel to the basis vector \(\mathbf{e}_1\).#
Consider the diagram in Fig. 6.2 where the vector \(\mathbf{x}\) is reflected about the hyperplane represented by the dashed line so that it is parallel to the basis vector \(\mathbf{e}_1 = (1, 0, \ldots, 0)^\mathsf{T}\). The Householder matrix that achieves this reflection is based on the vector \(\mathbf{v}\) which is orthogonal to the dashed line and calculated using

\[ \mathbf{v} = \mathbf{x} + \|\mathbf{x}\| \mathbf{e}_1. \]
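The short sketch below (the vector \(\mathbf{x}\) is an arbitrary choice for illustration) builds \(H\) from this formula and confirms that \(H\mathbf{x}\) is parallel to \(\mathbf{e}_1\):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])            # an arbitrary vector for illustration
e1 = np.array([1.0, 0.0, 0.0])

v = x + np.linalg.norm(x) * e1           # the Householder vector
H = np.eye(3) - 2 * np.outer(v, v) / np.dot(v, v)

print(H @ x)                             # [-3.  0.  0.]: parallel to e1
print(np.allclose(H @ H, np.eye(3)))     # True: H is symmetric and orthogonal
```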
To compute the QR decomposition of a matrix \(A\), we use a Householder reflection to reflect the first column of \(A\) so that it is parallel to \(\mathbf{e}_1\). This means that the only non-zero element of the first column is in the first row.
We then look to do the same for the sub-matrix of \(R\) formed by omitting the first row and column. We do this by calculating an \((m-1) \times (m-1)\) Householder matrix \(H'\) using \(\mathbf{x} = (a_{22}, a_{32}, \ldots, a_{m2})^\mathsf{T}\). This is used to form the \(m \times m\) Householder matrix \(H_2\) by padding out the first row and column of the identity matrix, i.e.,

\[ H_2 = \begin{pmatrix} 1 & \mathbf{0}^\mathsf{T} \\ \mathbf{0} & H' \end{pmatrix}. \]
The Householder matrix \(H_2\) is then applied to \(R\) so that all elements of the second column below the diagonal are zero.
We repeat this process for all columns in \(R\) which results in an upper triangular matrix. If \(A\) has \(n\) columns then

\[ H_n \cdots H_2 H_1 A = R, \]
rearranging gives

\[ A = H_1^{-1} H_2^{-1} \cdots H_n^{-1} R. \]
The Householder matrices are symmetric and orthogonal so \(H^{-1} = H\) and since we want \(A = QR\) then

\[ Q = H_1 H_2 \cdots H_n. \]
Algorithm 6.8 (QR decomposition using Householder reflections)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times m\) orthogonal matrix \(Q\) and an \(m \times n\) upper triangular matrix \(R\) such that \(A = QR\).
\(Q \gets I_m\)
\(R \gets A\)
For \(i = 1, \ldots, n\) do
\(\mathbf{x} \gets\) column \(i\) of \(R\) starting at the \(i\)-th row
\(\mathbf{v} \gets \mathbf{x} + \|\mathbf{x}\|\mathbf{e}_1\)
\(H \gets I_m\), then update the lower-right block \(H_{i:m,\, i:m} \gets I_{m-i+1} - 2 \dfrac{\mathbf{v} \mathbf{v}^\mathsf{T}}{\mathbf{v} \cdot \mathbf{v}}\)
\(R \gets H R\)
\(Q \gets Q H\)
\(D \gets I_{m}\)
\(D_{ii} \gets \operatorname{sign}(R_{ii})\) for \(i = 1 \ldots n\)
\(R \gets DR\)
\(Q \gets QD\)
Return \(Q\) and \(R\).
Example 6.9
Calculate the QR decomposition of the following matrix using Householder reflections
Solution
Set \(Q = I\) and \(R = A\). We want to zero out the elements below \(R_{11}\) so \(\mathbf{x} = (1, 2, 2)^\mathsf{T}\) and \(\|\mathbf{x}\| = 3\).
We want to zero out the elements below \(R_{22}\) so \(\mathbf{x} = (4, 3)^\mathsf{T}\) and \(\| \mathbf{x} \| = 5\).
To make the diagonal elements in \(R\) positive we multiply \(R\) on the left by \(D = \operatorname{diag}(-1, -1, 1)\) and \(Q\) on the right by \(D\).
Code#
The code below defines a function called qr_householder()
that calculates the QR decomposition of an \(m \times n\) matrix \(A\) using Householder reflections. This has been used to calculate the QR decomposition of the matrix from Example 6.9.
```python
import numpy as np

def qr_householder(A):
    m, n = A.shape
    Q, R = np.eye(m), np.copy(A).astype(float)
    for i in range(n):
        # Reflect column i (from row i down) onto a multiple of e1;
        # np.sign(x[0]) picks the reflection that avoids cancellation in v
        x = R[i:,i]
        v = x + np.sign(x[0]) * np.linalg.norm(x) * np.eye(m - i)[:,0]
        H = np.eye(m)
        H[i:,i:] -= 2 * np.outer(v, v) / np.dot(v, v)
        R = np.dot(H, R)
        Q = np.dot(Q, H)
    # Enforce positive diagonal elements of R
    D = np.eye(m)
    D[:n,:n] = np.diag(np.sign(np.diag(R)))
    R = np.dot(D, R)
    Q = np.dot(Q, D)
    return Q, R
```
```matlab
function [Q, R] = qr_householder(A)
    [m, n] = size(A);
    Q = eye(m);
    R = A;
    for i = 1 : n
        % Reflect column i (from row i down) onto a multiple of e1;
        % sign(x(1)) picks the reflection that avoids cancellation in v
        x = R(i:end,i);
        v = x + sign(x(1)) * norm(x) * [1 ; zeros(m-i,1)];
        H = eye(m);
        H(i:end,i:end) = H(i:end,i:end) - 2 * (v * v') / dot(v, v);
        R = H * R;
        Q = Q * H;
    end
    % Enforce positive diagonal elements of R
    D = eye(m);
    D(1:n,1:n) = diag(sign(diag(R)));
    R = D * R;
    Q = Q * D;
end
```
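As with the Gram-Schmidt version, we can sanity check the Python function on a small test matrix (an arbitrary choice, not the matrix from Example 6.9):

```python
A = np.array([[1.0, 2.0, 4.0],
              [2.0, 0.0, 3.0],
              [2.0, 1.0, 1.0]])
Q, R = qr_householder(A)

print(np.allclose(A, Q @ R))            # True: the product QR recovers A
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: Q is orthogonal
print(np.all(np.diag(R) > 0))           # True: the diagonal of R is positive
```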
6.5.3. Loss of orthogonality#
Like any numerical technique, the QR decomposition methods shown here are prone to computational rounding errors. If \(Q\) were perfectly orthogonal then \(Q^\mathsf{T}Q = I\), but in practice there is bound to be some small error. If we let this error be \(E\) then \(Q^\mathsf{T}Q = I + E\). Rearranging gives \(E = Q^\mathsf{T}Q - I\), and we measure its size using

\[ \|E\|_F = \|Q^\mathsf{T}Q - I\|_F = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} |E_{ij}|^2}, \]

which is known as the Frobenius norm and is a measure of the loss of orthogonality.
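In NumPy this is straightforward to compute; the sketch below defines a helper (the name loss_of_orthogonality() is ours, not a library function) that measures this quantity for a given \(Q\):

```python
import numpy as np

def loss_of_orthogonality(Q):
    # Frobenius norm of E = Q^T Q - I (hypothetical helper, named here for convenience)
    n = Q.shape[1]
    return np.linalg.norm(Q.T @ Q - np.eye(n), "fro")
```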
Both of the QR decomposition methods shown here use an iterative approach where we orthogonalise the columns of the \(A\) matrix in turn based upon the previous iterations, so an error in column \(i\) will have a knock-on effect on all subsequent columns. To analyse the methods, the QR decomposition of a random \(100 \times 100\) matrix \(A\) was computed using the Gram-Schmidt process and Householder reflections. The Frobenius norm was calculated for the first \(n\) columns of \(Q\), where \(n = 1, \ldots, 100\), giving the loss of orthogonality for those iterations; this is plotted below.

Here we can see that computing the QR decomposition using Householder reflections is less prone to loss of orthogonality than using the Gram-Schmidt process.
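A minimal sketch of this experiment is given below; it assumes the qr_gramschmidt() and qr_householder() functions defined earlier and the loss_of_orthogonality() helper from above, and prints the loss of orthogonality at a few values of \(n\) rather than reproducing the full plot:

```python
import numpy as np

rng = np.random.default_rng(1)          # seed chosen arbitrarily for repeatability
A = rng.random((100, 100))

Q_gs, _ = qr_gramschmidt(A)
Q_hh, _ = qr_householder(A)

# Loss of orthogonality over the first n columns of each Q
for n in (10, 50, 100):
    gs = loss_of_orthogonality(Q_gs[:, :n])
    hh = loss_of_orthogonality(Q_hh[:, :n])
    print(f"n = {n:3d}: Gram-Schmidt {gs:.2e}, Householder {hh:.2e}")
```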