6.5. QR decomposition#
QR decomposition is another decomposition method that factorises a matrix into the product of two matrices \(Q\) and \(R\) such that
\[ A = QR, \]
where \(Q\) is an orthogonal matrix and \(R\) is an upper triangular matrix.
(Orthogonal vectors)
A set of vectors \(\lbrace \vec{v}_1, \vec{v}_2, \vec{v}_3, \dots \rbrace\) is said to be orthogonal if \(\vec{v}_i \cdot \vec{v}_j = 0\) for \(i \neq j\). Furthermore, the set is said to be orthonormal if the \(\vec{v}_i\) are all unit vectors.
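For example, the vectors \(\vec{v}_1 = (1, 1, 0)^\mathrm{T}\), \(\vec{v}_2 = (1, -1, 0)^\mathrm{T}\) and \(\vec{v}_3 = (0, 0, 1)^\mathrm{T}\) form an orthogonal set, since each pairwise dot product is zero; dividing each vector by its magnitude gives an orthonormal set.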
(Orthogonal matrix)
An orthogonal matrix is a matrix whose columns form a set of orthonormal vectors. A matrix \(A\) is orthogonal if
\[ A^\mathrm{T} A = I. \]
For example, consider the matrix
\[ A = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \]
This is an orthogonal matrix since the dot product of its two columns is
\[ \frac{1}{\sqrt{2}} \left( -\frac{1}{\sqrt{2}} \right) + \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = 0. \]
Furthermore, it is an orthonormal matrix since the magnitudes of the columns of \(A\) are both 1.
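As a quick numerical check (a minimal sketch of our own, assuming NumPy), we can verify that \(A^\mathrm{T} A = I\) holds for this matrix:

import numpy as np

# An orthogonal matrix: its columns are orthonormal
A = np.array([[1, -1],
              [1, 1]]) / np.sqrt(2)

# For an orthogonal matrix, A^T A is the identity
print(np.allclose(A.T @ A, np.eye(2)))   # True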
6.5.1. QR decomposition using the Gram-Schmidt process#
Consider the \(3 \times 3\) matrix \(A\) represented as the concatenation of the column vectors \(\vec{a}_1, \vec{a}_2, \vec{a}_3\)
\[ A = (\vec{a}_1, \vec{a}_2, \vec{a}_3), \]
where \(\vec{a}_i = (a_{1i}, a_{2i}, a_{3i})^\mathrm{T}\). To calculate the orthogonal matrix \(Q\) we need a set of vectors \(\{\vec{q}_1, \vec{q}_2, \vec{q}_3\}\) that span the same space as \(\{\vec{a}_1, \vec{a}_2, \vec{a}_3\}\) but are orthogonal to each other. One way to do this is using the Gram-Schmidt process. Consider the diagram shown in Fig. 6.1, which shows the first two non-orthogonal vectors \(\vec{a}_1\) and \(\vec{a}_2\).
We begin by letting \(\vec{u}_1 = \vec{a}_1\) and seek a vector \(\vec{u}_2\) that is orthogonal to \(\vec{u}_1\) and in the same span as \(\vec{a}_2\). To do this we subtract the vector projection of \(\vec{a}_2\) onto \(\vec{u}_1\), i.e.,
\[ \vec{u}_2 = \vec{a}_2 - \frac{\vec{a}_2 \cdot \vec{u}_1}{\vec{u}_1 \cdot \vec{u}_1} \vec{u}_1. \]
We can simplify things by normalising \(\vec{u}_1\) such that \(\vec{q}_1 = \vec{u}_1 / \|\vec{u}_1\|\), so we have
\[ \vec{u}_2 = \vec{a}_2 - (\vec{a}_2 \cdot \vec{q}_1) \vec{q}_1, \]
which is also normalised to give \(\vec{q}_2 = \vec{u}_2 / \|\vec{u}_2\|\). For the next vector \(\vec{a}_3\) we want the vector \(\vec{u}_3\) to be orthogonal to both \(\vec{q}_1\) and \(\vec{q}_2\), so we subtract both of the vector projections of \(\vec{a}_3\) onto \(\vec{q}_1\) and \(\vec{q}_2\):
\[ \vec{u}_3 = \vec{a}_3 - (\vec{a}_3 \cdot \vec{q}_1) \vec{q}_1 - (\vec{a}_3 \cdot \vec{q}_2) \vec{q}_2, \]
which is normalised to give \(\vec{q}_3\). Rearranging the expressions for \(\vec{u}_1\), \(\vec{u}_2\) and \(\vec{u}_3\) gives
\[ \begin{aligned} \vec{a}_1 &= \vec{u}_1, \\ \vec{a}_2 &= (\vec{a}_2 \cdot \vec{q}_1) \vec{q}_1 + \vec{u}_2, \\ \vec{a}_3 &= (\vec{a}_3 \cdot \vec{q}_1) \vec{q}_1 + (\vec{a}_3 \cdot \vec{q}_2) \vec{q}_2 + \vec{u}_3, \end{aligned} \]
and since \(\vec{u}_i = (\vec{a}_i \cdot \vec{q}_i) \vec{q}_i\) (the projection of \(\vec{u}_i\) onto a unit vector pointing in the same direction as \(\vec{u}_i\)) then
\[ \begin{aligned} \vec{a}_1 &= (\vec{a}_1 \cdot \vec{q}_1) \vec{q}_1, \\ \vec{a}_2 &= (\vec{a}_2 \cdot \vec{q}_1) \vec{q}_1 + (\vec{a}_2 \cdot \vec{q}_2) \vec{q}_2, \\ \vec{a}_3 &= (\vec{a}_3 \cdot \vec{q}_1) \vec{q}_1 + (\vec{a}_3 \cdot \vec{q}_2) \vec{q}_2 + (\vec{a}_3 \cdot \vec{q}_3) \vec{q}_3. \end{aligned} \]
The matrix \(A\) can therefore be written as the product of two matrices
\[ A = (\vec{q}_1, \vec{q}_2, \vec{q}_3) \begin{pmatrix} \vec{a}_1 \cdot \vec{q}_1 & \vec{a}_2 \cdot \vec{q}_1 & \vec{a}_3 \cdot \vec{q}_1 \\ 0 & \vec{a}_2 \cdot \vec{q}_2 & \vec{a}_3 \cdot \vec{q}_2 \\ 0 & 0 & \vec{a}_3 \cdot \vec{q}_3 \end{pmatrix}, \]
where the matrix on the left is an orthogonal matrix and the matrix on the right is an upper triangular matrix. Letting \(Q\) denote the orthogonal matrix and \(R\) denote the upper triangular matrix, we have \(A = QR\).
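As a small illustration of this factorisation (a worked example of our own, not the one used later in this section), consider
\[ A = \begin{pmatrix} 3 & 1 \\ 4 & 3 \end{pmatrix}. \]
Then \(\vec{u}_1 = \vec{a}_1 = (3, 4)^\mathrm{T}\) so \(\vec{q}_1 = \frac{1}{5}(3, 4)^\mathrm{T}\), and \(\vec{a}_2 \cdot \vec{q}_1 = 3\) gives \(\vec{u}_2 = \vec{a}_2 - 3 \vec{q}_1 = (-\frac{4}{5}, \frac{3}{5})^\mathrm{T}\), which is already a unit vector so \(\vec{q}_2 = \vec{u}_2\). Hence
\[ A = \begin{pmatrix} \frac{3}{5} & -\frac{4}{5} \\ \frac{4}{5} & \frac{3}{5} \end{pmatrix} \begin{pmatrix} 5 & 3 \\ 0 & 1 \end{pmatrix} = QR. \]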
(QR decomposition using the Gram-Schmidt process)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times n\) orthogonal matrix \(Q\) and an \(n \times n\) upper triangular matrix \(R\) such that \(A = QR\).
\(R \gets \vec{0}_{n\times n}\)
For \(j = 1, \ldots, n\) do
For \(i = 1, \ldots, j - 1\) do
\(r_{ij} \gets \vec{q}_i \cdot \vec{a}_j\)
\(\vec{u}_j \gets \vec{a}_j - \displaystyle\sum_{i=1}^{j-1} r_{ij} \vec{q}_i\)
\(r_{jj} \gets \|\vec{u}_j\|\)
\(\vec{q}_j \gets \dfrac{\vec{u}_j}{r_{jj}}\)
Return \(Q = (\vec{q}_1, \vec{q}_2, \ldots , \vec{q}_n)\) and \(R\).
Calculate the QR decomposition of the following matrix using the Gram-Schmidt process
Solution
Column \(j = 1\):
Column \(j = 2\):
Column \(j = 3\):
Therefore
Code#
The code below defines a function called qr_gramschmidt() which calculates the QR decomposition of a matrix A using the Gram-Schmidt process.
import numpy as np

def qr_gramschmidt(A):
    # Calculate the QR decomposition of the m x n matrix A using the
    # Gram-Schmidt process
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for j in range(n):
        # Subtract the projections of a_j onto the previous q vectors
        sum_ = np.zeros(m)
        for i in range(j):
            R[i,j] = np.dot(Q[:,i], A[:,j])
            sum_ += R[i,j] * Q[:,i]
        u = A[:,j] - sum_
        # Normalise u to give the next column of Q
        R[j,j] = np.linalg.norm(u)
        Q[:,j] = u / R[j,j]
    return Q, R
The NumPy command np.linalg.norm() calculates the magnitude of a vector.
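As a quick sanity check (an illustrative snippet of our own; the example matrix is not from the text), we can verify that the computed factors reproduce \(A\) and that \(Q\) is orthogonal:

A = np.array([[3.0, 1.0],
              [4.0, 3.0]])
Q, R = qr_gramschmidt(A)

print(np.allclose(Q @ R, A))             # True: Q R reproduces A
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: Q is orthogonal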
function [Q, R] = qr_gramschmidt(A)
    % Calculate the QR decomposition of the m x n matrix A using the
    % Gram-Schmidt process
    n = size(A, 2);
    Q = zeros(size(A));
    R = zeros(n);
    for j = 1 : n
        % Subtract the projections of a_j onto the previous q vectors
        sum_ = 0;
        for i = 1 : j - 1
            R(i,j) = dot(Q(:,i), A(:,j));
            sum_ = sum_ + R(i,j) * Q(:,i);
        end
        u = A(:,j) - sum_;
        % Normalise u to give the next column of Q
        R(j,j) = norm(u);
        Q(:,j) = u / R(j,j);
    end
end
6.5.2. QR decomposition using Householder transformations#
Another method for calculating the QR decomposition of a matrix is to use Householder transformations. Consider the diagram in Fig. 6.2, where the vector \(\vec{x}\) is reflected about the hyperplane represented by the dashed line so that it is parallel to the basis vector \(\vec{e}_1 = (1, 0, \ldots, 0)^\mathrm{T}\).
The Householder matrix is an orthogonal matrix that performs the Householder transformation reflection
\[ H \vec{x} = \|\vec{x}\| \vec{e}_1. \]
(Householder matrix)
\[ H = I - 2 \vec{v} \vec{v}^\mathrm{T}, \tag{6.3} \]
where
\[ \vec{v} = \frac{\vec{u}}{\|\vec{u}\|}, \qquad \vec{u} = \vec{x} - \|\vec{x}\| \vec{e}_1. \]
If \(\vec{x}\) is nearly parallel to \(\vec{e}_1\) then the denominator in equation (6.3) is close to zero and computational rounding errors can occur. However, we can choose to reflect \(\vec{x}\) so that it is parallel to \(-\vec{e}_1\) instead.
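To see this numerically (a small sketch of our own, not from the text), compare the two reflection choices for a vector nearly parallel to \(\vec{e}_1\):

import numpy as np

x = np.array([1.0, 1e-8])    # nearly parallel to e1
e1 = np.array([1.0, 0.0])

u_minus = x - np.linalg.norm(x) * e1   # reflect onto +||x||e1: severe cancellation
u_plus = x + np.linalg.norm(x) * e1    # reflect onto -||x||e1: no cancellation

print(np.linalg.norm(u_minus))   # 1e-08, so dividing by it amplifies rounding errors
print(np.linalg.norm(u_plus))    # approximately 2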
In Fig. 6.3 above, \(\vec{x}\) is instead transformed to \(-\|\vec{x}\|\vec{e}_1\), in which case
\[ \vec{u} = \vec{x} + \|\vec{x}\| \vec{e}_1, \]
so in order to always reflect in the direction that gives the largest value of \(\|\vec{u}\|\) we use
\[ \vec{u} = \vec{x} + \operatorname{sign}(x_1) \|\vec{x}\| \vec{e}_1, \]
where
\[ \operatorname{sign}(x) = \begin{cases} 1, & x \geq 0, \\ -1, & x < 0. \end{cases} \]
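As a concrete check (a worked example of our own), take \(\vec{x} = (3, 4)^\mathrm{T}\), so \(\|\vec{x}\| = 5\) and \(\operatorname{sign}(x_1) = 1\). Then
\[ \vec{u} = \begin{pmatrix} 3 \\ 4 \end{pmatrix} + 5 \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 8 \\ 4 \end{pmatrix}, \qquad \vec{v} = \frac{1}{\sqrt{80}} \begin{pmatrix} 8 \\ 4 \end{pmatrix}, \]
and
\[ H = I - 2 \vec{v} \vec{v}^\mathrm{T} = \begin{pmatrix} -0.6 & -0.8 \\ -0.8 & 0.6 \end{pmatrix}, \qquad H \vec{x} = \begin{pmatrix} -5 \\ 0 \end{pmatrix} = -\|\vec{x}\| \vec{e}_1, \]
as expected.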
To calculate the QR decomposition of an \(m \times n\) matrix \(A\) we let \(Q = I_m\) and \(R = A = (\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n)\) and use Householder transformations to transform each of the column vectors \(\vec{r}_j\) so that it is parallel to the basis vector \(\vec{e}_j\). To do this we first calculate \(\vec{v}\) for the first column using
\[ \vec{u} = \vec{r}_1 + \operatorname{sign}(r_{11}) \|\vec{r}_1\| \vec{e}_1, \qquad \vec{v} = \frac{\vec{u}}{\|\vec{u}\|}, \]
and then we calculate the Householder matrix \(H\) using equation (6.3)
\[ H = I_m - 2 \vec{v} \vec{v}^\mathrm{T}, \]
and apply the Householder transformation to \(Q\) and \(R\):
\[ Q \gets QH, \qquad R \gets HR. \]
The first column of \(R\), \(\vec{r}_1\), is now parallel to \(\vec{e}_1\). We now need to transform the second column of \(R\), \(\vec{r}_2\), so that it is parallel to the second basis vector \(\vec{e}_2\), without changing the first row and column of \(R\). So we set the first element of \(\vec{r}_2\) to zero before calculating
\[ \vec{u} = \vec{r}_2 + \operatorname{sign}(r_{22}) \|\vec{r}_2\| \vec{e}_2, \qquad \vec{v} = \frac{\vec{u}}{\|\vec{u}\|}, \]
and the corresponding Householder matrix \(H\).
We repeat this process for each of the remaining columns of \(R\), at which point \(R\) is an upper triangular matrix. The matrix \(Q\) is the product of all of the individual Householder matrices, so by definition it is an orthogonal matrix.
(QR decomposition using Householder transformations)
Inputs: An \(m \times n\) matrix \(A\).
Outputs: An \(m \times m\) orthogonal matrix \(Q\) and an \(m \times n\) upper triangular matrix \(R\) such that \(A = QR\).
\(Q \gets I_m\)
\(R \gets A\)
For \(j = 1, \ldots, n\) do
\(\vec{u} \gets\) column \(j\) of \(R\) with the first \(j-1\) elements set to zero
\(\vec{u} \gets \vec{u} + \operatorname{sign}(r_{jj})\|\vec{u}\|\vec{e}_j\)
\(\vec{v} \gets \dfrac{\vec{u}}{\| \vec{u} \|}\)
\(H \gets I_m - 2 \vec{v} \vec{v} ^\mathrm{T}\)
\(Q \gets Q H\)
\(R \gets H R\)
Calculate the QR decomposition of the following matrix using Householder transformations
Solution
Let \(Q = I_3\) and \(R = A\) and calculate the Householder matrix for the first column of \(R\).
Column \(j = 1\): calculate the Householder matrix
Apply Householder transformations to \(Q\) and \(R\)
Column \(j = 2\): calculate the Householder matrix
Applying the Householder transformations to \(Q\) and \(R\)
Code#
The code below defines a function called qr_householder() that calculates the QR decomposition of an \(m \times n\) matrix \(A\) using Householder transformations. This has been used to calculate the QR decomposition of the matrix from Example 6.9.
import numpy as np

def qr_householder(A):
    # Calculate the QR decomposition of the m x n matrix A using
    # Householder transformations
    m, n = A.shape
    I = np.eye(m)
    Q, R = np.eye(m), A.astype(float)
    for j in range(n):
        # Column j of R with the first j - 1 elements set to zero
        u = R[:,[j]]
        u[:j] = 0
        # Reflect away from e_j to avoid cancellation (sign(0) is taken
        # as 1 here, whereas np.sign(0) returns 0)
        s = 1.0 if R[j,j] >= 0 else -1.0
        u = u + s * np.linalg.norm(u) * I[:,[j]]
        v = u / np.linalg.norm(u)
        # Householder matrix and transformation
        H = I - 2 * np.dot(v, v.T)
        R = np.dot(H, R)
        Q = np.dot(Q, H)
    return Q, R
The NumPy command np.matmul(A, B) calculates the matrix multiplication of two matrices A and B.
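As before (an illustrative snippet of our own; the test matrix is not from the text), we can check the output of qr_householder():

A = np.array([[3.0, 1.0],
              [4.0, 3.0]])
Q, R = qr_householder(A)

print(np.allclose(Q @ R, A))             # True: Q R reproduces A
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: Q is orthogonal
print(np.allclose(R, np.triu(R)))        # True: R is upper triangular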
function [Q, R] = qr_householder(A)
    % Calculate the QR decomposition of the m x n matrix A using
    % Householder transformations
    [m, n] = size(A);
    I = eye(m);
    Q = I;
    R = A;
    for j = 1 : n
        % Column j of R with the first j - 1 elements set to zero
        u = R(:,j);
        u(1:j-1) = 0;
        % Reflect away from e_j to avoid cancellation (note that
        % sign(0) = 0 in MATLAB, so a zero pivot is assumed not to occur)
        u = u + sign(R(j,j)) * norm(u) * I(:,j);
        v = u / norm(u);
        % Householder matrix and transformation
        H = I - 2 * (v * v');
        Q = Q * H;
        R = H * R;
    end
end