6.4. Cholesky decomposition#

Cholesky decomposition is an efficient matrix decomposition method that can be used when a square matrix is positive definite.

Definition 6.1 (Positive definite matrix)

A square matrix \(A\) is said to be positive definite if

\[ \vec{x}^\mathrm{T} A \vec{x} > 0, \]

for all \(\vec{x} \in \mathbb{R}^n \backslash \{\vec{0}\}\).
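The quadratic form \(\vec{x}^\mathrm{T} A \vec{x}\) can be evaluated numerically; the sketch below checks it for a few random nonzero vectors using the matrix from Example 6.5 below (an illustration only, not a proof of positive definiteness):

```python
import numpy as np

# Matrix from Example 6.5 (shown below to be positive definite)
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

# Evaluate x^T A x for a handful of random nonzero vectors
rng = np.random.default_rng(1)
quadratic_forms = [x @ A @ x for x in rng.standard_normal((5, 3))]
print(all(q > 0 for q in quadratic_forms))  # True
```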

Theorem 6.3 (Determinant test for a positive definite matrix)

A square matrix \(A\) is positive definite if it is symmetric and the determinants of all of its \(k \times k\) upper-left sub-matrices are positive.

Example 6.5

Use the determinant test to show that the following matrix is positive definite

\[\begin{split} \begin{align*} A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}. \end{align*} \end{split}\]
Solution (click to show)

Matrix \(A\) is symmetric since \(A=A^\mathrm{T}\). Checking the determinants of the upper-left sub-matrices

\[\begin{split} \begin{align*} \det(2) &= 2 > 0, \\ \det\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} &= 4 - 1 = 3 > 0, \\ \det\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} &= 6 - 2 + 0 = 4 > 0 . \end{align*} \end{split}\]

Since \(A\) is symmetric and the determinants of all of the upper-left sub-matrices are positive, \(A\) is a positive definite matrix.
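The determinant test is easy to automate; this sketch computes the three upper-left minors with NumPy:

```python
import numpy as np

A = np.array([[2, -1, 0],
              [-1, 2, -1],
              [0, -1, 2]])

# A must be symmetric
print(np.array_equal(A, A.T))  # True

# Determinants of the 1x1, 2x2 and 3x3 upper-left sub-matrices
minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
print(minors)  # approximately [2.0, 3.0, 4.0] -- all positive
```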

Given a positive definite matrix \(A\), the Cholesky decomposition factorises \(A\) into the product of a lower triangular matrix \(L\) and its transpose \(L^\mathrm{T}\), i.e.,

\[ \begin{align*} A = LL^\mathrm{T}. \end{align*} \]

Consider the Cholesky decomposition of a \(3\times 3\) matrix

\[\begin{split} \begin{align*} \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} &= \begin{pmatrix} \ell_{11} & 0 & 0 \\ \ell_{21} & \ell_{22} & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} \end{pmatrix} \begin{pmatrix} \ell_{11} & \ell_{21} & \ell_{31} \\ 0 & \ell_{22} & \ell_{32} \\ 0 & 0 & \ell_{33} \end{pmatrix}\\ &= \begin{pmatrix} \ell_{11}^2 & \ell_{11} \ell_{21} & \ell_{11} \ell_{31} \\ \ell_{11} \ell_{21} & \ell_{21}^2 +\ell_{22}^2 & \ell_{21} \ell_{31} +\ell_{22} \ell_{32} \\ \ell_{11} \ell_{31} & \ell_{21} \ell_{31} +\ell_{22} \ell_{32} & \ell_{31}^2 +\ell_{32}^2 +\ell_{33}^2 \end{pmatrix}. \end{align*} \end{split}\]

The elements on the main diagonal are

\[ \begin{align*} a_{jj} =\ell_{jj}^2 +\sum_{k=1}^{j-1} \ell_{jk}^2 , \end{align*} \]

and the elements below the main diagonal (\(i > j\)) are

\[ \begin{align*} a_{ij} =\sum_{k=1}^j \ell_{ik} \ell_{jk} = \ell_{jj} \ell_{ij} +\sum_{k=1}^{j-1} \ell_{ik} \ell_{jk}. \end{align*} \]

Rearranging these expressions gives the following theorem

Theorem 6.4 (Cholesky decomposition)

The Cholesky decomposition of an \(n \times n\) positive definite matrix \(A\) results in an \(n \times n\) matrix \(L\) such that \(A = LL^\mathrm{T}\). The elements of \(L\) are calculated using

(6.2)#\[\begin{split} \begin{align} \ell_{jj} &= \sqrt{a_{jj} - \sum_{k=1}^{j-1} \ell_{jk}^2 }, & j &= 1, \ldots, n,\\ \ell_{ij} &= \dfrac{1}{\ell_{jj} }\left(a_{ij} -\displaystyle \sum_{k=1}^{j-1} \ell_{ik} \ell_{jk} \right), & i &= j + 1,\ldots ,n. \end{align} \end{split}\]

Algorithm 6.5 (Cholesky decomposition)

Inputs: An \(n \times n\) positive definite matrix \(A\).

Outputs: An \(n \times n\) lower triangular matrix \(L\) such that \(A = LL^\mathrm{T}\).

  • \(L \gets \vec{0}_{n \times n}\)

  • For \(j = 1, \ldots, n\) do

    • \(\ell_{jj} \gets \sqrt{a_{jj} - \displaystyle\sum_{k=1}^{j-1} \ell_{jk}^2}\)

    • For \(i = j + 1, \ldots, n\) do

      • \(\ell_{ij} \gets \dfrac{1}{\ell_{jj} }\left(a_{ij} -\displaystyle \sum_{k=1}^{j-1} \ell_{ik} \ell_{jk} \right)\)

  • Return \(L\)

Example 6.6

Calculate the Cholesky decomposition of the following matrix

\[\begin{split} \begin{align*} A = \begin{pmatrix} 4 & -2 & -4\\ -2 & 10 & 5\\ -4 & 5 & 14 \end{pmatrix}. \end{align*} \end{split}\]
Solution (click to show)

Stepping through the columns of \(A\) and using equation (6.2)

\[\begin{split} \begin{align*} j &= 1: & \ell_{11} &= \sqrt{a_{11}} = \sqrt{4} = 2,\\ && \ell_{21} &= \frac{1}{\ell_{11}}(a_{21}) = \frac{1}{2}(-2) = -1,\\ && \ell_{31} &= \frac{1}{\ell_{11}}(a_{31}) = \frac{1}{2}(-4) = -2,\\ \\ j &= 2: & \ell_{22} &= \sqrt{a_{22} - \ell_{21}^2} = \sqrt{10 - (-1)^2} = \sqrt{9} = 3,\\ && \ell_{32} &= \frac{1}{\ell_{22}}(a_{32} - \ell_{31} \ell_{21}) = \frac{1}{3}(5 - (-2)(-1)) = 1,\\ \\ j &= 3: & \ell_{33} &= \sqrt{a_{33} - \ell_{31}^2 - \ell_{32}^2} = \sqrt{14 - (-2)^2 - 1^2} = \sqrt{9} = 3, \end{align*} \end{split}\]

therefore

\(L=\begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 0 \\ -2 & 1 & 3 \end{pmatrix}\). Checking that \(A = LL^\mathrm{T}\)

\[\begin{split} \begin{align*} \begin{pmatrix} 2 & 0 & 0\\ -1 & 3 & 0\\ -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} 2 & -1 & -2\\ 0 & 3 & 1\\ 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 4 & -2 & -4\\ -2 & 10 & 5\\ -4 & 5 & 14 \end{pmatrix}. \end{align*} \end{split}\]
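The hand calculation can also be checked against NumPy's built-in routine; `np.linalg.cholesky` returns the same lower triangular factor:

```python
import numpy as np

A = np.array([[4.0, -2.0, -4.0],
              [-2.0, 10.0, 5.0],
              [-4.0, 5.0, 14.0]])

# NumPy returns the lower triangular factor L
L = np.linalg.cholesky(A)
print(L)

# L L^T reproduces A
print(np.allclose(L @ L.T, A))  # True
```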

6.4.1. Code#

The code below defines a function called cholesky() that calculates the Cholesky decomposition of a positive definite matrix. This is then used to calculate the Cholesky decomposition of the matrix from Example 6.6.

import numpy as np

def cholesky(A):
    n = A.shape[0]
    L = np.zeros((n, n))
    for j in range(n):
        for i in range(j, n):
            # Sum the products of the previously computed columns of L
            sum_ = 0
            for k in range(j):
                sum_ += L[i,k] * L[j,k]

            if i == j:
                L[i,j] = np.sqrt(A[i,j] - sum_)
            else:
                L[i,j] = (A[i,j] - sum_) / L[j,j]

    return L
function L = cholesky(A)

n = size(A, 1);
L = zeros(n);
for j = 1 : n
    for i = j : n
        % Sum the products of the previously computed columns of L
        sum_ = 0;
        for k = 1 : j - 1
            sum_ = sum_ + L(i,k) * L(j,k);
        end
        if i == j
            L(i,j) = sqrt(A(i,j) - sum_);
        else
            L(i,j) = (A(i,j) - sum_) / L(j,j);
        end
    end
end

end

6.4.2. Solving systems of linear equations using Cholesky decomposition#

Given a system of linear equations of the form \(A \vec{x} = \vec{b}\) where the coefficient matrix \(A\) is positive definite, applying the Cholesky decomposition \(A = LL^\mathrm{T}\) gives

\[ LL^\mathrm{T} \vec{x} = \vec{b}.\]

Let \(\vec{y} = L^\mathrm{T} \vec{x}\) then

\[ L \vec{y} = \vec{b}. \]

Similar to solving systems using LU decomposition we solve \(L \vec{y} = \vec{b}\) for \(\vec{y}\) using forward substitution and then \(L^\mathrm{T} \vec{x} = \vec{y}\) using back substitution.

Algorithm 6.6 (Solving systems of linear equations using Cholesky decomposition)

Inputs: A system of linear equations expressed using the coefficient matrix \(A\) and constant vector \(\vec{b}\) such that \(A \vec{x} = \vec{b}\).

Outputs: The solution vector \(\vec{x}\).

  • Calculate the Cholesky decomposition of \(A\) to determine \(L\) such that \(A = LL^\mathrm{T}\).

  • For \(i = 1, \ldots, n\) do

    • \(y_i \gets \dfrac{1}{\ell_{ii}}\left(b_i - \displaystyle\sum_{j=1}^{i-1} \ell_{ij} y_j \right)\)

  • \(x_n \gets \dfrac{y_n}{\ell_{nn}}\)

  • For \(i = n - 1, \ldots, 1\) do

    • \(x_i \gets \dfrac{1}{\ell_{ii}} \left( y_i - \displaystyle \sum_{j=i+1}^{n} \ell_{ji} x_j \right)\)

  • Return \(\vec{x} = (x_1, x_2, \ldots, x_n)\).
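The two substitution steps above can be sketched as minimal NumPy functions (the names match those used by the solver code later in this section, which assumes such functions are available):

```python
import numpy as np

def forward_substitution(L, b):
    # Solve L y = b where L is lower triangular
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    # Solve U x = y where U is upper triangular
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```

Given the factor \(L\), calling `back_substitution(L.T, forward_substitution(L, b))` then returns the solution of \(A \vec{x} = \vec{b}\).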

Example 6.7

Solve the following system of linear equations using the Cholesky-Crout method.

\[\begin{split} \begin{align*} \begin{pmatrix} 4 & -2 & -4\\ -2 & 10 & 5\\ -4 & 5 & 14 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -2 \\ 49 \\ 27 \end{pmatrix}. \end{align*} \end{split}\]
Solution (click to show)

We saw in Example 6.6 that the Cholesky decomposition of the matrix \(A\) is

\[\begin{split} L=\begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 0 \\ -2 & 1 & 3 \end{pmatrix}. \end{split}\]

Solving \(L\vec{y}=\vec{b}\) using forward substitution

\[\begin{split} \begin{align*} \begin{pmatrix} 2 & 0 & 0\\ -1 & 3 & 0\\ -2 & 1 & 3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -2 \\ 49 \\ 27 \end{pmatrix}, \end{align*} \end{split}\]

gives

\[\begin{split} \begin{align*} y_1 &=\frac{-2}{2}=-1,\\ y_2 &=\frac{1}{3}(49+y_1 )=\frac{1}{3}(49-1)=16,\\ y_3 &=\frac{1}{3}(27+2y_1 -y_2 )=\frac{1}{3}(27+2(-1)-16)=3. \end{align*} \end{split}\]

Solving \(L^\mathrm{T} \vec{x}=\vec{y}\) using back substitution

\[\begin{split} \begin{align*} \begin{pmatrix} 2 & -1 & -2\\ 0 & 3 & 1\\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -1 \\ 16 \\ 3 \end{pmatrix}, \end{align*} \end{split}\]

gives

\[\begin{split} \begin{align*} x_3 &=\frac{3}{3}=1,\\ x_2 &=\frac{1}{3}(16-x_3 )=\frac{1}{3}(16-1)=5,\\ x_1 &=\frac{1}{2}(-1+x_2 +2x_3 )=\frac{1}{2}(-1+5+2(1))=3. \end{align*} \end{split}\]

So the solution is \(\vec{x}=(3,5,1)\).

6.4.3. Code#

The code below calculates the solution of the system of linear equations from Example 6.7 using Cholesky decomposition.

# Define system
A = np.array([[4, -2, -4], 
              [-2, 10, 5], 
              [-4, 5, 14]])
b = np.array([-2, 49, 27])

# Calculate Cholesky decomposition
L = cholesky(A)
print(f"L = \n{L}\n")

# Solve system
y = forward_substitution(L, b)
x = back_substitution(L.T, y)
print(f"x = {x}")
% Define system
A = [4, -2, -4 ;
     -2, 10, 5 ; 
     -4, 5, 14];
b = [-2 ; 49 ; 27];

% Calculate Cholesky decomposition
L = cholesky(A);

% Solve system
y = forward_substitution(L, b);
x = back_substitution(L', y)