3.4 – Matrix Decompositions
Any matrix A can be composed into the product of an upper and unit lower
triangular matrix, A = LU . This decomposition has a number of advantages .
First, we
can solve the linear system, Lx = a and Uy = b in O(n2 ) ope rations using a
relatively
simple algorithm . Second, we can compute the inverse A-1 =U-1L-1 (in those cases
where A in invertible). Third, we can compute the condition number and the
determinant
of the matrix.
First, let us consider the problem of computing the LU decomposition. Notice
that,

We have,

Clearly, we have a relatively simple algorithm for
computing the LU decomposition,

We can rewrite this algorithm in a way that factorizes A
in place . We have,

Notice that this algorithm is O(n3 ) .
In general, this algorithm is not numerically stable. To make this algorithm
more
effective, we can employ pivoting. We consider the decomposition, A = PLU ,
where P
is a permutation matrix.
Special types of matrices admit special types of decompositions. Suppose that A
is symmetric and positive definite . Then there exists a lower triangular matrix
L such
that A = LLT . The algorithm for computing L is called the Cholesky
decomposition. The
Cholesky Decomposition is stable without pivoting.
The Cholesky decomposition algorithm is quite easy to derive as well. Let,

Multiplying L with L T yields the following system of
equations ,

This indicates that we can follow the following process to
determine
.

and so on.
Once we have de termined the Cholesky factor L or a matrix A , we can solve the
linear system Ax = b in O(n2 ) operations. We can write Ax = LLT x = b . Define
y = LT x .
Then Ly = b . We can thus solve the system by first solving Ly = b for y and
then
solving LT x = y for x . Each of these systems can be solved in O(n2 ) .
Now, suppose we want to invert a general or symmetric positive definite matrix.
Computing the inverse then simply requires inverting upper and lower-triangular
matrices. The inverse of as lower triangular matrix will
be lower triangular itself. Let A
be a lower triangular matrix and let B be its inverse. Then we have,

We can solve this system recursively as before.
The final case is the symmetric indefinite matrix. Here, the Bunch-Kaufman
algorithm provides one fast algorithm for matrix decomposition, which is
implemented in
LAPACK. This algorithm is hard to get a hold of, but good implementations
compare in
speed to the Cholesky Decomposition algorithms, are faster that the LU
decomposition
by about a factor of 3.
Figure 3.4 reports the running time for each of these three algorithms on a
number
of large symmetric positive definite matrix. I tested the CLAPACK version of all
three
algorithms. I tested the NR C++ version of the LU and Cholesky decomposition
algorithms, as well as NR’s recommend Guass Jordan algorithm for computing a
matrix
inverse. Finally, I tested LU and Cholesky algorithms that I optimized for my
machine
(which probably won’t be as fast as properly optimized algorithms).
For both the LU and Cholesky decompositions, the un-optimized CLAPACK
implementations are faster that NR C++ code by a factor of 3. My optimized code
is
somewhat faster than the LAPACK implementations. The LU Decompositions takes
about twice as long as the BK and Cholesky Decompositions (which take about an
equal
amount of time). The Guass Jordan inverse is substantially slower.
Table 3.4 – Computation Time for the Inverse of a
Symmetric
Positive Definite Matrix
Algorithm
|
Implementation
|
My Desktop (AMD) |
My Laptop (Intel |
|
n=500 |
n=1,000 |
n=500 |
n=1,000 |
| |
|
|
|
|
|
|
Guass Jordan Elimination |
NR C++ |
5.1 |
39.9 |
7.1 |
56.0 |
| |
|
|
|
|
LU Decomposition |
NR C++ |
2.9 |
23.2 |
4.2 |
33.1 |
|
CLAPACK |
1.0 |
8.8 |
1.8 |
15.0 |
|
My Optimizations |
0.7 |
6.3 |
1.3 |
11.6 |
| |
|
|
|
|
BK Decomposition |
LAPACK |
0.4 |
3.3 |
0.7 |
6.6 |
| |
|
|
|
|
Cholesky Decomposition |
NR C++ |
1.5 |
12.4 |
2.0 |
16.4 |
|
LAPACK |
0.5 |
3.6 |
0.7 |
6.7 |
|
MY Optimizations |
0.3 |
2.8 |
0.5 |
5.8 |
All three matrix decomposition algorithms can be further
specialized. For
example, we can develop versions of the LU, Bunch-Kaufman, and Cholesky
decompositions for band diagonal matrices that requires O(nk2 ) operations
(here, k is
the number of band in the matrix). For block diagonal matrices, we can simply
apply
each of these algorithms for each block at a time. The Cholesky decomposition
can be
specialized to Toeplitz matrices provides an algorithm that requires O(nk)
operations.
The decomposition can be used to perform a number of
operations. The BLAS
implements an O(n2 ) algorithm for solving y = Lx , where L is a lower
triangular matrix.
A similar algorithm applies to an upper triangular matrix. Now, given the LU
decomposition A = LU , we can solve Ax = y by solving Lz = y and then solving
Ux = z . We can compute the determinant of a matrix by computing the product of
the
diagonal terms of L . We can compute the inverse of a matrix and condition
number of a
matrix using related algorithms.
Specialized algorithms exist to factorize a matrix that is subject to a rank-one
update. A rank one update to the matrix A is a matrix of the form A' = A+ uvT .
If we
already have an LU of Cholesky Decomposition of A , we can determine the
decomposition of A' in O(n2 ) operations. This type of update is used
extensively in
optimization and nonlinear equations , where the BFGS and Broyden updates are
rank two
updates.
Sparse matrices have their own special algorithms. Let us let s denote the
number
of non- zero elements in a sparse matrix. The a matrix-vector multiply requires
O(s)
operations. Algorithms for performing the decomposition of sparse matrices fall
into two
classes. The first class of algorithms rely on direct factorization. These
algorithms rely on
the fact that if the matrix is sparse, them it may be possible to generate a
decomposition
that preserves this sparsity . Pivoting is used here are a way to ensure sparsity
of the
decomposition rather than ensure numerical stability. The second class of
algorithms are
iterative in nature. They rely on the fact that the operation y ← Ax is
relatively cheap.