Subsection 3.4.2 Finding an \(LU\) factorization by the Multiplier Method
In the following example we demonstrate a method for computing an \(LU\) factorization known as the multiplier method.
Example 3.4.7.
Find an \(LU\) factorization for
\begin{equation*}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
2 \amp 3 \amp 1 \\
-2 \amp 3 \amp -2
\end{bmatrix}.
\end{equation*}
Answer.
Write the matrix as the following product.
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
2 \amp 3 \amp 1 \\
-2 \amp 3 \amp -2
\end{bmatrix}.
\end{equation*}
In the matrix on the right, we begin with the left column and zero out the entries below the top entry using the row operation which adds a multiple of one row to another row. Each time we do this, we also update the matrix on the left so that the product remains unchanged.
Here is the first step. We take \(-2\) times the top row and add it to the second row in the matrix on the right. To keep the product unchanged, we place a \(2\) in the second row, first column of the matrix on the left. This gives us
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
0 \amp 0 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
0 \amp -1 \amp -5 \\
-2 \amp 3 \amp -2
\end{bmatrix}.
\end{equation*}
The next step is to take \(2\) times the top row and add it to the bottom row in the matrix on the right. To ensure that the product is unchanged, we place a \(-2\) in the bottom left corner of the matrix on the left. Thus the second step yields
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
-2 \amp 0 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
0 \amp -1 \amp -5 \\
0 \amp 7 \amp 4
\end{bmatrix}.
\end{equation*}
For our final step, we take \(7\) times the middle row and add it to the bottom row in the matrix on the right. Updating the matrix on the left in the manner we did earlier, we get
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
-2 \amp -7 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
0 \amp -1 \amp -5 \\
0 \amp 0 \amp -31
\end{bmatrix}.
\end{equation*}
At this point, we can stop. We have an \(LU\) factorization of the original matrix. We can always multiply the matrices to check, if we wish.
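Carrying out that check here, the product does recover the original matrix:
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
-2 \amp -7 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp 3 \\
0 \amp -1 \amp -5 \\
0 \amp 0 \amp -31
\end{bmatrix}
=
\begin{bmatrix}
1 \amp 2 \amp 3 \\
2 \amp 3 \amp 1 \\
-2 \amp 3 \amp -2
\end{bmatrix}.
\end{equation*}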
Notice that when we perform the multiplier method, we are making repeated use of a certain type of elementary row operation, namely, we are adding a scalar multiple of one row to a row below it. The reason this helps to create an \(LU\) factorization depends upon the fact that the elementary matrices corresponding to such operations are lower triangular. To understand how this works, we begin with a lemma.
Lemma 3.4.8.
Let \(L\) be an \(m\times m\) lower (upper) triangular matrix which has ones down the main diagonal. Then \(L^{-1}\) is also a lower (upper) triangular matrix which has ones down the main diagonal. In the case that \(L\) is of the form
\begin{equation}
L=
\begin{bmatrix}
1 \amp \amp \amp \\
a_{1} \amp 1 \amp \amp \\
\vdots \amp \amp \ddots \amp \\
a_{m-1} \amp \amp \amp 1
\end{bmatrix}, \tag{3.4.2}
\end{equation}
where all entries are zero except for the left column and main diagonal, it is also the case that \(L^{-1}\) is obtained from \(L\) by simply multiplying each entry below the main diagonal in \(L\) by \(-1\text{.}\) The same is true if the single nonzero column is in another position.
Proof.
Consider the usual setup for finding the inverse, \(\begin{bmatrix}L \ |\ I\end{bmatrix}\text{.}\) Each row operation used on \(L\) to transform this matrix to reduced row echelon form changes only the entries in \(I\) below the main diagonal. Whether we have the special case of \(L\) given in (3.4.2), where the nonzero nondiagonal entries are in the left column, or the single nonzero column is in another position, it is clear that multiplication by \(-1\) as described in the lemma gives us \(L^{-1}\text{.}\)
For a simple illustration of the lemma, observe:
\begin{equation*}
\left[\begin{array}{ccc|ccc}
1 \amp 0 \amp 0 \amp 1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \amp 0 \amp 1 \amp 0 \\
0 \amp a \amp 1 \amp 0 \amp 0 \amp 1
\end{array}\right]
\rightsquigarrow
\left[\begin{array}{ccc|ccc}
1 \amp 0 \amp 0 \amp 1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \amp 0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \amp 0 \amp -a \amp 1
\end{array}\right].
\end{equation*}
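Multiplying the two matrices confirms the conclusion:
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp a \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp -a \amp 1
\end{bmatrix}
=
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1
\end{bmatrix}.
\end{equation*}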
Now let \(A\) be an \(m\times n\) matrix, say
\begin{equation*}
A=
\begin{bmatrix}
a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \\
a_{21} \amp a_{22} \amp \cdots \amp a_{2n} \\
\vdots \amp \vdots \amp \amp \vdots \\
a_{m1} \amp a_{m2} \amp \cdots \amp a_{mn}
\end{bmatrix},
\end{equation*}
and assume \(A\) can be row reduced to an upper triangular form using only row operation 3. Thus, in particular, \(a_{11}\neq 0\text{.}\) Multiply on the left by
\begin{equation*}
E_{1}=
\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \\
-\frac{a_{21}}{a_{11}} \amp 1 \amp \cdots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
-\frac{a_{m1}}{a_{11}} \amp 0 \amp \cdots \amp 1
\end{bmatrix}.
\end{equation*}
This is the product of elementary matrices, each of which modifies the first column only. It is equivalent to taking \(-a_{21}/a_{11}\) times the first row and adding it to the second, then taking \(-a_{31}/a_{11}\) times the first row and adding it to the third, and so forth. The quotients in the first column of the above matrix are the multipliers. Thus the result is of the form
\begin{equation*}
E_{1}A=
\begin{bmatrix}
a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \\
0 \amp a_{22}^{\prime } \amp \cdots \amp a_{2n}^{\prime } \\
\vdots \amp \vdots \amp \amp \vdots \\
0 \amp a_{m2}^{\prime } \amp \cdots \amp a_{mn}^{\prime }
\end{bmatrix}.
\end{equation*}
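For instance, when \(m=3\text{,}\) \(E_{1}\) is the product of the two elementary matrices corresponding to the row operations just described:
\begin{equation*}
E_{1}=
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
-\frac{a_{31}}{a_{11}} \amp 0 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
-\frac{a_{21}}{a_{11}} \amp 1 \amp 0 \\
0 \amp 0 \amp 1
\end{bmatrix}.
\end{equation*}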
By assumption, \(a_{22}^{\prime }\neq 0\) and so it is possible to use this entry to zero out all the entries below it in the matrix on the right by multiplication by a matrix of the form
\begin{equation*}
E_{2}=
\begin{bmatrix}
1 \amp \mathbf{0} \\
\mathbf{0} \amp E
\end{bmatrix},
\end{equation*}
where \(E\) is an \(( m-1)\times ( m-1)\) matrix of the form
\begin{equation*}
E=\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \\
-\frac{a_{32}^{\prime }}{a_{22}^{\prime }} \amp 1 \amp \cdots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
-\frac{a_{m2}^{\prime }}{a_{22}^{\prime }} \amp 0 \amp \cdots \amp 1
\end{bmatrix}.
\end{equation*}
Again, the entries in the first column below the 1 are the multipliers. Continuing this way, zeroing out the entries below the diagonal entries, finally leads us to
\begin{equation*}
E_{m-1}E_{m-2}\cdots E_{1}A=U,
\end{equation*}
where \(U\) is upper triangular. Each \(E_{j}\) has all ones down the main diagonal and is lower triangular. Now we can multiply both sides by the inverses of the \(E_{j}\) in the reverse order. This yields
\begin{equation*}
A=E_{1}^{-1}E_{2}^{-1}\cdots E_{m-1}^{-1}U.
\end{equation*}
By
Lemma 3.4.8, this implies that the product of those
\(E_{j}^{-1}\) is a lower triangular matrix having all ones down the main diagonal.
The above discussion and lemma explain how the multiplier method works. The expressions
\begin{equation*}
-\frac{a_{21}}{a_{11}},-\frac{a_{31}}{a_{11}},\cdots, -\frac{a_{m1}}{a_{11}},
\end{equation*}
which we obtained in building \(E_{1}\text{,}\) and which we denote respectively by \(M_{21},\cdots ,M_{m1}\) to simplify notation, are the multipliers. \index{multipliers} Therefore, according to the lemma, to find \(E_{1}^{-1}\) we simply write
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \\
-M_{21} \amp 1 \amp \cdots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
-M_{m1} \amp 0 \amp \cdots \amp 1
\end{bmatrix}.
\end{equation*}
Similar considerations apply to the other \(E_{j}^{-1}.\) Thus \(L\) is a product of the form
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \\
-M_{21} \amp 1 \amp \cdots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
-M_{m1} \amp 0 \amp \cdots \amp 1
\end{bmatrix}
\cdots
\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \\
0 \amp 1 \amp \cdots \amp 0 \\
\vdots \amp 0 \amp \ddots \amp \vdots \\
0 \amp \cdots \amp -M_{m(m-1)} \amp 1
\end{bmatrix},
\end{equation*}
where each factor has at most one nonzero column, the position of which moves from left to right as we scan the above product of matrices from left to right. It follows from what we know about the effect of multiplying on the left by an elementary matrix that the above product is of the form
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp \cdots \amp 0 \amp 0 \\
-M_{21} \amp 1 \amp \cdots \amp 0 \amp 0 \\
\vdots \amp -M_{32} \amp \ddots \amp \vdots \amp \vdots \\
-M_{(m-1)1} \amp \vdots \amp \cdots \amp 1 \amp 0 \\
-M_{m1} \amp -M_{m2} \amp \cdots \amp -M_{m(m-1)} \amp 1
\end{bmatrix}.
\end{equation*}
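For instance, in Example 3.4.7 the multipliers were \(M_{21}=-2\text{,}\) \(M_{31}=2\text{,}\) and \(M_{32}=7\text{,}\) and indeed
\begin{equation*}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
-2 \amp 0 \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp -7 \amp 1
\end{bmatrix}
=
\begin{bmatrix}
1 \amp 0 \amp 0 \\
2 \amp 1 \amp 0 \\
-2 \amp -7 \amp 1
\end{bmatrix},
\end{equation*}
which is exactly the matrix \(L\) we built there.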
To sum up the procedure: beginning at the left column and moving toward the right, we simply insert, into the corresponding position in the identity matrix, \(-1\) times the multiplier which was used to zero out the entry in that position below the main diagonal in \(A\text{,}\) while retaining the main diagonal, which consists entirely of ones. This gives us \(L\) as we create \(U\) using row operation 3.
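The procedure is easy to automate. Below is a minimal sketch in Python with NumPy (the function name and error handling are our own choices, not part of the text); it assumes, as above, that the matrix can be reduced to upper triangular form using only row operation 3.
\begin{verbatim}
import numpy as np

def lu_multiplier_method(A):
    # LU factorization by the multiplier method (no row interchanges).
    # Assumes every pivot encountered is nonzero, i.e. the matrix can
    # be reduced to upper triangular form using only row operation 3.
    U = np.array(A, dtype=float)
    m, n = U.shape
    L = np.eye(m)
    for k in range(min(m - 1, n)):
        if U[k, k] == 0:
            raise ValueError("zero pivot: row interchanges needed")
        for i in range(k + 1, m):
            multiplier = -U[i, k] / U[k, k]  # M_{ik} in the notation above
            U[i, :] += multiplier * U[k, :]  # zero out the (i, k) entry
            L[i, k] = -multiplier            # record -1 times the multiplier
    return L, U

L, U = lu_multiplier_method([[1, 2, 3], [2, 3, 1], [-2, 3, -2]])
print(L)  # rows (1, 0, 0), (2, 1, 0), (-2, -7, 1), as in Example 3.4.7
print(U)  # rows (1, 2, 3), (0, -1, -5), (0, 0, -31)
\end{verbatim}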
We now return to Example 3.4.2 to understand why we could not use the multiplier method to find an \(LU\) factorization of the coefficient matrix. Suppose we had written
\begin{equation*}
\begin{bmatrix}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 0\amp 1\end{bmatrix}\begin{bmatrix}0\amp 12\amp -18\\1\amp 2\amp -3\\1\amp -2\amp 1\end{bmatrix}.
\end{equation*}
Our first step of Gaussian elimination would require a row swap to get a nonzero entry into the top left corner (we swapped rows 1 and 2 in Example 3.4.2). Unfortunately, the elementary matrix that accomplishes this,
\begin{equation*}
P=\begin{bmatrix}0\amp 1\amp 0\\1\amp 0\amp 0\\0\amp 0\amp 1\end{bmatrix} \quad \text{is NOT lower triangular.}
\end{equation*}
So we cannot apply Lemma 3.4.8 to generate a lower triangular \(L\text{.}\)
If the elementary matrices used to reduce our matrix to row-echelon form are all lower triangular, then we can find an \(LU\) factorization. But what about the general case?
Theorem 3.4.9.
Suppose an \(m \times n\) matrix \(A\) is transformed to a row-echelon matrix \(U\) using Gaussian elimination. Let \(P_1, P_2, \ldots, P_s\) be the elementary matrices corresponding (in order) to the row interchanges used, and write \(P=P_s \cdots P_2 P_1\text{.}\) (If no interchanges are used, take \(P = I_m\text{.}\)) Then:
1. \(PA\) is the matrix obtained from \(A\) by performing these interchanges (in order).
2. \(PA\) has an \(LU\) factorization.
For a proof of the above theorem, the reader is referred to [Nicholson].
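To illustrate the theorem with the matrix above: swapping rows 1 and 2 gives
\begin{equation*}
PA=\begin{bmatrix}1\amp 2\amp -3\\0\amp 12\amp -18\\1\amp -2\amp 1\end{bmatrix},
\end{equation*}
and the multiplier method now succeeds, producing
\begin{equation*}
PA=
\begin{bmatrix}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
1 \amp -\frac{1}{3} \amp 1
\end{bmatrix}
\begin{bmatrix}
1 \amp 2 \amp -3 \\
0 \amp 12 \amp -18 \\
0 \amp 0 \amp -2
\end{bmatrix}.
\end{equation*}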
The matrix \(P\) in the above theorem is called a permutation matrix. These matrices have other important applications, as we will see later.
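In computational practice, library routines build in the row interchanges automatically. As an illustration (not part of the text), SciPy's scipy.linalg.lu returns matrices \(p\text{,}\) \(l\text{,}\) \(u\) with \(A = plu\text{,}\) so that \(p^{T}A = lu\) is a factorization of the kind in Theorem 3.4.9; because SciPy chooses its interchanges by partial pivoting, they need not match the ones we chose by hand in Example 3.4.2.
\begin{verbatim}
import numpy as np
from scipy.linalg import lu

# Coefficient matrix from Example 3.4.2.
A = np.array([[0.0, 12.0, -18.0],
              [1.0,  2.0,  -3.0],
              [1.0, -2.0,   1.0]])

# SciPy returns A = p @ l @ u, with p a permutation matrix,
# l unit lower triangular, and u upper triangular.
p, l, u = lu(A)

# Equivalently, P A = L U with P = p.T, since a permutation
# matrix is orthogonal.
print(np.allclose(p.T @ A, l @ u))  # True
\end{verbatim}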