Extra Topic: Positive Definite Matrices

Section 10.7 Extra Topic: Positive Definite Matrices

All the eigenvalues of any symmetric matrix are real (deduced in later sections; this section is about the case in which those eigenvalues are positive. These matrices, which arise whenever optimization (maximum and minimum) problems are encountered, have countless applications throughout science and engineering. They also arise in statistics (for example, in factor analysis used in the social sciences) and in geometry (quadratic forms, for instance).

Definition 10.7.1.

A square matrix is called positive definite if it is symmetric and all its eigenvalues \(\lambda\) are positive. We write \(\lambda\gt 0\) when eigenvalues are real and positive.

Because these matrices are symmetric, the Real Spectral (Theorem 10.4.8) plays a central role in the theory.

Theorem 10.7.2.

If \(A\) is positive definite, then it is invertible and \(\det A \gt 0\text{.}\)

Proof.

If \(A\) is \(n \times n\) and the eigenvalues are \(\lambda_{1}, \lambda_{2}, \dots, \lambda_{n}\text{,}\) then

\begin{equation*} \det A = \lambda_{1}\lambda_{2} \cdots \lambda_{n} \gt 0 \end{equation*}

by the Real Spectral Theorem.

If \(\mathbf{x}\) is a column in \(\R^n\) and \(A\) is any real \(n \times n\) matrix, we view the \(1 \times 1\) matrix \(\mathbf{x}^{T}A\mathbf{x}\) as a real number. With this convention, we have the following characterization of positive definite matrices.

Theorem 10.7.3.

A symmetric matrix \(A\) is positive definite if and only if \(\mathbf{x}^{T} A \mathbf{x} \gt 0\) for every column \(\mathbf{x} \neq \mathbf{0}\) in \(\R^n\text{.}\)

Proof.

\(A\) is symmetric. Using the Real Spectral Theorem, let

\begin{equation*} Q^{T}AQ = D = \mbox{diag}(\lambda_{1}, \lambda_{2}, \dots, \lambda_{n}) \end{equation*}

where \(Q^{-1} = Q^{T}\) and the \(\lambda_{i}\) are the eigenvalues of \(A\text{.}\) Given a column \(\mathbf{x}\) in \(\R^n\text{,}\) write

\begin{equation*} \mathbf{y} = Q^{T}\mathbf{x} = \left[ \begin{array}{cccc} y_{1} \amp y_{2} \amp \dots \amp y_{n} \end{array}\right]^T. \end{equation*}

Then

\begin{align} \mathbf{x}^TA\mathbf{x} \amp = \mathbf{x}^T(QDQ^T)\mathbf{x} \notag\\ \amp = \mathbf{y}^TD\mathbf{y} \notag\\ \amp = \lambda_{1}y_{1}^2 + \lambda_{2}y_{2}^2 + \dots + \lambda_{n}y_{n}^2. \tag{10.7.1} \end{align}

If \(A\) is positive definite and \(\mathbf{x} \neq \mathbf{0}\text{,}\) then \(\mathbf{x}^{T}A\mathbf{x} \gt 0\) by (10.7.1) because some \(y_{j} \neq 0\) and every \(\lambda_{i} \gt 0\text{.}\)

Conversely, if \(\mathbf{x}^{T}A\mathbf{x} \gt 0\) whenever \(\mathbf{x} \neq \mathbf{0}\text{,}\) let \(\mathbf{x} = Q\mathbf{e}_{j} \neq \mathbf{0}\) where \(\mathbf{e}_{j}\) is column \(j\) of \(I_{n}\text{.}\) Then \(\mathbf{y} = \mathbf{e}_{j}\text{,}\) so (10.7.1) reads \(\lambda_{j} = \mathbf{x}^{T}A\mathbf{x} \gt 0\text{.}\)

Note that Theorem 10.7.3 shows that the positive definite matrices are exactly the symmetric matrices \(A\) for which the quadratic form \(q = \mathbf{x}^{T}A\mathbf{x}\) takes only positive values.

Example 10.7.4.

If \(U\) is any invertible \(n \times n\) matrix, show that \(A = U^{T}U\) is positive definite.

Answer.

If \(\mathbf{x}\) is in \(\R^n\) and \(\mathbf{x} \neq \mathbf{0}\text{,}\) then

\begin{equation*} \mathbf{x}^TA\mathbf{x} = \mathbf{x}^T(U^TU)\mathbf{x} = (U\mathbf{x})^T(U\mathbf{x})= \norm{ U\mathbf{x} }^2 \gt 0, \end{equation*}

because \(U\mathbf{x} \neq \mathbf{0}\) (\(U\) is invertible). Hence Theorem 10.7.3 applies.

It is remarkable that the converse to Example 10.7.4 is also true. In fact every positive definite matrix \(A\) can be factored as \(A = U^{T}U\) where \(U\) is an upper triangular matrix with positive elements on the main diagonal. However, before verifying this, we introduce another concept that is central to any discussion of positive definite matrices.

If \(A\) is any \(n \times n\) matrix, let \(^{(r)}A\) denote the \(r \times r\) submatrix in the upper left corner of \(A\text{;}\) that is, \(^{(r)}A\) is the matrix obtained from \(A\) by deleting the last \(n - r\) rows and columns.

The matrices \(^{(1)}A, ^{(2)}A, ^{(3)}A, \dots\text{,}\) \(^{(n)}A = A\) are called the principal submatrices of \(A\text{.}\)

To get accustomed to principal matrices, we will analyse a concrete case.

Example 10.7.5.

If \(A = \left[ \begin{array}{rrr} 10 \amp 5 \amp 2 \\ 5 \amp 3 \amp 2 \\ 2 \amp 2 \amp 3 \end{array}\right]\) then

\begin{equation*} ^{(1)}A = \left[ 10 \right], \quad ^{(2)}A = \left[ \begin{array}{rr} 10 \amp 5 \\ 5 \amp 3 \end{array}\right] \text{ and } ^{(3)}A = A. \end{equation*}

Lemma 10.7.6.

If \(A\) is positive definite, so is each principal submatrix \(^{(r)}A\) for \(r = 1, 2, \dots, n\text{.}\)

Proof.

Write

\begin{equation*} A = \left[ \begin{array}{rr} ^{(r)}A \amp P \\ Q \amp R \end{array}\right] \end{equation*}

in block form. If \(\mathbf{y} \neq \mathbf{0}\) in \(\R^r\text{,}\) write

\begin{equation*} \mathbf{x} = \left[ \begin{array}{r} \mathbf{y} \\ \mathbf{0} \end{array}\right] \end{equation*}

in \(\R^n\text{.}\) Then \(\mathbf{x} \neq \mathbf{0}\text{,}\) so the fact that \(A\) is positive definite gives

\begin{equation*} 0 \lt \mathbf{x}^TA\mathbf{x} = \left[ \begin{array}{rr} \mathbf{y}^T \amp \mathbf{0} \end{array}\right] \left[ \begin{array}{rr} ^{(r)}A \amp P \\ Q \amp R \end{array}\right] \left[ \begin{array}{r} \mathbf{y} \\ \mathbf{0} \end{array}\right] = \mathbf{y}^T(^{(r)}A)\mathbf{y}. \end{equation*}

This shows that \(^{(r)}A\) is positive definite by Theorem 10.7.3. A similar argument shows that, if \(B\) is any matrix obtained from a positive definite matrix \(A\) by deleting certain rows and deleting the same columns, then \(B\) is also positive definite.

If \(A\) is positive definite, Lemma 10.7.6 and Theorem 10.7.2 show that \(\mbox{det}(^{(r)}A) \gt 0\) for every \(r\text{.}\) This proves part of the following theorem which contains the converse to Example 10.7.4, and characterizes the positive definite matrices among the symmetric ones.

Theorem 10.7.7.

The following conditions are equivalent for a symmetric \(n \times n\) matrix \(A\text{:}\)

\(A\) is positive definite.
\(\mbox{det}(^{(r)}A) \gt 0\) for each \(r = 1, 2, \dots, n\text{.}\)
\(A = U^{T}U\) where \(U\) is an upper triangular matrix with positive entries on the main diagonal.

Furthermore, the factorization in Item 3 is unique (called the Cholesky factorization of \(A\)).

Proof.

First, Item 3 \(\Rightarrow\) Item 1 by Example 10.7.4, and Item 1 \(\Rightarrow\) Item 2 by Lemma 10.7.6 and Theorem 10.7.2.

Item 2 \(\Rightarrow\) Item 3. Assume Item 2 and proceed by induction on \(n\text{.}\) If \(n = 1\text{,}\) then \(A = \left[ a \right]\) where \(a \gt 0\) by Item 2, so take \(U = \left[ \sqrt{a} \right]\text{.}\) If \(n \gt 1\text{,}\) write \(B =^{(n-1)}A\text{.}\) Then \(B\) is symmetric and satisfies Item 2 so, by induction, we have \(B = U^{T}U\) as in Item 3 where \(U\) is of size \((n - 1) \times (n - 1)\text{.}\) Then, as \(A\) is symmetric, it has block form

\begin{equation*} A = \left[ \begin{array}{cc} B \amp \mathbf{p} \\ \mathbf{p}^T \amp b \end{array}\right] \end{equation*}

where \(\mathbf{p}\) is a column in \(\R^{n-1}\) and \(b\) is in \(\R\text{.}\) If we write \(\mathbf{x} = (U^{T})^{-1}\mathbf{p}\) and \(c = b - \mathbf{x}^{T}\mathbf{x}\text{,}\) block multiplication gives

\begin{equation*} A = \left[ \begin{array}{cc} U^TU \amp \mathbf{p} \\ \mathbf{p}^T \amp b \end{array}\right] = \left[ \begin{array}{cc} U^T \amp 0 \\ \mathbf{x}^T \amp 1 \end{array}\right] \left[ \begin{array}{cc} U \amp \mathbf{x} \\ 0 \amp c \end{array}\right]. \end{equation*}

Taking determinants gives \(\det A = \mbox{det}(U^{T}) \mbox{det} U \cdot c = c(\mbox{det}U)^{2}\text{.}\) Hence \(c \gt 0\) because \(\det A \gt 0\) by Item 2, so the above factorization can be written

\begin{equation*} A = \left[ \begin{array}{cc} U^T \amp 0 \\ \mathbf{x}^T \amp \sqrt{c} \end{array}\right] \left[ \begin{array}{cc} U \amp \mathbf{x} \\ 0 \amp \sqrt{c} \end{array}\right]. \end{equation*}

Since \(U\) has positive diagonal entries, this proves Item 3.

To prove uniqueness, suppose that \(A = U^TU = U_{1}^TU_{1}\) are two Cholesky factorizations. Now write

\begin{equation*} D = UU_{1}^{-1} = (U^T)^{-1}U_{1}^T. \end{equation*}

Then \(D\) is upper triangular, because \(D = UU_{1}^{-1}\text{,}\) and lower triangular, because

\begin{equation*} D = (U^T)^{-1}U_{1}^T, \end{equation*}

and so it is a diagonal matrix. Thus \(U = DU_{1}\) and \(U_{1} = DU\text{,}\) so it suffices to show that \(D = I\text{.}\) But eliminating \(U_{1}\) gives \(U = D^{2}U\text{,}\) so \(D^{2} = I\) because \(U\) is invertible. Since the diagonal entries of \(D\) are positive (this is true of \(U\) and \(U_{1}\)), it follows that \(D = I\text{.}\)

The remarkable thing is that the matrix \(U\) in the Cholesky factorization is easy to obtain from \(A\) using row operations. The key is that Step 1 of the following algorithm is possible for any positive definite matrix \(A\text{.}\) A proof of the algorithm is given following Example~Example 10.7.9.

Theorem 10.7.8. Algorithm for the Cholesky Factorization.

If \(A\) is a positive definite matrix, the Cholesky factorization \(A = U^{T}U\) can be obtained as follows:

[Step 1.] Carry \(A\) to an upper triangular matrix \(U_{1}\) with positive diagonal entries using row operations each of which adds a multiple of a row to a lower row.
[Step 2.] Obtain \(U\) from \(U_{1}\) by dividing each row of \(U_{1}\) by the square root of the diagonal entry in that row.

Proof.

If \(A\) is positive definite, let \(A = U^{T}U\) be the Cholesky factorization, and let \(D = \mbox{diag}(d_{1}, \dots, d_{n})\) be the common diagonal of \(U\) and \(U^{T}\text{.}\) Then \(U^{T}D^{-1}\) is lower triangular with ones on the diagonal (call such matrices LT-1). Hence \(L = (U^{T}D^{-1})^{-1}\) is also LT-1, and so \(I_{n} \to L\) by a sequence of row operations each of which adds a multiple of a row to a lower row (modify columns right to left). But then \(A \to LA\) by the same sequence of row operations. Since

\begin{equation*} LA = [D(U^{T})^{-1}][U^{T}U] = DU- \end{equation*}

is upper triangular with positive entries on the diagonal, this shows that Step 1 of the algorithm is possible.

Turning to Step 2, let \(A \to U_{1}\) as in Step 1 so that \(U_{1} = L_{1}A\) where \(L_{1}\) is LT-1. Since A is symmetric, we get

\begin{equation} L_{1}U_{1}^T = L_{1}(L_{1}A)^T = L_{1}A^TL_{1}^T = L_{1}AL_{1}^T = U_{1}L_{1}^T\tag{10.7.2} \end{equation}

Let \(D_{1} = \mbox{diag}(e_{1}, \dots, e_{n})\) denote the diagonal of \(U_{1}\text{.}\) Then (10.7.2) gives

\begin{equation*} L_{1}(U_{1}^TD_{1}^{-1}) = U_{1}L_{1}^TD_{1}^{-1}. \end{equation*}

This is both upper triangular (right side) and LT-1 (left side), and so must equal \(I_{n}\text{.}\) In particular, \(U_{1}^TD_{1}^{-1} = L_{1}^{-1}\text{.}\) Now let

\begin{equation*} D_{2} = \mbox{diag}(\sqrt{e_{1}}, \dots, \sqrt{e_{n}}), \end{equation*}

so that \(D_{2}^2 = D_{1}\text{.}\) If we write \(U = D_{2}^{-1}U_{1}\) we have

\begin{align*} U^TU \amp = (U_{1}^TD_{2}^{-1})(D_{2}^{-1}U_{1}) \\ \amp = U_{1}^T(D_{2}^2)^{-1}U_{1} \\ \amp = (U_{1}^TD_{1}^{-1})U_{1} \\ \amp = (L_{1}^{-1})U_{1} = A. \end{align*}

This proves Step 2 because \(U = D_{2}^{-1}U_{1}\) is formed by dividing each row of \(U_{1}\) by the square root of its diagonal entry (verify).

Example 10.7.9.

Find the Cholesky factorization of

\begin{equation*} A = \left[ \begin{array}{rrr} 10 \amp 5 \amp 2 \\ 5 \amp 3 \amp 2 \\ 2 \amp 2 \amp 3 \end{array}\right]. \end{equation*}

Answer.

The matrix \(A\) is positive definite by Theorem 10.7.7 because \(\mbox{det}{^{(1)}A} = 10 \gt 0\text{,}\) \(\mbox{det}{^{(2)}A} = 5 \gt 0\text{,}\) and \(\mbox{det}{^{(3)}A} = \mbox{det} A = 3 \gt 0\text{.}\) Hence Step 1 of the algorithm is carried out as follows:

\begin{equation*} A = \left[ \begin{array}{rrr} 10 \amp 5 \amp 2 \\ 5 \amp 3 \amp 2 \\ 2 \amp 2 \amp 3 \end{array}\right] \rightarrow \left[ \begin{array}{rrc} 10 \amp 5 \amp 2 \\ 0 \amp \frac{1}{2} \amp 1 \\ 0 \amp 1 \amp \frac{13}{5} \end{array}\right] \rightarrow \left[ \begin{array}{rrr} 10 \amp 5 \amp 2 \\ 0 \amp \frac{1}{2} \amp 1 \\ 0 \amp 0 \amp \frac{3}{5} \end{array}\right] = U_{1}. \end{equation*}

Now carry out Step 2 on \(U_{1}\) to obtain

\begin{equation*} U = \left[ \def\arraystretch{1.5}\begin{array}{ccc} \sqrt{10} \amp \frac{5}{\sqrt{10}} \amp \frac{2}{\sqrt{10}} \\ 0 \amp \frac{1}{\sqrt{2}} \amp \sqrt{2} \\ 0 \amp 0 \amp \frac{\sqrt{3}}{\sqrt{5}} \end{array}\right]. \end{equation*}

The reader can verify that \(U^{T}U = A\text{.}\)

Exercises Exercises

1.

Find the Cholesky decomposition of each of the following matrices.

\begin{equation*} \left[ \begin{array}{rr} 4 \amp 3 \\ 3 \amp 5 \end{array}\right]. \end{equation*}
\begin{equation*} \left[ \begin{array}{rr} 2 \amp -1 \\ -1 \amp 1 \end{array}\right]. \end{equation*}
\(\displaystyle \left[ \begin{array}{rrr} 12 \amp 4 \amp 3 \\ 4 \amp 2 \amp -1 \\ 3 \amp -1 \amp 7 \end{array}\right]\)
\(\displaystyle \left[ \begin{array}{rrr} 20 \amp 4 \amp 5 \\ 4 \amp 2 \amp 3 \\ 5 \amp 3 \amp 5 \end{array}\right]\)

Answer.

For (b):

\begin{equation*} U = \frac{\sqrt{2}}{2} \left[ \begin{array}{rr} 2 \amp -1 \\ 0 \amp 1 \end{array}\right] \end{equation*}

and for (d):

\begin{equation*} U = \frac{1}{30} \left[ \begin{array}{ccc} 60\sqrt{5} \amp 12\sqrt{5} \amp 15\sqrt{5} \\ 0 \amp 6\sqrt{30} \amp 10\sqrt{30} \\ 0 \amp 0 \amp 5\sqrt{15} \end{array}\right]. \end{equation*}

2.

If \(A\) is positive definite, show that \(A^{k}\) is positive definite for all \(k \geq 1\text{.}\)
Prove the converse to (a) when \(k\) is odd.
Find a symmetric matrix \(A\) such that \(A^{2}\) is positive definite but \(A\) is not.

Hint.

For (b): If \(\lambda^{k} \gt 0\text{,}\) \(k\) odd, then \(\lambda \gt 0\text{.}\)

3.

Let

\begin{equation*} A = \left[ \begin{array}{rr} 1 \amp a \\ a \amp b \end{array}\right]. \end{equation*}

If \(a^{2} \lt b\text{,}\) show that \(A\) is positive definite and find the Cholesky factorization.

4.

If \(A\) and \(B\) are positive definite and \(r \gt 0\text{,}\) show that \(A + B\) and \(rA\) are both positive definite.

Answer.

If \(\mathbf{x} \neq \mathbf{0}\text{,}\) then \(\mathbf{x}^{T}A\mathbf{x} \gt 0\) and \(\mathbf{x}^{T}B\mathbf{x} \gt 0\text{.}\) Hence

\begin{equation*} \mathbf{x}^{T}(A + B)\mathbf{x} = \mathbf{x}^{T}A\mathbf{x} + \mathbf{x}^{T}B\mathbf{x} \gt 0 \end{equation*}

and \(\mathbf{x}^{T}(rA)\mathbf{x} = r(\mathbf{x}^{T}A\mathbf{x}) \gt 0\text{,}\) as \(r \gt 0\text{.}\)

5.

If \(A\) and \(B\) are positive definite, show that

\begin{equation*} \left[ \begin{array}{rr} A \amp 0 \\ 0 \amp B \end{array}\right] \end{equation*}

is positive definite.

6.

If \(A\) is an \(n \times n\) positive definite matrix and \(U\) is an \(n \times m\) matrix of rank \(m\text{,}\) show that \(U^{T}AU\) is positive definite.

Answer.

Let \(\mathbf{x} \neq \mathbf{0}\) in \(\R^n\text{.}\) Then \(\mathbf{x}^{T}(U^{T}AU)\mathbf{x} = (U\mathbf{x})^{T}A(U\mathbf{x}) \gt 0\) provided \(U\mathbf{x} \neq 0\text{.}\) But if

\begin{equation*} U = \left[ \begin{array}{cccc} \mathbf{c}_{1} \amp \mathbf{c}_{2} \amp \dots \amp \mathbf{c}_{n} \end{array}\right] \text{ and } \mathbf{x} = (x_{1}, x_{2}, \dots, x_{n}), \end{equation*}

then

\begin{equation*} U\mathbf{x} = x_{1}\mathbf{c}_{1} + x_{2}\mathbf{c}_{2} + \dots + x_{n}\mathbf{c}_{n} \neq \mathbf{0}, \end{equation*}

because \(\mathbf{x} \neq \mathbf{0}\) and the \(\mathbf{c}_{i}\) are independent.

7.

If \(A\) is positive definite, show that each diagonal entry is positive.

8.

Let \(A_{0}\) be formed from \(A\) by deleting rows 2 and 4 and deleting columns 2 and 4. If \(A\) is positive definite, show that \(A_{0}\) is positive definite.

9.

If \(A\) is positive definite, show that \\ \(A = CC^{T}\) where \(C\) has orthogonal columns.

10.

If \(A\) is positive definite, show that \(A = C^{2}\) where \(C\) is positive definite. Click the arrow for answer.

Answer.

Let \(P^{T}AP = D = \mbox{diag}(\lambda_{1}, \dots, \lambda_{n})\) where \(P^{T} = P\text{.}\) Since \(A\) is positive definite, each eigenvalue \(\lambda_{i} \gt 0\text{.}\) If \(B = \mbox{diag}(\sqrt{\lambda_{1}}, \dots, \sqrt{\lambda_{n}})\) then \(B^{2} = D\text{,}\) so \(A = PB^{2}P^{T} = (PBP^{T})^{2}\text{.}\) Take \(C = PBP^{T}\text{.}\) Since \(C\) has eigenvalues \(\sqrt{\lambda_{i}} \gt 0\text{,}\) it is positive definite.

11.

Let \(A\) be a positive definite matrix. If \(a\) is a real number, show that \(aA\) is positive definite if and only if \(a \gt 0\text{.}\)

12.

Suppose an invertible matrix \(A\) can be factored in \(\mathbb{M}_{nn}\) as \(A = LDU\) where \(L\) is lower triangular with \(1\)s on the diagonal, \(U\) is upper triangular with \(1\)s on the diagonal, and \(D\) is diagonal with positive diagonal entries. Show that the factorization is unique: If \(A = L_{1}D_{1}U_{1}\) is another such factorization, show that \(L_{1} = L\text{,}\) \(D_{1} = D\text{,}\) and \(U_{1} = U\text{.}\)
Show that a matrix \(A\) is positive definite if and only if \(A\) is symmetric and admits a factorization \(A = LDU\) as in (a).

Answer.

If \(A\) is positive definite, use Theorem 10.7.2 to write \(A = U^{T}U\) where \(U\) is upper triangular with positive diagonal \(D\text{.}\) Then

\begin{equation*} A = (D^{-1}U)^{T}D^{2}(D^{-1}U) \end{equation*}

so \(A = L_{1}D_{1}U_{1}\) is such a factorization if \(U_{1} = D^{-1}U\text{,}\) \(D_{1} = D^{2}\text{,}\) and \(L_{1} = U^T_1\text{.}\)

Conversely, let \(A^{T} = A = LDU\) be such a factorization. Then

\begin{equation*} U^{T}D^{T}L^{T} = A^{T} = A = LDU, \end{equation*}

so \(L = U^{T}\) by \textbf{(a)}. Hence \(A = LDL^{T} = V^{T}V\) where \(V = LD_{0}\) and \(D_{0}\) is diagonal with \(D^2_0 = D\) (the matrix \(D_{0}\) exists because \(D\) has positive diagonal entries). Hence \(A\) is symmetric, and it is positive definite by Example 10.7.4.

13.

Let \(A\) be positive definite and write \(d_{r} = \det{\left(^{(r)}A\right)}\) for each \(r = 1, 2, \dots, n\text{.}\) If \(U\) is the upper triangular matrix obtained in step 1 of the algorithm, show that the diagonal elements \(u_{11}, u_{22}, \dots, u_{nn}\) of \(U\) are given by \(u_{11} = d_{1}\text{,}\) \(u_{jj} = d_{j} / d_{j-1}\) if \(j \gt 1\text{.}\)

Answer.

If \(LA = U\) where \(L\) is lower triangular with \(1\)s on the diagonal, use block multiplication to show that

\begin{equation*} \det{\left(^{(r)}A\right)} = \det{\left(^{(r)}U\right)} \end{equation*}

for each \(r\text{.}\)

Prev Top Next