Before proceeding to explore these properties, we pause to introduce another important quantity, the trace of a matrix: a simple matrix function that we will continue to use throughout the course.
The following theorem lists a number of properties shared by similar matrices.
With all of this abstract machinery in place, let us turn to a concrete example.
The next theorem shows that similarity is preserved under inverses, transposes, and powers:
Subsection 8.2.1 Diagonalizable Matrices and Multiplicity
Recall that a diagonal matrix \(D\) is a matrix containing a zero in every entry except those on the main diagonal. More precisely, if \(d_{ij}\) is the \(ij^{th}\) entry of a diagonal matrix \(D\text{,}\) then \(d_{ij}=0\) unless \(i=j\text{.}\) Such matrices look like the following.
\begin{equation*}
D =
\begin{bmatrix}
* \amp \amp 0 \\
\amp \ddots \amp \\
0 \amp \amp *
\end{bmatrix}
\end{equation*}
where \(*\) is a number which might not be zero. Diagonal matrices have some nice properties, as we demonstrate below.
Exploration 8.2.1.
Let us warm up with a small computation.
Problem 8.2.10.
Let
\begin{equation*}
M =\begin{bmatrix}1 \amp 2 \amp 3\\ 4\amp 5\amp 6\\7\amp 8\amp 9\end{bmatrix} \quad \text{and} \quad D =\begin{bmatrix}2 \amp 0 \amp 0\\ 0\amp -5\amp 0\\0\amp 0\amp 10\end{bmatrix}.
\end{equation*}
Compute \(MD\) and \(DM\text{.}\)
Answer.
\begin{equation*}
MD = \begin{bmatrix} 2 \amp -10 \amp 30\\ 8\amp -25\amp 60\\14\amp -40\amp 90\end{bmatrix}, \quad DM= \begin{bmatrix} 2 \amp 4 \amp 6\\ -20\amp -25\amp -30\\70\amp 80\amp 90\end{bmatrix}.
\end{equation*}
Notice the patterns present in the product matrices. Each row of \(DM\) is the same as its corresponding row of \(M\) multiplied by the scalar which is the corresponding diagonal element of \(D\text{.}\) In the product \(MD\text{,}\) it is the columns of \(M\) that have been multiplied by the diagonal elements. These patterns hold in general for any diagonal matrix, and they are fundamental to understanding diagonalization, the process we discuss below.
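If you would like to check this pattern numerically, the following minimal sketch does so in Python with NumPy (the choice of NumPy is our assumption; any matrix software would work just as well).

import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
d = np.array([2, -5, 10])   # the diagonal entries of D
D = np.diag(d)

print(M @ D)            # MD: each column of M scaled by the matching entry of d
print(M * d)            # the same matrix, built directly by column scaling
print(D @ M)            # DM: each row of M scaled by the matching entry of d
print(M * d[:, None])   # the same matrix, built directly by row scaling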
Definition 8.2.11.
Let \(A\) be an \(n\times n\) matrix. Then \(A\) is said to be diagonalizable if there exists an invertible matrix \(P\) such that
\begin{equation*}
P^{-1}AP=D,
\end{equation*}
where \(D\) is a diagonal matrix. In other words, a matrix \(A\) is diagonalizable if it is similar to a diagonal matrix, \(A \sim D\text{.}\)
If we are given a matrix \(A\) that is diagonalizable, then we can write \(P^{-1}AP=D\) for some matrix \(P\text{,}\) or, equivalently,
\begin{equation}
AP=PD \tag{8.2.1}
\end{equation}
If we pause to examine
(8.2.1), the work that we did in
Exploration 8.2.1 can help us understand how to find a matrix
\(P\) that will diagonalize
\(A\text{.}\) The product
\(PD\) is formed by multiplying each column of
\(P\) by a scalar which is the corresponding element on the diagonal of
\(D\text{.}\) To restate this, if
\(\mathbf{x}_i\) is column
\(i\) in our matrix
\(P\text{,}\) then
(8.2.1) tells us that
\begin{equation}
A \mathbf{x}_i = \lambda_i \mathbf{x}_i, \tag{8.2.2}
\end{equation}
where
\(\lambda_i\) is the
\(i\)th diagonal element of
\(D\text{.}\) Of course,
(8.2.2) is very familiar! We see that if we are able to diagonalize a matrix
\(A\text{,}\) the columns of
\(P\) will be eigenvectors of
\(A\text{,}\) and the diagonal entries of
\(D\) will be the corresponding eigenvalues of
\(A\text{.}\) This is summed up in the following theorem.
Theorem 8.2.12.
An \(n\times n\) matrix \(A\) is diagonalizable if and only if there is an invertible matrix \(P\) given by
\begin{equation*}
P=\begin{bmatrix}
| \amp | \amp \amp | \\
\mathbf{x}_1 \amp \mathbf{x}_2 \amp \cdots \amp \mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix},
\end{equation*}
where the columns \(\mathbf{x}_i\) are eigenvectors of \(A\text{.}\) Moreover, if \(A\) is diagonalizable, the corresponding eigenvalues of \(A\) are the diagonal entries of the diagonal matrix \(D\text{.}\)
Proof.
Suppose \(P\) is given as above as an invertible matrix whose columns are eigenvectors of \(A\text{.}\) To show that \(A\) is diagonalizable, we will show
\begin{equation*}
AP=PD,
\end{equation*}
which is equivalent to \(P^{-1}AP=D\text{.}\) We have
\begin{equation*}
AP=\begin{bmatrix}
| \amp | \amp \amp | \\
A\mathbf{x}_1 \amp A\mathbf{x}_2 \amp \cdots \amp A\mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix},
\end{equation*}
while
\begin{align*}
PD \amp =\begin{bmatrix}
| \amp | \amp \amp | \\
\mathbf{x}_1 \amp \mathbf{x}_2 \amp \cdots \amp \mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix}
\begin{bmatrix}
\lambda _{1} \amp \amp 0 \\
\amp \ddots \amp \\
0 \amp \amp \lambda _{n}
\end{bmatrix} \\
\amp =\begin{bmatrix}
| \amp | \amp \amp | \\
\lambda _{1}\mathbf{x}_1 \amp \lambda _{2}\mathbf{x}_2 \amp \cdots \amp \lambda_{n}\mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix}.
\end{align*}
We can complete this half of the proof by comparing columns, and noting that
\begin{equation}
A \mathbf{x}_i = \lambda_i \mathbf{x}_i \tag{8.2.3}
\end{equation}
for \(i=1,\ldots,n\) since the \(\mathbf{x}_i\) are eigenvectors of \(A\) and the \(\lambda_i\) are corresponding eigenvalues of \(A\text{.}\)
Conversely, suppose \(A\) is diagonalizable so that \(P^{-1}AP=D.\) Let
\begin{equation*}
P=\begin{bmatrix}
| \amp | \amp \amp | \\
\mathbf{x}_1 \amp \mathbf{x}_2 \amp \cdots \amp \mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix}
\end{equation*}
where the columns are the vectors \(\mathbf{x}_i\) and
\begin{equation*}
D=\begin{bmatrix}
\lambda _{1} \amp \amp 0 \\
\amp \ddots \amp \\
0 \amp \amp \lambda _{n}
\end{bmatrix}.
\end{equation*}
Then
\begin{equation*}
AP=PD=\begin{bmatrix}
| \amp | \amp \amp | \\
\mathbf{x}_1 \amp \mathbf{x}_2 \amp \cdots \amp \mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix} \begin{bmatrix}
\lambda _{1} \amp \amp 0 \\
\amp \ddots \amp \\
0 \amp \amp \lambda _{n}
\end{bmatrix}
\end{equation*}
and so
\begin{equation*}
\begin{bmatrix}
| \amp | \amp \amp | \\
A\mathbf{x}_1 \amp A\mathbf{x}_2 \amp \cdots \amp A\mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix} =\begin{bmatrix}
| \amp | \amp \amp | \\
\lambda _{1}\mathbf{x}_1 \amp \lambda _{2}\mathbf{x}_2 \amp \cdots \amp \lambda_{n}\mathbf{x}_n \\
| \amp | \amp \amp |
\end{bmatrix},
\end{equation*}
showing the \(\mathbf{x}_i\) are eigenvectors of \(A\) and the \(\lambda _{i}\) are the corresponding eigenvalues; note that each \(\mathbf{x}_i\) is nonzero because \(P\) is invertible.
Notice that because the matrix \(P\) defined above is invertible, it follows that the set of eigenvectors of \(A\text{,}\) \(\left\{ \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \right\}\text{,}\) is a basis of \(\mathbb{R}^n\text{.}\)
We demonstrate the concept given in the above theorem in the next example. Note that not only are the columns of the matrix \(P\) formed by eigenvectors, but \(P\) must be invertible, and therefore must consist of a linearly independent set of eigenvectors.
Example 8.2.13.
Let
\begin{equation*}
A=\begin{bmatrix}
2 \amp 0 \amp 0 \\
1 \amp 4 \amp -1 \\
-2 \amp -4 \amp 4
\end{bmatrix}
\end{equation*}
Find an invertible matrix \(P\) and a diagonal matrix \(D\) such that \(P^{-1}AP=D\text{.}\)
Answer.
We will use eigenvectors of \(A\) as the columns of \(P\text{,}\) and the corresponding eigenvalues of \(A\) as the diagonal entries of \(D\text{.}\) The eigenvalues of \(A\) are \(\lambda_1 =2,\lambda_2 = 2\text{,}\) and \(\lambda_3 = 6\text{.}\) We leave these computations as exercises, as well as the computations to find a basis for each eigenspace. One possible basis for \(\mathcal{S}_2\text{,}\) the eigenspace corresponding to \(2\text{,}\) is
\begin{equation*}
\left \lbrace
\begin{bmatrix}
-2 \\
1 \\
0
\end{bmatrix},
\begin{bmatrix}
1 \\
0 \\
1
\end{bmatrix}
\right \rbrace,
\end{equation*}
while a basis for \(\mathcal{S}_6\) is given by
\begin{equation*}
\left \lbrace \begin{bmatrix}
0 \\
1 \\
-2
\end{bmatrix}\right \rbrace \text{.}
\end{equation*}
We construct the matrix \(P\) by using these basis elements as columns.
\begin{equation*}
P=\begin{bmatrix}
-2 \amp 1 \amp 0 \\
1 \amp 0 \amp 1 \\
0 \amp 1 \amp -2
\end{bmatrix}
\end{equation*}
You can verify (and will do so in the exercises) that
\begin{equation*}
P^{-1}=\begin{bmatrix}
-1/4 \amp 1/2 \amp 1/4 \\
1/2 \amp 1 \amp 1/2 \\
1/4 \amp 1/2 \amp -1/4
\end{bmatrix}
\end{equation*}
Thus,
\begin{align*}
P^{-1}AP \amp = \begin{bmatrix}
-1/4 \amp 1/2 \amp 1/4 \\
1/2 \amp 1 \amp 1/2 \\
1/4 \amp 1/2 \amp -1/4
\end{bmatrix} \begin{bmatrix}
2 \amp 0 \amp 0 \\
1 \amp 4 \amp -1 \\
-2 \amp -4 \amp 4
\end{bmatrix} \begin{bmatrix}
-2 \amp 1 \amp 0 \\
1 \amp 0 \amp 1 \\
0 \amp 1 \amp -2
\end{bmatrix} \\
\amp =\begin{bmatrix}
2 \amp 0 \amp 0 \\
0 \amp 2 \amp 0 \\
0 \amp 0 \amp 6
\end{bmatrix}
\end{align*}
You can see that the result here is a diagonal matrix whose entries on the main diagonal are the eigenvalues of \(A\text{.}\) Notice that the eigenvalues on the main diagonal appear in the same order as the corresponding eigenvectors in \(P\text{.}\)
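As a quick numerical sanity check of the example above, here is a minimal sketch in Python with NumPy (again, our assumption; any matrix software would do).

import numpy as np

A = np.array([[ 2,  0,  0],
              [ 1,  4, -1],
              [-2, -4,  4]])
P = np.array([[-2,  1,  0],
              [ 1,  0,  1],
              [ 0,  1, -2]])

# P^{-1} A P should be the diagonal matrix with entries 2, 2, 6 (up to rounding).
print(np.round(np.linalg.inv(P) @ A @ P))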
It is often easier to work with matrices that are diagonalizable, as the next Exploration demonstrates.
Exploration 8.2.2.
Let
\begin{equation*}
A=\begin{bmatrix}
2 \amp 0 \amp 0 \\
1 \amp 4 \amp -1 \\
-2 \amp -4 \amp 4
\end{bmatrix} \quad \text{and} \quad D=\begin{bmatrix}
2 \amp 0 \amp 0 \\
0 \amp 2 \amp 0 \\
0 \amp 0 \amp 6
\end{bmatrix}.
\end{equation*}
Would it be easier to compute \(A^5\) or \(D^5\) if you had to do so by hand, without a computer? Certainly \(D^5\) is easier, due to the number of zero entries!
Problem 8.2.14.
Compute \(A^5\) together with \(D^5\text{.}\) Feel free to use a program or an online calculator for \(A^5\text{,}\) but compute \(D^5\) by hand.
We see that raising a diagonal matrix to a power amounts to simply raising each diagonal entry to that power, whereas computing
\(A^5\) requires many more calculations. However, we learned in
Example 8.2.13 that
\(A\) is similar to
\(D\text{,}\) and we can use this to make our computation easier. This is because
\begin{align*}
A^5\amp =\left(PDP^{-1}\right)^5 \\
\amp =(PDP^{-1})(PDP^{-1})(PDP^{-1})(PDP^{-1})(PDP^{-1}) \\
\amp =PD(P^{-1}P)D(P^{-1}P)D(P^{-1}P)D(P^{-1}P)DP^{-1} \\
\amp =PD(I)D(I)D(I)D(I)DP^{-1} \\
\amp =PD^5P^{-1}.
\end{align*}
With this in mind, it is not as daunting to calculate
\(A^5\) by hand. We can compute the product
\(PD^5\) quite easily since
\(D^5\) is diagonal, as we learned in
Exploration 8.2.1. That leaves just one product of
\(3 \times 3\) matrices to compute by hand in order to obtain
\(A^5\text{.}\) The savings in work would certainly be more pronounced for larger matrices or for powers larger than 5.
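A minimal sketch (assuming NumPy) that confirms \(A^5 = PD^5P^{-1}\) for the matrices above:

import numpy as np

A = np.array([[ 2,  0,  0],
              [ 1,  4, -1],
              [-2, -4,  4]])
P = np.array([[-2,  1,  0],
              [ 1,  0,  1],
              [ 0,  1, -2]])
D5 = np.diag([2**5, 2**5, 6**5])        # D^5, computed entrywise on the diagonal

direct = np.linalg.matrix_power(A, 5)   # A^5 computed directly
via_D  = P @ D5 @ np.linalg.inv(P)      # A^5 computed as P D^5 P^{-1}
print(np.allclose(direct, via_D))       # expected: True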
In
Exploration 8.2.2, because matrix
\(A\) was diagonalizable, we were able to cut down on computations. When we chose to work with
\(D\) and
\(P\) instead of
\(A\) we worked with the eigenvalues and eigenvectors of
\(A\text{.}\) Each column of
\(P\) is an eigenvector of
\(A\text{,}\) and so we repeatedly made use of the following theorem (with
\(m=5\)).
Theorem 8.2.15.
Let \(A\) be an \(n \times n\) matrix and suppose \(A\mathbf{x}=\lambda \mathbf{x}\text{.}\) Then
\begin{equation*}
A^m \mathbf{x} = \lambda^m \mathbf{x}.
\end{equation*}
Proof.
We prove this theorem by induction on \(m\text{.}\) Clearly \(A^m \mathbf{x} = \lambda^m \mathbf{x}\) holds when \(m=1\text{,}\) as that was given. For the inductive step, suppose that we know \(A^{m-1} \mathbf{x} = \lambda^{m-1} \mathbf{x}\text{.}\) Then
\begin{align*}
A^m \mathbf{x} \amp = (A A^{m-1}) \mathbf{x} \\
\amp = A (A^{m-1} \mathbf{x}) \\
\amp = A (\lambda^{m-1} \mathbf{x}) \\
\amp = \lambda^{m-1} A\mathbf{x} \\
\amp = \lambda^{m-1} \lambda \mathbf{x} \\
\amp = \lambda^m \mathbf{x},
\end{align*}
as desired.
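For a concrete check of this theorem (a sketch, assuming NumPy), take the matrix \(A\) and the \(6\)-eigenvector from Example 8.2.13 with \(m=3\):

import numpy as np

A = np.array([[ 2,  0,  0],
              [ 1,  4, -1],
              [-2, -4,  4]])
x = np.array([0, 1, -2])   # an eigenvector of A with eigenvalue 6

print(np.linalg.matrix_power(A, 3) @ x)   # A^3 x
print(6**3 * x)                           # lambda^3 x; the two outputs agree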
Matrix
\(A\) from
Example 8.2.13 and
Exploration 8.2.2 had a repeated eigenvalue of 2. The next theorem and corollary show that matrices which have
distinct eigenvalues (where none are repeated) have desirable properties.
Theorem 8.2.16.
Let \(A\) be an \(n\times n\) matrix, and suppose that \(A\) has distinct eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_m\text{.}\) For each \(i\text{,}\) let \(\mathbf{x}_i\) be a \(\lambda_i\)-eigenvector of \(A\text{.}\) Then \(\{ \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m\}\) is linearly independent.
Proof.
We prove this by induction on \(m\text{,}\) the number of vectors in the set. If \(m = 1\text{,}\) then \(\{\mathbf{x}_{1}\}\) is a linearly independent set because \(\mathbf{x}_{1} \neq \mathbf{0}\text{.}\) In general, suppose we have established that the theorem is true for some \(m \geq 1\text{.}\) Given eigenvectors \(\{\mathbf{x}_{1}, \mathbf{x}_{2}, \dots, \mathbf{x}_{m+1}\}\text{,}\) suppose
\begin{equation}
c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \dots + c_{m+1}\mathbf{x}_{m+1} = \mathbf{0}.\tag{8.2.4}
\end{equation}
We must show that each
\(c_{i} = 0\text{.}\) Multiply both sides of
(8.2.4) on the left by
\(A\) and use the fact that
\(A\mathbf{x}_{i} = \lambda_{i}\mathbf{x}_{i}\) to get
\begin{equation}
c_1\lambda_1\mathbf{x}_1 + c_2\lambda_2\mathbf{x}_2 + \dots + c_{m+1}\lambda_{m+1}\mathbf{x}_{m+1} = \mathbf{0}.\tag{8.2.5}
\end{equation}
If we multiply
(8.2.4) by
\(\lambda_{1}\) and subtract the result from
(8.2.5), the first terms cancel and we obtain
\begin{equation*}
c_2(\lambda_2 - \lambda_1)\mathbf{x}_2 + c_3(\lambda_3 - \lambda_1)\mathbf{x}_3 + \dots + c_{m+1}(\lambda_{m+1} - \lambda_1)\mathbf{x}_{m+1} = \mathbf{0}.
\end{equation*}
Since \(\mathbf{x}_{2}, \mathbf{x}_{3}, \dots, \mathbf{x}_{m+1}\) correspond to distinct eigenvalues \(\lambda_{2}, \lambda_{3}, \dots, \lambda_{m+1}\text{,}\) the set \(\{\mathbf{x}_{2}, \mathbf{x}_{3}, \dots, \mathbf{x}_{m+1}\}\) is linearly independent by the induction hypothesis. Hence,
\begin{equation*}
c_2(\lambda_2 - \lambda_1) = 0, \quad c_3(\lambda_3 - \lambda_1) = 0, \quad \dots, \quad c_{m+1}(\lambda_{m+1} - \lambda_1) = 0
\end{equation*}
and so
\(c_{2} = c_{3} = \dots = c_{m+1} = 0\) because the
\(\lambda_{i}\) are distinct. It follows that
(8.2.4) becomes
\(c_{1}\mathbf{x}_{1} = \mathbf{0}\text{,}\) which implies that
\(c_{1} = 0\) because
\(\mathbf{x}_{1} \neq \mathbf{0}\text{,}\) and the proof is complete.
The corollary that follows from this theorem gives a useful tool in determining if \(A\) is diagonalizable.
Corollary 8.2.17.
Let \(A\) be an \(n \times n\) matrix and suppose it has \(n\) distinct eigenvalues. Then it follows that \(A\) is diagonalizable.
Remark 8.2.18.
Note that
Corollary 8.2.17 is NOT an ``if and only if'' statement. This means that if
\(A\) has repeated eigenvalues it is still sometimes possible to diagonalize
\(A\text{,}\) as seen in
Example 8.2.13.
Definition 8.2.19.
If we are able to diagonalize \(A\text{,}\) say \(A=PDP^{-1}\text{,}\) we say that \(PDP^{-1}\) is an eigenvalue decomposition of \(A\text{.}\)
Not every matrix has an eigenvalue decomposition. Sometimes we cannot find an invertible matrix \(P\) such that \(P^{-1}AP=D\text{.}\) Consider the following example.
Example 8.2.20.
Let
\begin{equation*}
A =
\begin{bmatrix}
1 \amp 1 \\
0 \amp 1
\end{bmatrix}.
\end{equation*}
If possible, find an invertible matrix \(P\) and a diagonal matrix \(D\) so that \(P^{-1}AP=D\text{.}\)
Answer.
We see immediately (how?) that the eigenvalues of \(A\) are \(\lambda_1 =1\) and \(\lambda_2=1\text{.}\) To find \(P\text{,}\) the next step would be to find a basis for the corresponding eigenspace \(\mathcal{S}_1\text{.}\) We solve the equation \(\left( A - \lambda I \right) \mathbf{x} = \mathbf{0}\text{.}\) Writing this equation as an augmented matrix, we already have a matrix in row echelon form:
\begin{equation*}
\left[\begin{array}{cc|c}
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 0
\end{array}\right]
\end{equation*}
We see that the eigenvectors in \(\mathcal{S}_1\) are of the form
\begin{equation*}
\begin{bmatrix}
t \\
0
\end{bmatrix}
=t\begin{bmatrix}
1 \\
0
\end{bmatrix},
\end{equation*}
so a basis for the eigenspace \(\mathcal{S}_1\) is given by \(\left\{[1,0]\right\}\text{.}\) It is easy to see that we cannot form an invertible matrix \(P\text{:}\) any two eigenvectors will be of the form \([t,0]\text{,}\) so the second row of \(P\) would be a row of zeros, and \(P\) could not be invertible. Hence \(A\) cannot be diagonalized.
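A computer algebra system reports this failure directly. The sketch below uses Python with SymPy (our assumption); it shows that the eigenspace of \(\lambda=1\) is one-dimensional, so no invertible \(P\) of eigenvectors exists.

import sympy as sp

A = sp.Matrix([[1, 1],
               [0, 1]])

# eigenvects() returns (eigenvalue, algebraic multiplicity, eigenspace basis).
print(A.eigenvects())          # [(1, 2, [Matrix([[1], [0]])])]: only one basis vector
print(A.is_diagonalizable())   # False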
We saw earlier in
Corollary 8.2.17 that an
\(n \times n\) matrix with
\(n\) distinct eigenvalues is diagonalizable. It turns out that there are other useful diagonalizability tests.
Recall that the algebraic multiplicity of an eigenvalue \(\lambda\) is the number of times that it occurs as a root of the characteristic polynomial.
Definition 8.2.21.
The geometric multiplicity of an eigenvalue \(\lambda\) is the dimension of the corresponding eigenspace \(\mathcal{S}_\lambda\text{.}\)
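In computations, the geometric multiplicity of \(\lambda\) is the dimension of the null space of \(A-\lambda I\text{.}\) Here is a minimal sketch (assuming SymPy) applied to the matrix of Example 8.2.13 with \(\lambda = 2\):

import sympy as sp

A = sp.Matrix([[ 2,  0,  0],
               [ 1,  4, -1],
               [-2, -4,  4]])
lam = 2

basis = (A - lam * sp.eye(3)).nullspace()   # a basis of the eigenspace S_2
print(len(basis))                           # geometric multiplicity: expected 2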
Consider now the following lemma.
Lemma 8.2.22.
Let \(A\) be an \(n\times n\) matrix, and let \(\mathcal{S}_{\lambda_1}\) be the eigenspace corresponding to the eigenvalue \(\lambda_1\) which has algebraic multiplicity \(m\text{.}\) Then
\begin{equation*}
\mbox{dim}(\mathcal{S}_{\lambda_1})\leq m.
\end{equation*}
In other words, the geometric multiplicity of an eigenvalue is less than or equal to the algebraic multiplicity of that same eigenvalue.
Proof.
Let \(k\) be the geometric multiplicity of \(\lambda_1\text{,}\) i.e., \(k=\mbox{dim}(\mathcal{S}_{\lambda_1})\text{.}\) Suppose \(\left\{\mathbf{x}_1, \mathbf{x}_2, \ldots ,\mathbf{x}_k\right\}\) is a basis for the eigenspace \(\mathcal{S}_{\lambda_1}\text{.}\) Let \(P\) be any invertible matrix having \(\mathbf{x}_1, \mathbf{x}_2, \ldots ,\mathbf{x}_k\) as its first \(k\) columns, say
\begin{equation*}
P=\begin{bmatrix}
| \amp | \amp \amp | \amp | \amp \amp | \\
\mathbf{x}_1 \amp \mathbf{x}_2 \amp \cdots \amp \mathbf{x}_k \amp \mathbf{x}_{k+1} \amp \cdots \amp \mathbf{x}_n \\
| \amp | \amp \amp | \amp | \amp \amp |
\end{bmatrix}.
\end{equation*}
In block form we may write
\begin{equation*}
P=\begin{bmatrix}
B\amp C
\end{bmatrix} \quad \text{and} \quad P^{-1}=\begin{bmatrix}
D \\
E
\end{bmatrix},
\end{equation*}
where \(B\) is \(n \times k\text{,}\) \(C\) is \(n \times (n-k)\text{,}\) \(D\) is \(k \times n\text{,}\) and \(E\) is \((n-k) \times n\text{.}\) We observe
\begin{equation*}
I_n = P^{-1}P = \left[\begin{array}{c|c}
DB \amp DC \\ \hline
EB \amp EC
\end{array}\right]\text{.}
\end{equation*}
This implies
\begin{equation*}
DB = I_k,\quad DC=O_{k\times (n-k)},\quad EB = O_{(n-k)\times k}, \quad\text{ and }\quad EC = I_{n-k}.
\end{equation*}
Therefore,
\begin{align*}
P^{-1}AP \amp =\begin{bmatrix}
D \\
E
\end{bmatrix}
A
\begin{bmatrix}
B\amp C
\end{bmatrix} =
\left[\begin{array}{c|c}
DAB \amp DAC \\ \hline
EAB \amp EAC
\end{array}\right ] \\
\amp = \left[\begin{array}{c|c}
\lambda_1 DB \amp DAC \\ \hline
\lambda_1 EB \amp EAC
\end{array}\right]
= \left[\begin{array}{c|c}
\lambda_1 I_k \amp DAC \\ \hline
O \amp EAC
\end{array}\right].
\end{align*}
We finish the proof by comparing the characteristic polynomials on both sides of this equation, and making use of the fact that similar matrices have the same characteristic polynomials.
\begin{equation*}
\det(A-\lambda I) = \det(P^{-1}AP-\lambda I)=(\lambda_1 - \lambda)^k \det(EAC - \lambda I_{n-k}).
\end{equation*}
We see that the characteristic polynomial of \(A\) has \((\lambda_1 - \lambda)^k\) as a factor. This tells us that the algebraic multiplicity of \(\lambda_1\) is at least \(k\text{,}\) proving the desired inequality.
This result tells us that if \(\lambda\) is an eigenvalue of \(A\text{,}\) then the number of linearly independent \(\lambda\)-eigenvectors is never more than the multiplicity of \(\lambda\text{.}\) We now use this fact to provide a useful diagonalizability condition.
Theorem 8.2.23.
Let \(A\) be an \(n \times n\) matrix. Then \(A\) is diagonalizable if and only if for each eigenvalue \(\lambda\) of \(A\text{,}\) the algebraic multiplicity of \(\lambda\) is equal to the geometric multiplicity of \(\lambda\text{.}\)
Proof.
Suppose
\(A\) is diagonalizable and let
\(\lambda_1, \ldots, \lambda_t\) be the distinct eigenvalues of
\(A\text{,}\) with algebraic multiplicities
\(m_1, \ldots, m_t\text{,}\) respectively and geometric multiplicities
\(k_1, \ldots, k_t\text{,}\) respectively. Since
\(A\) is diagonalizable,
Theorem 8.2.12 implies that
\(k_1+\cdots+k_t=n\text{.}\) By applying
Lemma 8.2.22 \(t\) times, we have
\begin{equation*}
n = k_1+\cdots+k_t \le m_1+\cdots+m_t = n,
\end{equation*}
which is only possible if
\(k_i=m_i\) for
\(i=1,\ldots,t\text{.}\) Conversely, if the geometric multiplicity equals the algebraic multiplicity of each eigenvalue, then obtaining a basis for each eigenspace yields
\(n\) eigenvectors. Applying
Theorem 8.2.16, we know that these
\(n\) eigenvectors are linearly independent, so
Theorem 8.2.12 implies that
\(A\) is diagonalizable.
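To tie this criterion back to the earlier examples, the following sketch (assuming SymPy) compares algebraic and geometric multiplicities for the matrix of Example 8.2.13 and the matrix of Example 8.2.20.

import sympy as sp

def multiplicities_match(A):
    # Theorem 8.2.23: A is diagonalizable exactly when every eigenvalue
    # has equal algebraic and geometric multiplicity.
    return all(alg == len(basis) for _, alg, basis in A.eigenvects())

A1 = sp.Matrix([[2, 0, 0], [1, 4, -1], [-2, -4, 4]])   # Example 8.2.13
A2 = sp.Matrix([[1, 1], [0, 1]])                        # Example 8.2.20

print(multiplicities_match(A1))   # True: A1 is diagonalizable
print(multiplicities_match(A2))   # False: A2 is not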