A matrix is a rectangular array of numbers. The plural form of matrix is matrices (not matrixes). You have encountered matrices before in the context of augmented matrices and coefficient matrices associated with linear systems.
Consider the matrix
\begin{equation*}
M=\begin{bmatrix}
1 \amp 2 \amp 3 \amp 4 \\
5 \amp 2 \amp 8 \amp 7 \\
6 \amp -9 \amp 1 \amp 2
\end{bmatrix}.
\end{equation*}
The dimension of a matrix is defined as \(m\times n\) where \(m\) is the number of rows and \(n\) is the number of columns. The above matrix is a \(3\times 4\) matrix because there are three rows and four columns.
A column vector in \(\R^n\) is an \(n\times 1\) matrix. A row vector in \(\R^n\) is a \(1\times n\) matrix.
The individual entries in the matrix are identified according to their position. The \(( i, j)\)-entry of a matrix is the entry in the \(i^{th}\) row and \(j^{th}\) column. For example, in matrix \(M\) above, \(8\) is called the \((2,3)\)-entry because it is in the second row and the third column.
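For readers who like to experiment, the indexing convention can be mirrored in a short Python sketch (illustrative only; the list-of-rows representation and the variable names are our own, not part of the text). Note that Python indexes from 0, so the text's 1-based \((i,j)\)-entry corresponds to M[i-1][j-1].

```python
# The matrix M from the text, stored as a list of rows (an illustrative
# representation).  Python indexes from 0, so the (i, j)-entry in the
# text's 1-based convention is M[i - 1][j - 1].
M = [
    [1,  2, 3, 4],
    [5,  2, 8, 7],
    [6, -9, 1, 2],
]

m, n = len(M), len(M[0])      # dimension: m rows by n columns
entry_2_3 = M[2 - 1][3 - 1]   # the (2,3)-entry: second row, third column

print(m, n)       # 3 4
print(entry_2_3)  # 8
```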
We denote the entry in the \(i^{th}\) row and the \(j^{th}\) column of matrix \(A\) by \(a_{ij}\text{,}\) and write \(A\) in terms of its entries as
\begin{equation*}
A= \begin{bmatrix} a_{ij} \end{bmatrix}=\begin{bmatrix}
a_{11} \amp a_{12}\amp \dots\amp a_{1j}\amp \dots\amp a_{1n}\\
a_{21}\amp a_{22} \amp \dots\amp a_{2j}\amp \dots \amp a_{2n}\\
\vdots \amp \vdots\amp \amp \vdots\amp \amp \vdots\\
a_{i1}\amp a_{i2}\amp \dots \amp a_{ij}\amp \dots \amp a_{in}\\
\vdots \amp \vdots\amp \amp \vdots\amp \amp \vdots\\
a_{m1}\amp a_{m2}\amp \dots \amp a_{mj}\amp \dots \amp a_{mn}
\end{bmatrix}.
\end{equation*}
Occasionally it will be convenient to talk about columns and rows of a matrix \(A\) as vectors. We will use the following notation:
\begin{equation*}
A=\begin{bmatrix}|\amp |\amp \amp |\\\mathbf{c}_1\amp \mathbf{c}_2 \amp \ldots \amp \mathbf{c}_n\\|\amp |\amp \amp |\end{bmatrix}\quad\text{or}\quad A=\begin{bmatrix}\mathbf{c}_1\amp \mathbf{c}_2 \amp \ldots \amp \mathbf{c}_n\end{bmatrix}
\end{equation*}
\begin{equation*}
A=\begin{bmatrix}
- \amp \mathbf{r}_1 \amp - \\ - \amp \mathbf{r}_2 \amp - \\ \amp \vdots \amp \\ - \amp \mathbf{r}_m \amp -
\end{bmatrix}\quad\text{or}\quad A=\begin{bmatrix}\mathbf{r}_1\\\mathbf{r}_2\\\vdots\\\mathbf{r}_m\end{bmatrix}.
\end{equation*}
A matrix is called a square matrix if it has the same number of rows and columns. If \(B=\begin{bmatrix}b_{ij}\end{bmatrix}\) is an \(n \times n\) square matrix, the entries of the form \(b_{ii}\) are said to lie on the main diagonal. For example, if
\begin{equation*}
B=\begin{bmatrix}1\amp 2\amp 3\\4\amp 5\amp 6\\7\amp 4\amp 9\end{bmatrix},
\end{equation*}
then the main diagonal consists of entries \(b_{11}=1\text{,}\) \(b_{22}=5\) and \(b_{33}=9\text{.}\)
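The main diagonal can likewise be read off programmatically. Below is a minimal Python sketch (our own illustration, using the matrix \(B\) above) that collects the entries \(b_{ii}\) of a square matrix.

```python
# The square matrix B from the text, as a list of rows.
B = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 4, 9],
]

# The main diagonal consists of the entries b_ii, i.e. B[i][i] in 0-based terms.
diagonal = [B[i][i] for i in range(len(B))]

print(diagonal)  # [1, 5, 9]
```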
There are various operations which are done on matrices of appropriate sizes. Matrices can be added to and subtracted from other matrices, multiplied by a scalar, and multiplied by other matrices. We will never divide a matrix by another matrix, but we will see later how multiplication by a matrix inverse (if an inverse exists) plays a similar role to division.
In doing arithmetic with matrices, we often define the action by what happens in terms of the entries (or components) of the matrices. Before looking at these operations in depth, consider a few general definitions.
Definition 3.1.1. The Zero Matrix.
The \(m\times n\) zero matrix is the \(m\times n\) matrix having every entry equal to zero. The zero matrix is denoted by \(O\text{.}\)
Definition 3.1.2. Equality of Matrices.
Let \(A=\begin{bmatrix} a_{ij}\end{bmatrix}\) and \(B=\begin{bmatrix} b_{ij}\end{bmatrix}\) be two \(m \times n\) matrices. Then \(A=B\) means that \(a_{ij}=b_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)
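The definition of equality translates directly into a check on dimensions and entries. The following Python sketch (an illustration of the definition; the function name is our own) compares two matrices stored as lists of rows.

```python
def matrices_equal(A, B):
    """Return True when A and B have the same dimensions and all
    corresponding entries are equal (Definition 3.1.2)."""
    if len(A) != len(B):
        return False
    if any(len(row_a) != len(row_b) for row_a, row_b in zip(A, B)):
        return False
    return all(a == b
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

print(matrices_equal([[1, 2], [3, 4]], [[1, 2], [3, 4]]))  # True
print(matrices_equal([[1, 2], [3, 4]], [[1, 2], [3, 5]]))  # False
```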
Subsection 3.1.1 Addition and Scalar Multiplication of Matrices
Given two matrices of the same dimensions, we can add them together by adding their corresponding entries.
Definition 3.1.3. Addition of Matrices.
Let \(A=\begin{bmatrix} a_{ij}\end{bmatrix} \) and \(B=\begin{bmatrix} b_{ij}\end{bmatrix}\) be two \(m\times n\) matrices. Then the sum of matrices \(A\) and \(B\text{,}\) denoted by \(A+B\text{,}\) is an \(m \times n\) matrix given by
\begin{equation*}
A+B=\begin{bmatrix}a_{ij}+b_{ij}\end{bmatrix}.
\end{equation*}
An example might help unravel the formal definition.
Example 3.1.4.
Find the sum of \(A\) and \(B\text{,}\) if possible.
\begin{equation*}
A = \begin{bmatrix}
1 \amp 2 \amp 3 \\
1 \amp 0 \amp 4
\end{bmatrix},
B = \begin{bmatrix}
5 \amp 2 \amp 3 \\
-6 \amp 2 \amp 1
\end{bmatrix}.
\end{equation*}
Answer.
Notice that both \(A\) and \(B\) are of size \(2 \times 3\text{.}\) Since \(A\) and \(B\) are of the same size, addition is possible.
\begin{align*}
A + B \amp = \begin{bmatrix}
1 \amp 2 \amp 3 \\
1 \amp 0 \amp 4
\end{bmatrix}
+
\begin{bmatrix}
5 \amp 2 \amp 3 \\
-6 \amp 2 \amp 1
\end{bmatrix} \\
\amp =
\begin{bmatrix}
1+5 \amp 2+2 \amp 3+3 \\
1+(-6) \amp 0+2 \amp 4+1
\end{bmatrix} \\
\amp =
\begin{bmatrix}
6 \amp 4 \amp 6 \\
-5 \amp 2 \amp 5
\end{bmatrix}.
\end{align*}
Going forward, whenever we write \(A+B\) it will be assumed that the two matrices are of equal size and addition is possible.
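The entrywise recipe in Definition 3.1.3 can be sketched in a few lines of Python (an illustration only; the helper name is our own). The sketch reproduces Example 3.1.4.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same dimensions
    (Definition 3.1.3): the (i, j)-entry of A+B is a_ij + b_ij."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

# The matrices of Example 3.1.4.
A = [[1, 2, 3], [1, 0, 4]]
B = [[5, 2, 3], [-6, 2, 1]]

print(mat_add(A, B))  # [[6, 4, 6], [-5, 2, 5]]
```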
Theorem 3.1.5. Properties of Matrix Addition.
Let \(A,B\) and \(C\) be matrices (of the same size). Then the following properties hold.
Commutative Law of Addition:
\begin{equation*}
A+B=B+A
\end{equation*}
Associative Law of Addition:
\begin{equation*}
\left( A+B\right) +C=A+\left( B+C\right)
\end{equation*}
Additive Identity: There exists a zero matrix such that
\begin{equation*}
A+O=A
\end{equation*}
Additive Inverse: There exists a matrix, \(-A\text{,}\) such that
\begin{equation*}
A+\left( -A\right) =O
\end{equation*}
Proof.
We will prove Items 1 and 4. The remaining properties are left as exercises.
[Proof of Item 1:] The \((i,j)\)-entry of \(A+B\) is given by
\begin{equation*}
a_{ij}+b_{ij}.
\end{equation*}
The \((i,j)\)-entry of \(B+A\) is given by
\begin{equation*}
b_{ij}+a_{ij}.
\end{equation*}
Since \(a_{ij}+b_{ij}=b_{ij}+a_{ij}\text{,}\) for all \(i\text{,}\) \(j\text{,}\) we conclude that \(A+B=B+A\text{.}\)
[Proof of Item 4:] Let \(-A\) be defined by
\begin{equation*}
-A=\begin{bmatrix}-a_{ij}\end{bmatrix}.
\end{equation*}
Then \(A+(-A)=O\text{.}\)
When a matrix is multiplied by a scalar, the new matrix is obtained by multiplying every entry of the original matrix by the given scalar.
Definition 3.1.6. Scalar Multiplication of Matrices.
If \(A=\begin{bmatrix} a_{ij}\end{bmatrix} \) and \(k\) is a scalar, then \(kA=\begin{bmatrix} ka_{ij}\end{bmatrix}\text{.}\)
A worked example is given below.
Example 3.1.7.
Find \(7A\) if
\begin{equation*}
A=\begin{bmatrix}
2 \amp 0 \\
1 \amp -4
\end{bmatrix}.
\end{equation*}
Answer.
By Definition 3.1.6, we multiply each entry of \(A\) by \(7\text{.}\) Therefore,
\begin{equation*}
7A =
7\begin{bmatrix}
2 \amp 0 \\
1 \amp -4
\end{bmatrix} =
\begin{bmatrix}
7(2) \amp 7(0) \\
7(1) \amp 7(-4)
\end{bmatrix} =
\begin{bmatrix}
14 \amp 0 \\
7 \amp -28
\end{bmatrix}.
\end{equation*}
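Scalar multiplication is just as easy to express entrywise in code. The Python sketch below (illustrative only; the helper name is our own) reproduces Example 3.1.7.

```python
def scalar_mult(k, A):
    """Multiply every entry of A by the scalar k (Definition 3.1.6)."""
    return [[k * a for a in row] for row in A]

# The matrix of Example 3.1.7.
A = [[2, 0], [1, -4]]

print(scalar_mult(7, A))  # [[14, 0], [7, -28]]
```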
Theorem 3.1.8. Properties of Scalar Multiplication.
Let \(A, B\) be matrices (of the same size), and \(k, p\) be scalars. Then scalar multiplication has the following properties.
Distributive Law over Matrix Addition:
\begin{equation*}
k \left( A+B\right) =k A+ kB
\end{equation*}
Distributive Law over Scalar Addition:
\begin{equation*}
\left( k +p \right) A= k A+p A
\end{equation*}
Associative Law for Scalar Multiplication:
\begin{equation*}
k \left( p A\right) = \left( k p \right) A
\end{equation*}
Multiplication by \(1\text{:}\)
\begin{equation*}
1A=A
\end{equation*}
The proof of this theorem is similar to the proof of
Theorem 3.1.5 and is left as an exercise.
Subsection 3.1.3 Transpose of a Matrix
Another important operation on matrices is that of taking the transpose. For a matrix \(A\text{,}\) we denote the transpose of \(A\) by \(A^T\text{.}\) Before formally defining the transpose, we explore this operation on the following matrix.
\begin{equation*}
\begin{bmatrix}
1 \amp 4 \\
3 \amp 1 \\
2 \amp 6
\end{bmatrix}^{T}=
\begin{bmatrix}
1 \amp 3 \amp 2 \\
4 \amp 1 \amp 6
\end{bmatrix}
\end{equation*}
What happened? The first column became the first row and the second column became the second row. Thus the \(3\times 2\) matrix became a \(2\times 3\) matrix. The number \(4\) was in the first row and the second column and it ended up in the second row and first column.
The definition of the transpose is as follows.
Definition 3.1.15. The Transpose of a Matrix.
Let \(A=\begin{bmatrix} a _{ij}\end{bmatrix}\) be an \(m\times n\) matrix. Then the transpose of \(A\), denoted by \(A^{T}\text{,}\) is the \(n\times m\) matrix given by
\begin{equation*}
A^{T} = \begin{bmatrix} a _{ij}\end{bmatrix}^{T}= \begin{bmatrix} a_{ji} \end{bmatrix}.
\end{equation*}
The \(( i, j)\)-entry of \(A\) becomes the \(( j,i)\)-entry of \(A^T\text{.}\)
Here is a short exercise to warm you up to the transpose matrix.
Problem 3.1.16.
Calculate \(A^T\) for the following matrix
\begin{equation*}
A =
\begin{bmatrix}
1 \amp 2 \amp -6 \\
3 \amp 5 \amp 4
\end{bmatrix}.
\end{equation*}
Answer.
\begin{equation*}
A^T =
\begin{bmatrix}
1 \amp 3 \\
2 \amp 5 \\
-6 \amp 4
\end{bmatrix}.
\end{equation*}
Note that \(A\) is a \(2 \times 3\) matrix, while \(A^T\) is a \(3 \times 2\) matrix. The columns of \(A\) are the rows of \(A^T\text{,}\) and the rows of \(A\) are the columns of \(A^T\text{.}\)
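The row-column swap is a one-liner in code. The following Python sketch (our own illustration) implements the transpose for a matrix stored as a list of rows and reproduces Problem 3.1.16.

```python
def transpose(A):
    """Swap rows and columns: the (i, j)-entry of A becomes the
    (j, i)-entry of the transpose (Definition 3.1.15)."""
    return [list(row) for row in zip(*A)]

# The matrix of Problem 3.1.16.
A = [[1, 2, -6], [3, 5, 4]]

print(transpose(A))  # [[1, 3], [2, 5], [-6, 4]]
```

Transposing twice returns the original matrix, in line with Item 1 of the theorem below.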
Theorem 3.1.17. Properties of the Transpose of a Matrix.
Let \(A\) be an \(m\times n\) matrix, \(B\) an \(n\times p\) matrix, and \(k\) a scalar. Then
\(\displaystyle \left(A^{T}\right)^{T} = A\)
\(\displaystyle \left( AB\right) ^{T}=B^{T}A^{T} \)
\(\displaystyle \left( A+ B\right) ^{T}=A^{T}+ B^{T}\)
\(\displaystyle \left(kA\right)^T=kA^T\)
We will prove Item 2. The remaining properties are left as exercises.
Proof.
[Proof of Item 2:] Note that \(A\) and \(B\) have compatible dimensions, so that \(AB\) is defined and has dimensions \(m\times p\text{.}\) Thus, \((AB)^T\) has dimensions \(p\times m\text{.}\) On the right side of the equality, \(A^T\) has dimensions \(n\times m\text{,}\) and \(B^T\) has dimensions \(p\times n\text{.}\) Therefore \(B^TA^T\) is defined and has dimensions \(p\times m\text{.}\)
Now we know that \((AB)^T\) and \(B^TA^T\) have the same dimensions.
To show that \((AB)^T=B^TA^T\) we need to show that their corresponding entries are equal. Recall that the \((i,j)\)-entry of \(AB\) is given by the dot product of the \(i^{th}\) row of \(A\) and the \(j^{th}\) column of \(B\text{.}\) The same dot product is also the \((j,i)\)-entry of \((AB)^T\text{.}\)
The \((j,i)\)-entry of \(B^TA^T\) is given by the dot product of the \(j^{th}\) row of \(B^T\) and the \(i^{th}\) column of \(A^T\text{.}\) But the \(j^{th}\) row of \(B^T\) has the same entries as the \(j^{th}\) column of \(B\text{,}\) and the \(i^{th}\) column of \(A^T\) has the same entries as the \(i^{th}\) row of \(A\text{.}\) Therefore the \((j,i)\)-entry of \(B^TA^T\) is also equal to the \((i,j)\)-entry of \(AB\text{.}\)
Thus, the corresponding entries of \((AB)^T\) and \(B^TA^T\) are equal, and we conclude that \((AB)^T=B^TA^T\text{.}\)
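The identity \((AB)^T=B^TA^T\) can also be checked numerically. The Python sketch below (our own illustration; the helper names and the test matrices are not from the text) uses the row-by-column dot-product description of matrix multiplication.

```python
def mat_mult(A, B):
    """The (i, j)-entry of AB is the dot product of row i of A with
    column j of B (the row-by-column rule used in the proof above)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Swap rows and columns."""
    return [list(row) for row in zip(*A)]

# Arbitrary matrices of compatible dimensions (2 x 3 and 3 x 2).
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]

lhs = transpose(mat_mult(A, B))                # (AB)^T
rhs = mat_mult(transpose(B), transpose(A))     # B^T A^T

print(lhs == rhs)  # True
```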
The transpose of a matrix is related to other important topics. Consider the following definition.
Definition 3.1.18. Symmetric and Skew Symmetric Matrices.
An \(n\times n\) matrix \(A\) is said to be symmetric if \(A=A^{T}.\) It is said to be skew symmetric if \(A=-A^{T}.\)
We will explore these definitions in the following examples.
Example 3.1.19.
Let
\begin{equation*}
A=
\begin{bmatrix}
2 \amp 1 \amp 3 \\
1 \amp 5 \amp -3 \\
3 \amp -3 \amp 7
\end{bmatrix}.
\end{equation*}
Show that \(A\) is symmetric.
Answer.
\begin{equation*}
A^{T} =
\begin{bmatrix}
2 \amp 1 \amp 3 \\
1 \amp 5 \amp -3 \\
3 \amp -3 \amp 7
\end{bmatrix}.
\end{equation*}
Hence, \(A = A^{T}\text{,}\) so \(A\) is symmetric.
Example 3.1.20.
Let
\begin{equation*}
A=
\begin{bmatrix}
0 \amp 1 \amp 3 \\
-1 \amp 0 \amp 2 \\
-3 \amp -2 \amp 0
\end{bmatrix}.
\end{equation*}
Show that \(A\) is skew symmetric.
Answer.
\begin{equation*}
A^{T} =
\begin{bmatrix}
0 \amp -1 \amp -3\\
1 \amp 0 \amp -2\\
3 \amp 2 \amp 0
\end{bmatrix}.
\end{equation*}
Each entry of \(A^T\) is equal to \(-1\) times the corresponding entry of \(A\text{.}\) Hence, \(A^{T} = - A\text{,}\) and so by Definition 3.1.18, \(A\) is skew symmetric.
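Both definitions are easy to test in code. The Python sketch below (illustrative only; the helper names are our own) checks the matrices of Examples 3.1.19 and 3.1.20.

```python
def transpose(A):
    """Swap rows and columns."""
    return [list(row) for row in zip(*A)]

def is_symmetric(A):
    """A is symmetric when A equals its transpose (Definition 3.1.18)."""
    return A == transpose(A)

def is_skew_symmetric(A):
    """A is skew symmetric when A equals -1 times its transpose."""
    return A == [[-a for a in row] for row in transpose(A)]

S = [[2, 1, 3], [1, 5, -3], [3, -3, 7]]     # Example 3.1.19
K = [[0, 1, 3], [-1, 0, 2], [-3, -2, 0]]    # Example 3.1.20

print(is_symmetric(S), is_skew_symmetric(K))  # True True
```

Note that the diagonal entries of a skew symmetric matrix must be zero, since \(a_{ii}=-a_{ii}\) forces \(a_{ii}=0\text{.}\)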
A special case of a symmetric matrix is a diagonal matrix. A diagonal matrix is a square matrix whose entries outside of the main diagonal are all zero. The identity matrix \(I\) is a diagonal matrix. Here is another example.
\begin{equation*}
\begin{bmatrix}2\amp 0\amp 0\amp 0\\0\amp -3\amp 0\amp 0\\0\amp 0\amp 1\amp 0\\0\amp 0\amp 0\amp 4\end{bmatrix}.
\end{equation*}