Multiplicity of Eigenvalues
\(\mathbf{M}\) refers to a complex matrix of shape \(n \times n\) unless stated otherwise.
Algebraic Multiplicity
Every complex square matrix will always have \(\displaystyle n\) eigenvalues\(^{1}\). However, some of these eigenvalues may be repeated. Let us take the simple example of a diagonal matrix:
\[ \begin{equation*} \mathbf{D} =\begin{bmatrix} 5 & & & & \\ & 5 & & & \\ & & 5 & & \\ & & & 7 & \\ & & & & 7 \end{bmatrix} \end{equation*} \]
\(\displaystyle \mathbf{D}\) has two distinct eigenvalues: \(\displaystyle 5\ \text{and } 7\). The eigenvalue \(\displaystyle 5\) is repeated thrice and the eigenvalue \(\displaystyle 7\) is repeated twice. We say that the algebraic multiplicity of the eigenvalue \(\displaystyle 5\) is three and that of \(\displaystyle 7\) is two. The reason this is called the algebraic multiplicity has to do with the characteristic polynomial, an algebraic structure:
\[ \begin{equation*} \begin{aligned} |x\mathbf{I} -\mathbf{D} | & =( x-5)^{3}( x-7)^{2} \end{aligned} \end{equation*} \]
The exponent of \(\displaystyle x-5\) is the algebraic multiplicity of \(\displaystyle 5\). In general, any complex matrix \(\displaystyle \mathbf{M}\) of shape \(n \times n\) will have \(\displaystyle n\) eigenvalues, not necessarily distinct. Let the number of distinct eigenvalues be \(\displaystyle k\). Let the \(\displaystyle i^{th}\) eigenvalue and its algebraic multiplicity be \(\displaystyle \lambda_{i}\) and \(\displaystyle \alpha_{i}\) respectively. Then, the characteristic polynomial of \(\displaystyle \mathbf{M}\) can be written as:
\[ \begin{equation*} |x\mathbf{I} -\mathbf{M} |=\prod _{i=1}^{k}( x-\lambda_{i})^{\alpha_{i}} \end{equation*} \]
Note that \(\alpha_1 + \cdots + \alpha_k = n\).
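As a quick numerical illustration of the above (a minimal sketch assuming NumPy, which the text itself does not use), we can recover the algebraic multiplicities of \(\mathbf{D}\) by counting how often each eigenvalue is repeated:

```python
import numpy as np

# Illustrative sketch: read off algebraic multiplicities of the diagonal matrix D.
D = np.diag([5.0, 5.0, 5.0, 7.0, 7.0])

# np.linalg.eigvals returns all n eigenvalues, with repetition.
eigenvalues = np.linalg.eigvals(D)

# Group (nearly) equal eigenvalues and count them.
values, counts = np.unique(np.round(eigenvalues.real, 8), return_counts=True)
for value, count in zip(values, counts):
    print(f"eigenvalue {value}: algebraic multiplicity {count}")
# Expected: eigenvalue 5.0 with multiplicity 3, eigenvalue 7.0 with multiplicity 2.
```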
Geometric Multiplicity
\(\displaystyle \mathcal{N}(\mathbf{M} -\lambda\mathbf{I})\) is the eigenspace corresponding to \(\displaystyle \lambda\). The eigenspace is a subspace of \(\displaystyle \mathbb{C}^{n}\) that contains all the eigenvectors of \(\displaystyle \mathbf{M}\) corresponding to \(\displaystyle \lambda\) and the zero vector. For example, consider the diagonal matrix \(\displaystyle \mathbf{D}\) again:
\[ \begin{equation*} \mathbf{D} =\begin{bmatrix} 5 & & & & \\ & 5 & & & \\ & & 5 & & \\ & & & 7 & \\ & & & & 7 \end{bmatrix} \end{equation*} \]
If \(\displaystyle \mathbf{e}_{i} =\begin{bmatrix}0 & \cdots & 1 & \cdots & 0\end{bmatrix}^{T}\) is the \(\displaystyle i^{th}\) unit vector in the standard basis for \(\displaystyle \mathbb{C}^{5}\), then the eigenspaces corresponding to the eigenvalues \(\displaystyle 5\) and \(\displaystyle 7\) are:
\[ \begin{equation*} \text{span}(\{\mathbf{e}_{1} ,\mathbf{e}_{2} ,\mathbf{e}_{3}\}) ,\ \text{span}(\{\mathbf{e}_{4} ,\mathbf{e}_{5}\}) \end{equation*} \]
The geometric multiplicities of the two eigenvalues, \(\displaystyle 5\) and \(\displaystyle 7\), are the dimensions of these two spaces: \(\displaystyle 3\) and \(\displaystyle 2\) respectively. The geometric multiplicity need not be equal to the algebraic multiplicity. Take the case of this matrix:
\[ \begin{equation*} \mathbf{M} =\begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} \end{equation*} \]
Its eigenvalues are:
\[ \begin{equation*} \begin{aligned} |x\mathbf{I} -\mathbf{M} | & =( x-1)^{2}\\ & =0\\ \implies x & =1,1 \end{aligned} \end{equation*} \]
The eigenvectors corresponding to the eigenvalue \(1\) can be obtained by solving \((\mathbf{M} - \mathbf{I}) \mathbf{x} = \mathbf{0}\):
\[ \begin{equation*} \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}\begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{equation*} \]
The only eigenvectors are \(\begin{bmatrix}1\\0\end{bmatrix}\) and its non-zero scalar multiples. In other words, the eigenspace for this eigenvalue is \(\displaystyle \text{span}\left(\left\{\begin{bmatrix}1\\0\end{bmatrix}\right\}\right)\), which implies that the geometric multiplicity is \(\displaystyle 1\). The algebraic multiplicity on the other hand is equal to \(\displaystyle 2\). For this case, we see that the geometric multiplicity is less than the algebraic multiplicity. The next section shows that, in general, the geometric multiplicity can never exceed the algebraic multiplicity.
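The geometric multiplicity can also be computed numerically via rank–nullity, since \(\dim \mathcal{N}(\mathbf{M}-\lambda\mathbf{I}) = n - \text{rank}(\mathbf{M}-\lambda\mathbf{I})\). Here is a minimal sketch (assuming NumPy; the helper name `geometric_multiplicity` is ours, not from the text):

```python
import numpy as np

# Geometric multiplicity as the dimension of the eigenspace,
# computed via rank-nullity: dim N(M - lambda*I) = n - rank(M - lambda*I).
def geometric_multiplicity(M, lam, tol=1e-10):
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n), tol=tol)

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(geometric_multiplicity(M, 1.0))   # 1, even though the algebraic multiplicity is 2

D = np.diag([5.0, 5.0, 5.0, 7.0, 7.0])
print(geometric_multiplicity(D, 5.0))   # 3
print(geometric_multiplicity(D, 7.0))   # 2
```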
Inequality
Consider an eigenvalue \(\displaystyle \lambda\) of a complex \(n \times n\) matrix \(\mathbf{A}\) with geometric multiplicity \(\displaystyle \beta\). This means that there is a linearly independent list of \(\displaystyle \beta\) eigenvectors with eigenvalue \(\displaystyle \lambda\). Let us enumerate this list as follows:
\[ \begin{equation*} \mathbf{v}_{1} ,\cdots ,\mathbf{v}_{\beta} \end{equation*} \]
We can extend this list to a basis of \(\displaystyle \mathbb{C}^{n}\) by adding vectors \(\displaystyle \mathbf{u}_{1} ,\cdots ,\mathbf{u}_{n-\beta}\). We will refer to this as the new basis:
\[ \begin{equation*} \mathbf{v}_{1} ,\cdots ,\mathbf{v}_{\beta} ,\mathbf{u}_{1} ,\cdots ,\mathbf{u}_{n-\beta} \end{equation*} \]
Note that \(\displaystyle \beta\leqslant n\). In the event of \(\displaystyle \beta=n\), we already have a basis and won’t need the services of the \(\displaystyle \mathbf{u}_{i}\)s. Now that we have a basis, let us form the following matrix \(\displaystyle \mathbf{P}\):
\[ \begin{equation*} \mathbf{P} =\begin{bmatrix} | & & | & | & & |\\ \mathbf{v}_{1} & \cdots & \mathbf{v}_{\beta} & \mathbf{u}_{1} & \cdots & \mathbf{u}_{n-\beta}\\ | & & | & | & & | \end{bmatrix} \end{equation*} \]
This is an \(\displaystyle n\times n\) invertible matrix, thanks to its linearly independent columns. Now that we have a new basis for \(\displaystyle \mathbb{C}^{n}\), what follows is a change of basis argument. We can choose to represent the transformation underlying \(\displaystyle \mathbf{A}\) in either the standard basis or this new basis. What connects the two matrix representations is a similarity transformation. But for now, let us continue to tread the path of matrix algebra. Consider the product \(\displaystyle \mathbf{AP}\):
\[ \begin{equation*} \begin{aligned} \mathbf{AP} & = \mathbf{A} \begin{bmatrix} | & & | & | & & |\\ \mathbf{v}_{1} & \cdots & \mathbf{v}_{\beta} & \mathbf{u}_{1} & \cdots & \mathbf{u}_{n-\beta}\\ | & & | & | & & | \end{bmatrix}\\ & \\ & =\begin{bmatrix} | & & | & | & & |\\ \mathbf{Av}_{1} & \cdots & \mathbf{Av}_{\beta} & \mathbf{Au}_{1} & \cdots & \mathbf{Au}_{n-\beta}\\ | & & | & | & & | \end{bmatrix}\\ & \\ & =\begin{bmatrix} | & & | & | & & |\\ \lambda\mathbf{v}_{1} & \cdots & \lambda\mathbf{v}_{\beta} & \mathbf{Au}_{1} & \cdots & \mathbf{Au}_{n-\beta}\\ | & & | & | & & | \end{bmatrix} \end{aligned} \end{equation*} \]
Note that \(\displaystyle \mathbf{Au}_{i} \in \mathbb{C}^{n}\), which can be represented as a linear combination of the columns of \(\displaystyle \mathbf{P}\), the new basis. This is trivially true in the case of \(\displaystyle \mathbf{Av}_{i}\). Introducing these coefficients into the mix, we can rewrite the RHS of the above equation as:
\[ \begin{equation*} \begin{aligned} \mathbf{AP} & =\begin{bmatrix} | & & | & | & & |\\ \mathbf{v}_{1} & \cdots & \mathbf{v}_{\beta} & \mathbf{u}_{1} & \cdots & \mathbf{u}_{n-\beta}\\ | & & | & | & & | \end{bmatrix}\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\\ & \\ & =\mathbf{P}\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix} \end{aligned} \end{equation*} \]
For clarity, here are the dimensions of the blocks introduced above:
\(\displaystyle \mathbf{I}\): \(\displaystyle \beta\times \beta\)
\(\displaystyle \mathbf{X}\): \(\displaystyle \beta\times ( n-\beta)\)
\(\displaystyle \mathbf{Y}\): \(\displaystyle ( n-\beta) \times ( n-\beta)\)
\(\displaystyle \mathbf{0}\): \(\displaystyle ( n-\beta) \times \beta\)
Verify that \(\displaystyle \begin{bmatrix} \mathbf{X}\\ \mathbf{Y} \end{bmatrix}\) is \(\displaystyle n\times ( n-\beta)\), a matrix with \(\displaystyle n-\beta\) columns, where the \(\displaystyle i^{th}\) column stores the coefficients of \(\displaystyle \mathbf{Au}_{i}\). Going back to the decomposition we had obtained:
\[ \begin{equation*} \begin{aligned} \mathbf{AP} & =\mathbf{P}\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\\ & \\ \Longrightarrow \mathbf{A} & =\mathbf{P}\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\mathbf{P}^{-1} \end{aligned} \end{equation*} \]
Note that \(\displaystyle \mathbf{P}\) can be interpreted as a change-of-basis matrix. Next, \(\displaystyle \mathbf{A}\) and \(\displaystyle \begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\) are similar. Similar matrices have a host of things in common. In this context, we are going to use the fact that they share the same characteristic polynomial:
\[ \begin{equation*} \begin{aligned} |x\mathbf{I} -\mathbf{A} | & =\left| x\mathbf{I} -\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\right|\\ & \\ & =\begin{vmatrix} ( x-\lambda)\mathbf{I} & -\mathbf{X}\\ \mathbf{0} & x\mathbf{I} -\mathbf{Y} \end{vmatrix} \end{aligned} \end{equation*} \]
Expanding the determinant on the RHS repeatedly along its first column (or using the fact that the matrix is block upper triangular), we get:
\[ \begin{equation*} |x\mathbf{I} -\mathbf{A} | =( x-\lambda)^{\beta} \, |x\mathbf{I} -\mathbf{Y} | \end{equation*} \]
We see that the exponent of \(\displaystyle ( x-\lambda)\) is at least \(\displaystyle \beta\); it may be larger, since \(\lambda\) could also be an eigenvalue of \(\mathbf{Y}\) and contribute further factors of \(x-\lambda\) through \(|x\mathbf{I} -\mathbf{Y}|\). Therefore, the algebraic multiplicity of \(\displaystyle \lambda\) is at least as high as its geometric multiplicity.
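A small numerical experiment can make the block structure concrete. The sketch below (assuming NumPy; the example matrix is our own, chosen so that \(\lambda = 4\) has geometric multiplicity \(2\)) builds \(\mathbf{P}\) from an orthonormal basis of the eigenspace plus a completion, and checks that \(\mathbf{P}^{-1}\mathbf{A}\mathbf{P}\) has the form \(\begin{bmatrix} \lambda\mathbf{I} & \mathbf{X}\\ \mathbf{0} & \mathbf{Y} \end{bmatrix}\):

```python
import numpy as np

# Hypothetical example matrix: lambda = 4 with geometric multiplicity beta = 2.
A = np.array([[4.0, 1.0, 0.0],
              [0.0, 4.0, 0.0],
              [0.0, 0.0, 4.0]])
lam = 4.0
n = A.shape[0]

# Orthonormal basis of the eigenspace N(A - lambda*I) from the SVD:
# right singular vectors with (numerically) zero singular values span the null space.
_, sigma, vh = np.linalg.svd(A - lam * np.eye(n))
null_mask = sigma < 1e-10
V = vh[null_mask].T                 # eigenvectors, shape (n, beta)
U = vh[~null_mask].T                # completion to a basis of the whole space

P = np.hstack([V, U])
B = np.linalg.inv(P) @ A @ P
print(np.round(B, 10))
# The top-left beta x beta block is lambda*I and the bottom-left block is 0.
```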
Diagonalizable Matrices
Claim: a complex matrix \(\mathbf{M}\) is diagonalizable if and only if the algebraic and geometric multiplicities of every eigenvalue are equal. We have to show the implication in both directions.
Diagonalizable \(\implies\) equal multiplicities
We will present two proofs.
First proof. In this proof, all summations run over the \(\displaystyle k\) distinct eigenvalues of the matrix \(\displaystyle \mathbf{M}\). Let \(\displaystyle \alpha _{i}\) and \(\displaystyle \beta _{i}\) be the algebraic and geometric multiplicities of the eigenvalue \(\displaystyle \lambda_{i}\). We need to show that \(\displaystyle \alpha _{i} =\beta _{i}\) for \(\displaystyle 1\leqslant i\leqslant k\).
Let us begin with the inequality from the previous section: \(\displaystyle \beta _{i} \leqslant \alpha _{i}\). This gives us:
\[ \begin{equation*} \sum \beta _{i} \leqslant \sum \alpha _{i} \end{equation*} \]
Since \(\displaystyle \sum \alpha _{i} =n\) for a complex matrix:
\[ \begin{equation*} \sum \beta _{i} \leqslant n \end{equation*} \]
Since \(\displaystyle \mathbf{M}\) is diagonalizable, \(\displaystyle \mathbb{C}^{n}\) has a basis of eigenvectors of \(\displaystyle \mathbf{M}\). Each eigenvector in this basis is tied to an eigenvalue. Group the eigenvectors in the basis according to their eigenvalues and let \(\displaystyle n_{i}\) be the number of basis vectors associated with \(\displaystyle \lambda_{i}\). Since these \(n_i\) vectors are linearly independent and lie in the eigenspace of \(\lambda_i\), we have \(\displaystyle n_{i} \leqslant \beta _{i}\). Moreover, every basis vector belongs to exactly one group, so \(\displaystyle \sum n_{i} =n\). Summing across all distinct eigenvalues:
\[ \begin{equation*} n\leqslant \sum \beta _{i} \end{equation*} \]
Combining the two inequalities, we conclude that \(\displaystyle \sum \beta _{i} =n=\sum \alpha _{i}\). From this, we get:
\[ \begin{equation*} \sum ( \alpha _{i} -\beta _{i}) =0 \end{equation*} \]
Since each term \(\displaystyle \alpha _{i} -\beta _{i}\) is non-negative, every term must vanish. It follows that \(\displaystyle \alpha _{i} =\beta _{i}\) for \(\displaystyle 1\leqslant i\leqslant k\).
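A quick numerical sanity check of this direction (a sketch assuming NumPy; the symmetric test matrix is our own choice and is diagonalizable by the spectral theorem): for each eigenvalue, the two multiplicities agree.

```python
import numpy as np

# A real symmetric matrix is always diagonalizable, so it makes a convenient test case.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])    # eigenvalues 1, 3, 3
n = A.shape[0]

eigenvalues = np.linalg.eigvals(A)
for lam in np.unique(np.round(eigenvalues.real, 8)):
    algebraic = int(np.sum(np.abs(eigenvalues - lam) < 1e-8))
    geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=1e-8)
    print(lam, algebraic, geometric)   # (1.0, 1, 1) and (3.0, 2, 2)
```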
Second proof. The crux of this proof is that a single linear map gives rise to different matrix representations, one per basis, and that these representations are pairwise similar.
Let \(\displaystyle T:\mathbb{C}^{n}\rightarrow \mathbb{C}^{n}\) be the linear map underlying the complex matrix \(\displaystyle \mathbf{M}\). Since \(\displaystyle \mathbf{M}\) is diagonalizable, there is a basis for \(\displaystyle \mathbb{C}^{n}\) made of eigenvectors of \(\displaystyle \mathbf{M}\). The matrix representation of \(\displaystyle T\) in this basis is diagonal; call this matrix \(\displaystyle \mathbf{D}\). Both \(\displaystyle \mathbf{M}\) and \(\displaystyle \mathbf{D}\) represent the same linear map \(\displaystyle T\) in two different bases. The eigenspace of \(\displaystyle T\) corresponding to \(\displaystyle \lambda\) is given by \(\displaystyle \left\{\mathbf{v} :T\mathbf{v} =\lambda \mathbf{v} ,\ \mathbf{v} \in \mathbb{C}^{n}\right\}\). Crucially, the eigenspace is basis agnostic and is only dependent on the linear map. We now make the following observations:
\(\displaystyle \lambda\) is an eigenvalue of \(\displaystyle T\) iff \(\displaystyle \lambda\) is an eigenvalue of \(\displaystyle \mathbf{M}\)/\(\displaystyle \mathbf{D}\).
\(\displaystyle \mathbf{M}\) and \(\displaystyle \mathbf{D}\) are similar and hence have the same characteristic polynomial which shows that \(\displaystyle \mathbf{M}\) and \(\displaystyle \mathbf{D}\) have the same set of eigenvalues with the same algebraic multiplicities.
If \(\displaystyle \lambda\) is an eigenvalue of \(\displaystyle \mathbf{D}\), its algebraic multiplicity is equal to its geometric multiplicity; both equal the number of times \(\displaystyle \lambda\) appears on the diagonal of \(\displaystyle \mathbf{D}\).
If \(\displaystyle \lambda\) is an eigenvalue of \(\displaystyle \mathbf{M}\)/\(\displaystyle \mathbf{D}\), its geometric multiplicity is equal to the dimension of the eigenspace of \(\displaystyle T\) corresponding to \(\displaystyle \lambda\).
It follows that the algebraic and geometric multiplicities of every eigenvalue of \(\displaystyle \mathbf{M}\) are equal.
Equal multiplicities \(\implies\) diagonalizable
As before, let us assume that there are \(\displaystyle k\) distinct eigenvalues, and let the algebraic and geometric multiplicities of \(\displaystyle \lambda_{i}\) be \(\alpha_i\) and \(\displaystyle \beta_{i}\) respectively. Since \(\alpha_i = \beta_i\) by assumption and the algebraic multiplicities sum to \(n\):
\[ \begin{equation*} \sum _{i=1}^{k} \alpha_i = \sum _{i=1}^{k} \beta_{i} =n \end{equation*} \]
Each eigenvalue \(\displaystyle \lambda_{i}\) contributes exactly \(\displaystyle \beta_{i}\) linearly independent eigenvectors, namely a basis of its eigenspace. Let \(\displaystyle \mathbf{v}_{i}^{j}\) denote the \(\displaystyle j^{th}\) contribution from the \(\displaystyle i^{th}\) eigenvalue:
\[ \begin{equation*} \mathbf{v}_{i}^{1} ,\cdots ,\mathbf{v}_{i}^{j} ,\cdots ,\mathbf{v}_{i}^{\beta_{i}} \end{equation*} \]
Consider the following set of \(\displaystyle n\) eigenvectors:
\[ \begin{equation*} B=\left\{\begin{array}{ c c c } \mathbf{v}_{1}^{1} , & \cdots , & \mathbf{v}_{1}^{\beta_{1}} ,\\ \mathbf{v}_{2}^{1} , & \cdots , & \mathbf{v}_{2}^{\beta_{2}} ,\\ & \vdots & \\ \mathbf{v}_{k}^{1} , & \cdots , & \mathbf{v}_{k}^{\beta_{k}} \end{array}\right\} \end{equation*} \]
Each row is a linearly independent collection by construction. It remains to show that the entire set is linearly independent. Let us consider any linear combination of these \(\displaystyle n\) vectors and set it to zero:
\[ \begin{equation*} \sum _{i=1}^{k}\sum _{j=1}^{\beta_{i}} \gamma _{i}^{j}\mathbf{v}_{i}^{j} =\mathbf{0} \end{equation*} \]
Let us define the inner sum to be:
\[ \begin{equation*} \mathbf{w}_{i} =\sum _{j=1}^{\beta_{i}} \gamma _{i}^{j}\mathbf{v}_{i}^{j} \end{equation*} \]
If \(\displaystyle \mathbf{w}_{i} \neq \mathbf{0}\), then it is an eigenvector of \(\mathbf{M}\) corresponding to the eigenvalue \(\displaystyle \lambda_{i}\). The entire sum can now be written as:
\[ \begin{equation*} \sum _{i=1}^{k}\mathbf{w}_{i} =\mathbf{0} \end{equation*} \]
Each element in the set \(\displaystyle \{\mathbf{w}_{1} ,\cdots ,\mathbf{w}_{k}\}\) is either the zero vector or an eigenvector. If every one of them is the zero vector, we are done. If not, discard the zero vectors and consider the remaining subset of eigenvectors. These eigenvectors correspond to distinct eigenvalues and are hence linearly independent, yet their sum is \(\mathbf{0}\) (the discarded zero vectors contribute nothing), which is a contradiction. Therefore, every vector \(\displaystyle \mathbf{w}_{i} =\mathbf{0}\), \(\displaystyle 1\leqslant i\leqslant k\). This means:
\[ \begin{equation*} \sum _{j=1}^{\beta_{i}} \gamma _{i}^{j}\mathbf{v}_{i}^{j} =\mathbf{0} ,\ 1\leqslant i\leqslant k \end{equation*} \]
We already know that the set of vectors \(\displaystyle \left\{\mathbf{v}_{i}^{1} ,\cdots ,\mathbf{v}_{i}^{\beta_{i}}\right\}\) is linearly independent. Hence \(\displaystyle \gamma _{i}^{j} =0,\ 1\leqslant i\leqslant k,\ 1\leqslant j\leqslant \beta_{i}\), showing that \(\displaystyle B\) is a linearly independent collection of \(\displaystyle n\) eigenvectors of \(\displaystyle \mathbf{M}\). It follows that \(\displaystyle \mathbf{M}\) is diagonalizable.
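The construction in this proof can be mirrored numerically: stack a basis of each eigenspace side by side and verify that the resulting \(\mathbf{P}\) diagonalizes the matrix. A minimal sketch (assuming NumPy; the example matrix is our own, built to be diagonalizable with eigenvalues \(5, 5, 7\)):

```python
import numpy as np

# Build a diagonalizable (but non-diagonal) matrix by conjugating a diagonal one.
D = np.diag([5.0, 5.0, 7.0])
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])    # invertible
A = T @ D @ np.linalg.inv(T)       # eigenvalues 5, 5, 7
n = A.shape[0]

columns = []
for lam in np.unique(np.round(np.linalg.eigvals(A).real, 8)):
    # Basis of the eigenspace N(A - lambda*I): right singular vectors with
    # numerically zero singular values span the null space.
    _, sigma, vh = np.linalg.svd(A - lam * np.eye(n))
    columns.append(vh[sigma < 1e-8].T)

P = np.hstack(columns)             # n linearly independent eigenvectors
print(np.round(np.linalg.inv(P) @ A @ P, 8))   # diagonal with entries 5, 5, 7
```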
Real matrices
So far we have dealt with matrices over the complex field. One advantage of working over the complex field is that every \(n \times n\) matrix has a full set of \(n\) eigenvalues. Unfortunately, real matrices do not enjoy this luxury, so some of the results above have to be modified. The rest of this section treats \(\displaystyle \mathbf{M}\) as a matrix over the field of real numbers.
The inequality still holds: the algebraic multiplicity of an eigenvalue is at least as high as its geometric multiplicity.
If \(\displaystyle \mathbf{M}\) is diagonalizable then the algebraic multiplicities sum to \(\displaystyle n\), that is \(\displaystyle \sum \alpha _{i} =n\). This is just a different way of stating the fact that a diagonalizable matrix has a full set of eigenvalues. Thus having a complete set of eigenvalues becomes a necessary condition for diagonalizability. However, it is not sufficient, that is, the converse is not true. Refer to the example given in the section on geometric multiplicity.
\(\displaystyle \mathbf{M}\) is diagonalizable if and only if the algebraic and geometric multiplicities are equal and the algebraic multiplicities sum to \(\displaystyle n\). It is not enough for the algebraic and geometric multiplicities to be equal. Here is a counter-example:
\[\begin{equation*} \mathbf{M} =\begin{bmatrix} 0 & 0 & -1\\ 1 & 0 & -1\\ 1 & -1 & 0 \end{bmatrix} \end{equation*}\]
\[\begin{equation*} \begin{aligned} |x\mathbf{I}-\mathbf{M}| & =\begin{vmatrix} x & 0 & 1\\ -1 & x & 1\\ -1 & 1 & x \end{vmatrix}\\ & =x\left( x^{2} -1\right) +( -1+x)\\ & =x^{3} -1 \end{aligned} \end{equation*}\]
Over the reals, \(\displaystyle \mathbf{M}\) has exactly one eigenvalue, \(1\) (the other two roots of \(x^{3} -1\) are complex), and its algebraic multiplicity of one is equal to its geometric multiplicity. But \(\displaystyle \mathbf{M}\) is clearly not diagonalizable since it doesn’t have a full set of eigenvalues.
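Numerically (a sketch assuming NumPy), the eigenvalues of this \(\mathbf{M}\) confirm the point: only one of the three roots of \(x^{3}-1\) is real.

```python
import numpy as np

M = np.array([[0.0, 0.0, -1.0],
              [1.0, 0.0, -1.0],
              [1.0, -1.0, 0.0]])

eigenvalues = np.linalg.eigvals(M)
print(np.round(eigenvalues, 6))
# Approximately [1, -0.5 + 0.866j, -0.5 - 0.866j]: the real eigenvalue 1 has
# algebraic multiplicity 1 equal to its geometric multiplicity, yet the real
# algebraic multiplicities sum to 1 < 3, so M is not diagonalizable over the reals.
```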
Footnotes
1. Refer to the Fundamental theorem of algebra.