\(A^TA\) and \(AA^T\)

linear algebra
definite matrix
linear operator
eigenvalue
eigenvector
matrix factorization
svd
eigendecomposition
mlf
explainer
book
review

If \(\displaystyle \mathbf{A}\) is a \(\displaystyle m\times n\) matrix, then \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) is a \(\displaystyle n\times n\) matrix and \(\displaystyle \mathbf{AA}^{T}\) is a \(\displaystyle m\times m\) matrix.

Property-1

\(\mathbf{A}^{T}\mathbf{A}\) and \(\displaystyle \mathbf{AA}^{T}\) are symmetric matrices.

Let us show this for \(\displaystyle \mathbf{A}^{T}\mathbf{A}\):

\[ \begin{equation*} (\mathbf{A}^{T}\mathbf{A} )^{T} =\mathbf{A}^{T} (\mathbf{A}^{T} )^{T} =\mathbf{A}^{T}\mathbf{A} \end{equation*} \]

The argument is similar for \(\displaystyle \mathbf{AA}^{T}\).

Property-2

The nullspace of \(\displaystyle \mathbf{A}\) is equal to the nullspace of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\). Likewise, the nullspace of \(\displaystyle \mathbf{A}^{T}\) is equal to the nullspace of \(\displaystyle \mathbf{AA}^{T}\).

Let us denote the nullspaces as \(\displaystyle \mathcal{N}(\mathbf{A})\) and \(\displaystyle \mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right)\).

If \(\displaystyle \mathbf{x} \in \mathcal{N}(\mathbf{A})\), then we have the following sequence of observations:

\[ \begin{equation*} \begin{aligned} \mathbf{Ax} & =\mathbf{0}\\ \mathbf{A}^{T}\mathbf{Ax} & =\mathbf{0}\\ \left(\mathbf{A}^{T}\mathbf{A}\right)\mathbf{x} & =\mathbf{0} \end{aligned} \end{equation*} \]

Thus, \(\displaystyle \mathbf{x} \in \mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right)\) and therefore \(\displaystyle \mathcal{N}(\mathbf{A}) \subseteq \mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right)\).

If \(\displaystyle \mathbf{x} \in \mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right)\), then we have the following sequence of observations:

\[ \begin{equation*} \begin{aligned} \left(\mathbf{A}^{T}\mathbf{A}\right)\mathbf{x} & =\mathbf{0}\\ \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{Ax} & =0\\ (\mathbf{Ax})^{T}(\mathbf{Ax}) & =0\\ \mathbf{Ax} & =\mathbf{0} \end{aligned} \end{equation*} \]

Thus, \(\displaystyle \mathbf{x} \in \mathcal{N}(\mathbf{A})\) and therefore \(\displaystyle \mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right) \subseteq \mathcal{N}(\mathbf{A})\).

From this, we conclude that \(\displaystyle \mathcal{N}(\mathbf{A}) =\mathcal{N}\left(\mathbf{A}^{T}\mathbf{A}\right)\). A similar argument can be used for \(\displaystyle \mathcal{N}\left(\mathbf{A}^{T}\right) =\mathcal{N}\left(\mathbf{AA}^{T}\right)\).

Property-3

\(\displaystyle \text{rank}\left(\mathbf{A}^{T}\mathbf{A}\right) =\text{rank}\left(\mathbf{AA}^{T}\right) =\text{rank}(\mathbf{A})\)

Using the rank nullity theorem, we see the following:

\[ \begin{equation*} \begin{aligned} \text{rank}(\mathbf{A}) +\text{nullity}(\mathbf{A}) & =n\\ \text{rank}\left(\mathbf{A}^{T}\mathbf{A}\right) +\text{nullity}\left(\mathbf{A}^{T}\mathbf{A}\right) & =n \end{aligned} \end{equation*} \]

Using property-2 (refer to Section 2), we can conclude that the rank of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) is the same as the rank of \(\displaystyle \mathbf{A}\). Next, we know that the row rank is the same as the column rank, or \(\displaystyle \text{rank}(\mathbf{A}) =\text{rank}\left(\mathbf{A}^{T}\right)\). Again applying the rank nullity theorem:

\[ \begin{equation*} \begin{aligned} \text{rank}\left(\mathbf{A}^{T}\right) +\text{nullity}\left(\mathbf{A}^{T}\right) & =m\\ \text{rank}\left(\mathbf{AA}^{T}\right) +\text{nullity}\left(\mathbf{AA}^{T}\right) & =m \end{aligned} \end{equation*} \]

We can now conclude that \(\displaystyle \text{rank}\left(\mathbf{A}^{T}\mathbf{A}\right) =\text{rank}\left(\mathbf{AA}^{T}\right) =\text{rank}(\mathbf{A})\).

Property-4

If \(\text{rank} (\mathbf{A} )=n\), then \(\mathbf{A}^{T}\mathbf{A}\) is invertible.

From the previous property, we know that \(\text{rank}(\mathbf{A}^T\mathbf{A}) = \text{rank}(A)\). Since \(\mathbf{A}^T\mathbf{A}\) is full rank, it is invertible.

Property-5

All the eigenvalues of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) and \(\displaystyle \mathbf{AA}^{T}\) are non-negative.

Let \(\displaystyle ( \lambda ,\mathbf{u})\) be an eigenpair of \(\displaystyle \mathbf{AA}^{T}\). Then:

\[ \begin{equation*} \begin{aligned} \left(\mathbf{AA}^{T}\right)\mathbf{u} & =\lambda \mathbf{u}\\ \mathbf{u}^{T}\mathbf{AA}^{T}\mathbf{u} & =\lambda \mathbf{u}^{T}\mathbf{u}\\ \left(\mathbf{A}^{T}\mathbf{u}\right)^{T}\left(\mathbf{A}^{T}\mathbf{u}\right) & =\lambda ||\mathbf{u} ||^{2}\\ ||\mathbf{A}^{T}\mathbf{u} ||^{2} & =\lambda ||\mathbf{u} ||^{2} \end{aligned} \end{equation*} \]

Since \(\displaystyle \mathbf{u}\) is an eigenvector of \(\displaystyle \mathbf{AA}^{T}\), it is not the zero vector, so \(\displaystyle ||\mathbf{u}||\neq 0\). We can now express \(\displaystyle \lambda\) as:

\[ \begin{equation*} \lambda =\cfrac{||\mathbf{A}^{T}\mathbf{u} ||^{2}}{||\mathbf{u} ||^{2}} \geqslant 0 \end{equation*} \]

A similar argument holds for \(\displaystyle \mathbf{A}^{T}\mathbf{A}\).

Property-6

\(\displaystyle \mathbf{A}^{T}\mathbf{A}\) and \(\displaystyle \mathbf{AA}^{T}\) have the same non-zero eigenvalues.

Let \(\displaystyle ( \lambda ,\mathbf{v})\) be an eigenpair of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) with \(\displaystyle \lambda \neq 0\).

\[ \begin{equation*} \begin{aligned} \left(\mathbf{A}^{T}\mathbf{A}\right)\mathbf{v} & =\lambda \mathbf{v}\\ \mathbf{A}\left(\mathbf{A}^{T}\mathbf{A}\right)\mathbf{v} & =\lambda (\mathbf{Av})\\ \left(\mathbf{AA}^{T}\right)(\mathbf{Av}) & =\lambda (\mathbf{Av}) \end{aligned} \end{equation*} \]

\(\displaystyle ( \lambda ,\mathbf{Av})\) is an eigenpair of \(\displaystyle \mathbf{AA}^{T}\) provided \(\displaystyle \mathbf{Av} \neq \mathbf{0}\) as an eigenvector cannot be the zero-vector. We need to show that \(\displaystyle \mathbf{Av} \neq \mathbf{0}\). If \(\displaystyle \mathbf{Av}\) happens to be \(\displaystyle \mathbf{0}\), then \(\displaystyle \left(\mathbf{A}^{T}\mathbf{A}\right)\mathbf{v}\) will also become \(\displaystyle \mathbf{0}\). But this would mean that \(\displaystyle \mathbf{v}\) is an eigenvector of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) with eigenvalue \(\displaystyle 0\), implying that \(\displaystyle \lambda =0\). This contradicts the fact that \(\displaystyle \lambda \neq 0\). So, \(\displaystyle \mathbf{Av} \neq \mathbf{0}\) and hence is an eigenvector of \(\displaystyle \mathbf{AA}^{T}\).

We have shown that every non-zero eigenvalue of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) is also an eigenvalue of \(\displaystyle \mathbf{AA}^{T}\). We can now go the other way and show that every non-zero eigenvalue of \(\displaystyle \mathbf{AA}^{T}\) is also an eigenvalue of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\).

Property-7

This is a result that lets us move between the eigenvectors of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) and \(\displaystyle \mathbf{AA}^{T}\) for the positive eigenvalues. There are two similar looking results:

  1. If \(\displaystyle ( \lambda ,\mathbf{v})\) is an eigenpair for \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) with \(\displaystyle \lambda >0\) and \(\displaystyle ||\mathbf{v} ||=1\), then \(\displaystyle ( \lambda ,\mathbf{u})\) is an eigenpair for \(\displaystyle \mathbf{AA}^{T}\) where \(\displaystyle \mathbf{u} =\cfrac{\mathbf{Av}}{\sigma }\) with \(\displaystyle \sigma =\sqrt{\lambda }\), so that \(\displaystyle ||\mathbf{u} ||=1\).

  2. If \(\displaystyle ( \lambda ,\mathbf{u})\) is an eigenpair for \(\displaystyle \mathbf{A A}^{T}\) with \(\displaystyle \lambda >0\) and \(\displaystyle ||\mathbf{u} ||=1\), then \(\displaystyle ( \lambda ,\mathbf{v})\) is an eigenpair for \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) where \(\displaystyle \mathbf{v} =\cfrac{\mathbf{A^{T} u}}{\sigma }\) with \(\displaystyle \sigma =\sqrt{\lambda }\), so that \(\displaystyle ||\mathbf{v} ||=1\).

Let us prove this for (2):

\[ \begin{equation*} \begin{aligned} \left(\mathbf{AA}^{T}\right)\mathbf{u} & =\lambda \mathbf{u}\\ \mathbf{A}^{T}\left(\mathbf{AA}^{T}\right)\mathbf{u} & =\lambda \mathbf{A}^{T}\mathbf{u}\\ \left(\mathbf{A}^{T}\mathbf{A}\right)\left(\mathbf{A}^{T}\mathbf{u}\right) & =\lambda \left(\mathbf{A}^{T}\mathbf{u}\right) \end{aligned} \end{equation*} \]

From property-6 (refer to Section 6) we already know that \(\displaystyle \left( \lambda ,\mathbf{A}^{T}\mathbf{u}\right)\) is an eigenpair of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\). We can normalize \(\displaystyle \mathbf{A}^{T}\mathbf{u}\) by dividing it by \(\displaystyle ||\mathbf{A}^{T}\mathbf{u} ||\), which we shall compute now:

\[ \begin{equation*} \begin{aligned} ||\mathbf{A}^{T}\mathbf{u} ||^{2} & =\left(\mathbf{A}^{T}\mathbf{u}\right)^{T}\left(\mathbf{A}^{T}\mathbf{u}\right)\\ & =\mathbf{u}^{T}\mathbf{AA}^{T}\mathbf{u}\\ & =\mathbf{u}^{T}\left[\left(\mathbf{AA}^{T}\right)\mathbf{u}\right]\\ & =\mathbf{u}^{T}( \lambda \mathbf{u})\\ & =\lambda \\ \Longrightarrow ||\mathbf{A}^{T}\mathbf{u} || & =\sqrt{\lambda }\\ & =\sigma \end{aligned} \end{equation*} \]

Property-8

\(\displaystyle \mathbf{AA}^{T}\) and \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) have exactly \(\displaystyle r\) positive eigenvalues, where \(\displaystyle r\) is the rank of \(\displaystyle \mathbf{A}\).

This proof requires an understanding of the concept of multiplicity of eigenvalues. Let us take the case of \(\displaystyle \mathbf{A}^{T}\mathbf{A}\). Its rank is \(\displaystyle r\) and its nullity is \(\displaystyle n-r\). The nullity is the dimension of the eigenspace corresponding to the eigenvalue \(\displaystyle 0\). In other terms, the geometric multiplicity of the eigenvalue \(\displaystyle 0\) is \(\displaystyle n-r\). Since \(\displaystyle \mathbf{A}^{T}\mathbf{A}\) is symmetric, it is diagonalizable, a result we know from the spectral theorem. For a diagonalizable matrix, the geometric multiplicity is equal to the algebraic multiplicity for every eigenvalue. So the eigenvalue \(\displaystyle 0\) will occur exactly \(\displaystyle n-r\) times. This means that the remaining \(\displaystyle r\) eigenvalues, not necessarily distinct, should all be positive.