Matrix similarity

In linear algebra, two $n$-by-$n$ matrices $A$ and $B$ are called similar if there exists an invertible $n$-by-$n$ matrix $P$ such that $$B = P^{-1} A P .$$ Similar matrices represent the same linear map under two (possibly) different bases, with $P$ being the change-of-basis matrix.

A transformation $A \mapsto P^{-1}AP$ is called a similarity transformation or conjugation of the matrix $A$. In the general linear group, similarity is therefore the same as conjugacy, and similar matrices are also called conjugate; however, in a given subgroup $H$ of the general linear group, the notion of conjugacy may be more restrictive than similarity, since it requires that $P$ be chosen to lie in $H$.
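
As a small worked illustration of this last point (the matrices and the choice $H = SL_2(\mathbb{R})$, the real matrices of determinant 1, are picked here for the example), consider $$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}.$$ With $P = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ one checks that $P^{-1}AP = B$, so $A$ and $B$ are similar. On the other hand, writing $P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and comparing entries in $AP = PB$ forces $c = 0$ and $d = -a$, so that $$\det P = ad - bc = -a^{2} < 0 .$$ No conjugating matrix has determinant 1, hence $A$ and $B$ are not conjugate within $SL_2(\mathbb{R})$ even though they are conjugate in the general linear group.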

Motivating example
A change of basis can often bring a linear transformation into a simpler form. For example, the matrix representing a rotation in $\mathbb{R}^{3}$ can be complicated to compute when the axis of rotation is not aligned with one of the coordinate axes. If the axis of rotation were aligned with the positive $z$-axis, the matrix would simply be $$S = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix},$$ where $\theta$ is the angle of rotation. In the new coordinate system, the transformation would be written as $$y' = Sx',$$ where $x'$ and $y'$ are respectively the original and transformed vectors in a new basis containing a vector parallel to the axis of rotation. In the original basis, the transformation would be written as $$y = Tx,$$ where the vectors $x$ and $y$ and the unknown transformation matrix $T$ are expressed in the original basis. To write $T$ in terms of the simpler matrix, we use the change-of-basis matrix $P$ that transforms $x$ and $y$ as $x' = Px$ and $y' = Py$: $$\begin{aligned} &           &  y' &= S x'  \\[1.6ex] &\Rightarrow & P y &= S P x \\[1.6ex] &\Rightarrow &  y &= \left(P^{-1} S P\right) x = T x \end{aligned}$$

Thus, the matrix in the original basis, $T$, is given by $T = P^{-1}SP$. The transformation in the original basis is found to be the product of three easy-to-derive matrices. In effect, the similarity transform operates in three steps: change to a new basis ($P$), perform the simple transformation ($S$), and change back to the old basis ($P^{-1}$).
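
The three steps can be sketched numerically. The following is a minimal illustration in plain Python, not a reference implementation. The function name `rotation_about_axis` and the way the new basis is built (an orthonormal, right-handed triple $u, v, w$ with $w$ along the axis) are assumptions made for this example. Because that basis is orthonormal, $P$ has rows $u, v, w$ and $P^{-1}$ equals the transpose of $P$.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def rotation_about_axis(axis, theta):
    """Return T = P^{-1} S P for a rotation by theta about `axis`."""
    w = normalize(axis)
    # Any vector not parallel to w works as a helper to build the basis.
    helper = [1.0, 0.0, 0.0] if abs(w[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = normalize(cross(helper, w))   # unit vector perpendicular to w
    v = cross(w, u)                   # completes a right-handed basis (u x v = w)
    # x' = P x gives the coordinates of x in the basis (u, v, w).
    P = [u, v, w]
    S = [[math.cos(theta), -math.sin(theta), 0.0],
         [math.sin(theta),  math.cos(theta), 0.0],
         [0.0,              0.0,             1.0]]
    # P is orthogonal here, so its inverse is its transpose.
    return matmul(transpose(P), matmul(S, P))

# Rotation by 120 degrees about the diagonal (1, 1, 1):
# it cycles the coordinate axes e1 -> e2 -> e3 -> e1.
T = rotation_about_axis([1.0, 1.0, 1.0], 2.0 * math.pi / 3.0)
```

Up to rounding, `T` here is the permutation matrix with rows `[0, 0, 1]`, `[1, 0, 0]`, `[0, 1, 0]`, which is hard to guess from $S$ alone but falls out directly from the product $P^{-1}SP$.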

Properties
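Similarity is an equivalence relation on the set of $n$-by-$n$ matrices: it is reflexive (take $P = I$), symmetric (replace $P$ by $P^{-1}$), and transitive (multiply the two change-of-basis matrices).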

Because matrices are similar if and only if they represent the same linear operator with respect to (possibly) different bases, similar matrices share all properties of their shared underlying operator:


 * Rank
 * Characteristic polynomial, and attributes that can be derived from it:
   * Determinant
   * Trace
   * Eigenvalues, and their algebraic multiplicities
 * Geometric multiplicities of eigenvalues (but not the eigenspaces, which are transformed according to the change-of-basis matrix $P$ used)
 * Minimal polynomial
 * Frobenius normal form
 * Jordan normal form, up to a permutation of the Jordan blocks
 * Index of nilpotence
 * Elementary divisors, which form a complete set of invariants for similarity of matrices over a principal ideal domain
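
A few of these invariants can be checked on a concrete pair. The sketch below uses exact rational arithmetic from Python's standard library. The matrices `A` and `P` and the helper names are chosen for this illustration only, and the inverse is hard-coded for the 2-by-2 case.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Exact inverse of a 2-by-2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(M):
    return M[0][0] + M[1][1]

def determinant(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def char_poly_coeffs(M):
    """Coefficients of det(X*I - M) = X^2 - tr(M)*X + det(M) for 2-by-2 M."""
    return (1, -trace(M), determinant(M))

A = [[Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(3)]]
P = [[Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(3)]]

# B is similar to A by construction.
B = matmul(inverse_2x2(P), matmul(A, P))

same_trace = trace(A) == trace(B)
same_det = determinant(A) == determinant(B)
same_char_poly = char_poly_coeffs(A) == char_poly_coeffs(B)
```

Here `B` works out to the matrix with rows `[3, 3]` and `[0, 2]`: its entries differ from those of `A`, yet trace (5), determinant (6), and characteristic polynomial $X^2 - 5X + 6$ agree, so both matrices have eigenvalues 2 and 3.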

Because of this, for a given matrix $A$, one is interested in finding a simple "normal form" $B$ which is similar to $A$; the study of $A$ then reduces to the study of the simpler matrix $B$. For example, $A$ is called diagonalizable if it is similar to a diagonal matrix. Not all matrices are diagonalizable, but at least over the complex numbers (or any algebraically closed field), every matrix is similar to a matrix in Jordan form. Neither of these forms is unique (diagonal entries or Jordan blocks may be permuted), so they are not really normal forms; moreover, their determination depends on being able to factor the minimal or characteristic polynomial of $A$ (equivalently, to find its eigenvalues).

The rational canonical form does not have these drawbacks: it exists over any field, is truly unique, and can be computed using only arithmetic operations in the field; $A$ and $B$ are similar if and only if they have the same rational canonical form. The rational canonical form is determined by the elementary divisors of $A$; these can be immediately read off from a matrix in Jordan form, but they can also be determined directly for any matrix by computing the Smith normal form, over the ring of polynomials, of the matrix (with polynomial entries) $XI_{n} - A$ (the same one whose determinant defines the characteristic polynomial). Note that this Smith normal form is not a normal form of $A$ itself; moreover, it is not similar to $XI_{n} - A$ either, but is obtained from the latter by left and right multiplications by different invertible matrices (with polynomial entries).
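
A small worked example, with matrices chosen here for illustration, shows how the Smith normal form separates two matrices that share a characteristic polynomial. Take $$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$ both with characteristic polynomial $(X-1)^2$. For $A$, $$XI_2 - A = \begin{bmatrix} X-1 & -1 \\ 0 & X-1 \end{bmatrix}.$$ The greatest common divisor of the entries is 1 (the entry $-1$ is a unit), and the determinant is $(X-1)^2$, so the Smith normal form is $$\begin{bmatrix} 1 & 0 \\ 0 & (X-1)^2 \end{bmatrix}.$$ For the identity matrix, $XI_2 - I_2 = \begin{bmatrix} X-1 & 0 \\ 0 & X-1 \end{bmatrix}$ is already in Smith normal form. The two Smith normal forms differ, so $A$ and $I_2$ are not similar, and in particular $A$ is not diagonalizable.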

Similarity of matrices does not depend on the base field: if $L$ is a field containing $K$ as a subfield, and $A$ and $B$ are two matrices over $K$, then $A$ and $B$ are similar as matrices over $K$ if and only if they are similar as matrices over $L$. This is so because the rational canonical form over $K$ is also the rational canonical form over $L$. This means that one may use Jordan forms that only exist over a larger field to determine whether the given matrices are similar.
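
The following sketch illustrates the statement with $K$ the real numbers and $L$ the complex numbers. The two real matrices, the eigenvector matrices `Q_A` and `Q_B`, and the real matrix `P_real` are all picked by hand for this example; the code only verifies them and does not search for them.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2-by-2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def conjugate_by(P, A):
    """Return P^{-1} A P."""
    return matmul(inverse_2x2(P), matmul(A, P))

# Two real matrices with no real eigenvalues (rotations by +90 and -90 degrees).
A = [[0, -1], [1, 0]]
B = [[0, 1], [-1, 0]]

# Over the complex numbers, both are similar to diag(i, -i):
# the columns of Q_A and Q_B are eigenvectors for i and -i.
Q_A = [[1, 1], [-1j, 1j]]
Q_B = [[1, 1], [1j, -1j]]
D_A = conjugate_by(Q_A, A)
D_B = conjugate_by(Q_B, B)

# Since A and B are similar over C, a real conjugating matrix must also exist.
P_real = [[1, 0], [0, -1]]
B_from_A = conjugate_by(P_real, A)
```

Both `D_A` and `D_B` come out as the diagonal matrix with entries $i$ and $-i$, a Jordan form that exists only over the complex numbers, while `B_from_A` reproduces `B` using real entries alone.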

In the definition of similarity, if the matrix $P$ can be chosen to be a permutation matrix then $A$ and $B$ are permutation-similar; if $P$ can be chosen to be a unitary matrix then $A$ and $B$ are unitarily equivalent. The spectral theorem says that every normal matrix is unitarily equivalent to some diagonal matrix. Specht's theorem states that two matrices are unitarily equivalent if and only if they satisfy certain trace equalities.
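
Permutation similarity has a concrete reading: conjugating by a permutation matrix reorders the rows and the columns of $A$ by the same permutation. A minimal sketch follows, in which the helper `permutation_matrix` and the sample data are assumptions made for this illustration. It uses the fact that the inverse of a permutation matrix is its transpose.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def permutation_matrix(perm):
    """Matrix whose column j is the standard basis vector e_{perm[j]}."""
    n = len(perm)
    P = [[0] * n for _ in range(n)]
    for j, i in enumerate(perm):
        P[i][j] = 1
    return P

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
perm = [2, 0, 1]
P = permutation_matrix(perm)

# For a permutation matrix, P^{-1} equals the transpose of P.
B = matmul(transpose(P), matmul(A, P))

# The same result, read off entry by entry: B[i][j] = A[perm[i]][perm[j]].
B_direct = [[A[perm[i]][perm[j]] for j in range(3)] for i in range(3)]
```

With this data `B` has rows `[9, 7, 8]`, `[3, 1, 2]`, `[6, 4, 5]`, which matches `B_direct`: the diagonal entries of `A` stay on the diagonal and are only reordered.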
