Matrix splitting

In the mathematical discipline of numerical linear algebra, a matrix splitting is an expression which represents a given matrix as a sum or difference of matrices. Many iterative methods (for example, for systems of differential equations) depend upon the direct solution of matrix equations involving matrices more general than tridiagonal matrices. These matrix equations can often be solved directly and efficiently when written as a matrix splitting. The technique was devised by Richard S. Varga in 1960.

Regular splittings
We seek to solve the matrix equation

$$\mathbf{A x} = \mathbf k, \qquad (1)$$

where A is a given n × n non-singular matrix, and k is a given column vector with n components. We split the matrix A into

$$\mathbf A = \mathbf B - \mathbf C, \qquad (2)$$

where B and C are n × n matrices. If an arbitrary n × n matrix M has only nonnegative entries, we write M ≥ 0. If M has only positive entries, we write M > 0. Similarly, if the matrix $$\mathbf M_1 - \mathbf M_2$$ has nonnegative entries, we write $$\mathbf M_1 \ge \mathbf M_2$$.

Definition: A = B − C is a regular splitting of A if $$\mathbf B^{-1}$$ ≥ 0 and C ≥ 0.
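The definition can be checked numerically. The following is a minimal sketch, assuming NumPy and small dense matrices; the helper name `is_regular_splitting`, the tolerance `tol`, and the sample matrices are illustrative choices, not part of the original text.

```python
import numpy as np

def is_regular_splitting(B, C, tol=1e-12):
    """Return True if A = B - C is a regular splitting: B^{-1} >= 0 and C >= 0."""
    B = np.asarray(B, dtype=float)
    C = np.asarray(C, dtype=float)
    try:
        B_inv = np.linalg.inv(B)
    except np.linalg.LinAlgError:
        return False  # B is singular, so B^{-1} does not exist
    # Entrywise nonnegativity, allowing rounding noise of size tol.
    return bool(np.all(B_inv >= -tol) and np.all(C >= -tol))

# Hypothetical 2 x 2 examples (assumptions for illustration only)
B_good = np.array([[2.0, 0.0], [0.0, 3.0]])   # inverse is diag(1/2, 1/3) >= 0
B_bad = np.array([[1.0, 1.0], [0.0, 1.0]])    # inverse has the entry -1
C = np.array([[0.0, 1.0], [1.0, 0.0]])

print(is_regular_splitting(B_good, C))
print(is_regular_splitting(B_bad, C))
```

Forming the inverse explicitly is only reasonable for small matrices; it is used here because the definition is stated in terms of the inverse of B.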

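We assume that matrix equations of the form

$$\mathbf{B x} = \mathbf g, \qquad (3)$$

where g is a given column vector, can be solved directly for the vector x. If the splitting A = B − C in (2) is a regular splitting of A, then the iterative method

$$\mathbf{B x}^{(m+1)} = \mathbf{C x}^{(m)} + \mathbf k, \quad m = 0, 1, 2, \ldots, \qquad (4)$$

where $$\mathbf x^{(0)}$$ is an arbitrary vector, can be carried out. Equivalently, we write (4) in the form

$$\mathbf x^{(m+1)} = \mathbf B^{-1} \mathbf{C x}^{(m)} + \mathbf B^{-1} \mathbf k, \quad m = 0, 1, 2, \ldots \qquad (5)$$

The matrix $$\mathbf D = \mathbf B^{-1} \mathbf C$$ has nonnegative entries if (2) represents a regular splitting of A.

It can be shown that if the splitting is regular and $$\mathbf A^{-1}$$ > 0, then $$\rho (\mathbf D)$$ < 1, where $$\rho (\mathbf D)$$ represents the spectral radius of D, and thus D is a convergent matrix. As a consequence, the iterative method (5) is necessarily convergent.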
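The step from a spectral radius below 1 to convergence can be sketched as follows. The error vector $$\mathbf e^{(m)}$$ is notation introduced here for illustration, and the last line is quoted as a standard result from the theory of regular splittings rather than derived. Since the exact solution satisfies $$\mathbf x = \mathbf{D x} + \mathbf B^{-1} \mathbf k$$, subtracting this from the iteration gives

$$\begin{align}
\mathbf e^{(m)} &:= \mathbf x^{(m)} - \mathbf x, \\
\mathbf e^{(m+1)} &= \mathbf D \mathbf e^{(m)} \quad \Longrightarrow \quad \mathbf e^{(m)} = \mathbf D^{m} \mathbf e^{(0)}, \\
\rho (\mathbf D) &= \frac{\rho (\mathbf A^{-1} \mathbf C)}{1 + \rho (\mathbf A^{-1} \mathbf C)} < 1.
\end{align}$$

The powers $$\mathbf D^{m}$$ tend to the zero matrix exactly when $$\rho (\mathbf D)$$ < 1, so the error vanishes for every starting vector.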

If, in addition, the splitting (2) is chosen so that the matrix B is a diagonal matrix (with the diagonal entries all non-zero, since B must be invertible), then B can be inverted in linear time (see Time complexity).
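The iteration described in this section can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name `splitting_iteration`, the stopping rule, and the 2 × 2 test system are assumptions made for the example and do not come from the text.

```python
import numpy as np

def splitting_iteration(B, C, k, x0=None, tol=1e-10, max_iter=1000):
    """Iterate B x_{m+1} = C x_m + k until successive iterates differ by < tol.

    Returns (x, number_of_iterations).
    """
    B = np.asarray(B, dtype=float)
    C = np.asarray(C, dtype=float)
    k = np.asarray(k, dtype=float)
    x = np.zeros_like(k) if x0 is None else np.asarray(x0, dtype=float)
    for m in range(1, max_iter + 1):
        # np.linalg.solve is a general dense solver; a real implementation
        # would exploit the structure of B (diagonal or triangular).
        x_new = np.linalg.solve(B, C @ x + k)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new, m
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical system: A = B - C with B the diagonal part of A
A = np.array([[4.0, -1.0], [-2.0, 5.0]])
k = np.array([3.0, 3.0])
B = np.diag(np.diag(A))
C = B - A
x, iters = splitting_iteration(B, C, k)
print(x, iters)
```

Choosing B diagonal makes each step a componentwise division, which is the linear-time case mentioned above.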

Matrix iterative methods
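Many iterative methods can be described as a matrix splitting. If the diagonal entries of the matrix A are all nonzero, and we express the matrix A as the matrix sum

$$\mathbf A = \mathbf D - \mathbf U - \mathbf L, \qquad (6)$$

where D is the diagonal part of A, and U and L are respectively strictly upper and lower triangular n × n matrices, then we have the following.

The Jacobi method can be represented in matrix form as a splitting with B = D and C = U + L:

$$\mathbf x^{(m+1)} = \mathbf D^{-1} (\mathbf U + \mathbf L) \mathbf x^{(m)} + \mathbf D^{-1} \mathbf k. \qquad (7)$$

The Gauss–Seidel method can be represented in matrix form as a splitting with B = D − L and C = U:

$$\mathbf x^{(m+1)} = (\mathbf D - \mathbf L)^{-1} \mathbf U \mathbf x^{(m)} + (\mathbf D - \mathbf L)^{-1} \mathbf k. \qquad (8)$$

The method of successive over-relaxation can be represented in matrix form as a splitting of ωA with B = D − ωL and C = (1 − ω)D + ωU:

$$\mathbf x^{(m+1)} = (\mathbf D - \omega \mathbf L)^{-1} [(1 - \omega) \mathbf D + \omega \mathbf U] \mathbf x^{(m)} + \omega (\mathbf D - \omega \mathbf L)^{-1} \mathbf k. \qquad (9)$$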
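The three splittings can be built directly from the diagonal and strictly triangular parts of A. The sketch below assumes NumPy and the sign convention A = D − U − L used in this section; the function names and the sample matrix are hypothetical.

```python
import numpy as np

def dlu_parts(A):
    """Return D, L, U with A = D - U - L (L, U strictly lower/upper triangular)."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.diag(A))
    L = -np.tril(A, k=-1)
    U = -np.triu(A, k=1)
    return D, L, U

def jacobi_splitting(A):
    D, L, U = dlu_parts(A)
    return D, U + L                       # A = D - (U + L)

def gauss_seidel_splitting(A):
    D, L, U = dlu_parts(A)
    return D - L, U                       # A = (D - L) - U

def sor_splitting(A, omega):
    D, L, U = dlu_parts(A)
    # omega * A = (D - omega L) - [(1 - omega) D + omega U]
    return D - omega * L, (1.0 - omega) * D + omega * U

# Hypothetical test matrix (an assumption for illustration)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
B, C = sor_splitting(A, omega=1.2)
print(np.allclose(B - C, 1.2 * A))
```

Note that the successive over-relaxation pair splits ωA rather than A, which leaves the solution of the linear system unchanged because the right-hand side is scaled by ω as well.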

Example: regular splitting
In the equation $$\mathbf{A x} = \mathbf k$$ (1), let

$$\mathbf A = \begin{pmatrix} 6 & -2 & -3 \\ -1 & 4 & -2 \\ -3 & -1 & 5 \end{pmatrix}, \quad \mathbf k = \begin{pmatrix} 5 \\ -12 \\ 10 \end{pmatrix}. \qquad (10)$$

Let us apply the splitting (7) used in the Jacobi method: we split A in such a way that B consists of all of the diagonal elements of A, and C consists of all of the off-diagonal elements of A, negated. (Of course this is not the only useful way to split a matrix into two matrices.) We have

$$\mathbf B = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \quad \mathbf C = \begin{pmatrix} 0 & 2 & 3 \\ 1 & 0 & 2 \\ 3 & 1 & 0 \end{pmatrix}, \qquad (11)$$

$$\mathbf A^{-1} = \frac{1}{47} \begin{pmatrix} 18 & 13 & 16 \\ 11 & 21 & 15 \\ 13 & 12 & 22 \end{pmatrix}, \quad \mathbf B^{-1} = \begin{pmatrix} \frac{1}{6} & 0 & 0 \\[4pt] 0 & \frac{1}{4} & 0 \\[4pt] 0 & 0 & \frac{1}{5} \end{pmatrix},$$

$$\mathbf D = \mathbf B^{-1} \mathbf C = \begin{pmatrix} 0 & \frac{1}{3} & \frac{1}{2} \\[4pt] \frac{1}{4} & 0 & \frac{1}{2} \\[4pt] \frac{3}{5} & \frac{1}{5} & 0 \end{pmatrix}, \quad \mathbf B^{-1} \mathbf k = \begin{pmatrix} \frac{5}{6} \\[4pt] -3 \\[4pt] 2 \end{pmatrix}.$$

Since $$\mathbf B^{-1}$$ ≥ 0 and C ≥ 0, the splitting (11) is a regular splitting. Since $$\mathbf A^{-1}$$ > 0, the spectral radius $$\rho (\mathbf D)$$ < 1. (The approximate eigenvalues of D are $$\lambda_i \approx -0.4599820, -0.3397859, 0.7997679.$$) Hence, the matrix D is convergent and the iterative method (5) necessarily converges for the problem (10). Note that the diagonal elements of A are all greater than zero, the off-diagonal elements of A are all less than zero, and A is strictly diagonally dominant.
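The quoted eigenvalues and inverse can be reproduced numerically. The sketch below assumes NumPy and starts only from the displayed $$\mathbf B^{-1}$$ and D, recovering C = BD and A = B − C from them. The final comparison relies on a standard identity for regular splittings with a nonnegative inverse of A, which is an addition here and not stated in the text.

```python
import numpy as np

B = np.diag([6.0, 4.0, 5.0])                # inverse of the displayed B^{-1}
D = np.array([[0.0, 1/3, 1/2],
              [1/4, 0.0, 1/2],
              [3/5, 1/5, 0.0]])
C = B @ D                                   # because D = B^{-1} C
A = B - C

# All three eigenvalues of this D are real, so the real part is exact.
eigenvalues = np.sort(np.linalg.eigvals(D).real)
rho_D = np.max(np.abs(eigenvalues))

# Identity used as a cross-check: rho(D) = tau / (1 + tau), tau = rho(A^{-1} C)
tau = np.max(np.abs(np.linalg.eigvals(np.linalg.inv(A) @ C)))

print("eigenvalues of D:", eigenvalues)
print("spectral radius :", rho_D)
print("tau / (1 + tau) :", tau / (1.0 + tau))
print("47 * A^{-1}     :")
print(np.round(47 * np.linalg.inv(A), 6))
```

A spectral radius of about 0.8 means the error shrinks by roughly one fifth per iteration, which is consistent with the slow convergence noted for this splitting.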

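The iterative method (5) applied to the problem (10) then takes the form

$$\mathbf x^{(m+1)} = \begin{pmatrix} 0 & \frac{1}{3} & \frac{1}{2} \\[4pt] \frac{1}{4} & 0 & \frac{1}{2} \\[4pt] \frac{3}{5} & \frac{1}{5} & 0 \end{pmatrix} \mathbf x^{(m)} + \begin{pmatrix} \frac{5}{6} \\[4pt] -3 \\[4pt] 2 \end{pmatrix}, \quad m = 0, 1, 2, \ldots \qquad (12)$$

The exact solution to equation (12), that is, its fixed point and the solution of the problem (10), is

$$\mathbf x = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix}. \qquad (13)$$

The first few iterates for equation (12) are listed in the table below, beginning with $$\mathbf x^{(0)} = (0.0, 0.0, 0.0)^{T}$$. From the table one can see that the method is evidently converging to the solution (13), albeit rather slowly.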
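The iterates can be regenerated from the matrix D and the vector $$\mathbf B^{-1} \mathbf k$$ displayed above. This is a minimal sketch, assuming NumPy; the function name `jacobi_iterates` and the choice of ten iterations are illustrative, and the exact solution is computed as the fixed point of the iteration rather than hard-coded.

```python
import numpy as np

D = np.array([[0.0, 1/3, 1/2],
              [1/4, 0.0, 1/2],
              [3/5, 1/5, 0.0]])
c = np.array([5/6, -3.0, 2.0])              # the vector B^{-1} k

def jacobi_iterates(D, c, x0, count):
    """Return [x^(0), ..., x^(count)] for the iteration x^(m+1) = D x^(m) + c."""
    x = np.array(x0, dtype=float)
    iterates = [x.copy()]
    for _ in range(count):
        x = D @ x + c
        iterates.append(x.copy())
    return iterates

# The fixed point satisfies (I - D) x = c.
x_exact = np.linalg.solve(np.eye(3) - D, c)

iterates = jacobi_iterates(D, c, [0.0, 0.0, 0.0], 10)
errors = [np.max(np.abs(x - x_exact)) for x in iterates]
for m, (x, err) in enumerate(zip(iterates, errors)):
    print(m, np.round(x, 5), f"max error = {err:.3e}")
```

Printing the maximum error alongside each iterate makes the rate of convergence visible without relying on a fixed table.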

Jacobi method
As stated above, the Jacobi method (7) coincides with the specific regular splitting (11) demonstrated in the preceding example.

Gauss–Seidel method
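Since the diagonal entries of the matrix A in problem (10) are all nonzero, we can express the matrix A as the splitting A = D − U − L of (6), where

$$\mathbf D = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \quad \mathbf U = \begin{pmatrix} 0 & 2 & 3 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}, \quad \mathbf L = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 3 & 1 & 0 \end{pmatrix}. \qquad (14)$$

We then have

$$(\mathbf D - \mathbf L)^{-1} = \frac{1}{120} \begin{pmatrix} 20 & 0 & 0 \\ 5 & 30 & 0 \\ 13 & 6 & 24 \end{pmatrix},$$

$$(\mathbf D - \mathbf L)^{-1} \mathbf U = \frac{1}{120} \begin{pmatrix} 0 & 40 & 60 \\ 0 & 10 & 75 \\ 0 & 26 & 51 \end{pmatrix}, \quad (\mathbf D - \mathbf L)^{-1} \mathbf k = \frac{1}{120} \begin{pmatrix} 100 \\ -335 \\ 233 \end{pmatrix}.$$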

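The Gauss–Seidel method (8) applied to the problem (10) takes the form

$$\mathbf x^{(m+1)} = \frac{1}{120} \begin{pmatrix} 0 & 40 & 60 \\ 0 & 10 & 75 \\ 0 & 26 & 51 \end{pmatrix} \mathbf x^{(m)} + \frac{1}{120} \begin{pmatrix} 100 \\ -335 \\ 233 \end{pmatrix}, \quad m = 0, 1, 2, \ldots \qquad (15)$$

The first few iterates for equation (15) are listed in the table below, beginning with $$\mathbf x^{(0)} = (0.0, 0.0, 0.0)^{T}$$. From the table one can see that the method is evidently converging to the solution (13), somewhat faster than the Jacobi method described above.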
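In practice the Gauss–Seidel method is usually applied one component at a time, without forming any inverse. The sketch below assumes NumPy and takes A and k to be the example's data, which are consistent with the matrices displayed above; the function name `gauss_seidel_sweep` is hypothetical.

```python
import numpy as np

A = np.array([[6.0, -2.0, -3.0],
              [-1.0, 4.0, -2.0],
              [-3.0, -1.0, 5.0]])
k = np.array([5.0, -12.0, 10.0])

def gauss_seidel_sweep(A, k, x):
    """One Gauss-Seidel sweep: each component uses the newest available values."""
    x = np.array(x, dtype=float)
    for i in range(len(k)):
        # Entries x[:i] were already updated in this sweep; x[i+1:] are old.
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = (k[i] - s) / A[i, i]
    return x

x = np.zeros(3)
for m in range(1, 11):
    x = gauss_seidel_sweep(A, k, x)
    print(m, np.round(x, 5))
```

One sweep is algebraically the same as multiplying by $$(\mathbf D - \mathbf L)^{-1} \mathbf U$$ and adding $$(\mathbf D - \mathbf L)^{-1} \mathbf k$$, since solving with the lower triangular matrix D − L amounts to forward substitution.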

Successive over-relaxation method
Let ω = 1.1. Using the matrices D, U and L of (14) for the matrix A in problem (10) in the successive over-relaxation method, we have

$$(\mathbf D - \omega \mathbf L)^{-1} = \frac{1}{12} \begin{pmatrix} 2 & 0 & 0 \\ 0.55 & 3 & 0 \\ 1.441 & 0.66 & 2.4 \end{pmatrix},$$

$$(\mathbf D - \omega \mathbf L)^{-1} [(1 - \omega) \mathbf D + \omega \mathbf U] = \frac{1}{12} \begin{pmatrix} -1.2 & 4.4 & 6.6 \\ -0.33 & 0.01 & 8.415 \\ -0.8646 & 2.9062 & 5.0073 \end{pmatrix},$$

$$\omega (\mathbf D - \omega \mathbf L)^{-1} \mathbf k = \frac{1}{12} \begin{pmatrix} 11 \\ -36.575 \\ 25.6135 \end{pmatrix}.$$

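The successive over-relaxation method (9) applied to the problem (10) takes the form

$$\mathbf x^{(m+1)} = \frac{1}{12} \begin{pmatrix} -1.2 & 4.4 & 6.6 \\ -0.33 & 0.01 & 8.415 \\ -0.8646 & 2.9062 & 5.0073 \end{pmatrix} \mathbf x^{(m)} + \frac{1}{12} \begin{pmatrix} 11 \\ -36.575 \\ 25.6135 \end{pmatrix}, \quad m = 0, 1, 2, \ldots \qquad (16)$$

The first few iterates for equation (16) are listed in the table below, beginning with $$\mathbf x^{(0)} = (0.0, 0.0, 0.0)^{T}$$. From the table one can see that the method is evidently converging to the solution (13), slightly faster than the Gauss–Seidel method described above.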
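The comparison with the Gauss–Seidel method can be made concrete by counting sweeps. The following is a sketch under stated assumptions: NumPy, the example's A and k, a componentwise relaxed sweep, and a stopping rule based on the change between successive iterates. The names `sor_sweep` and `sweeps_to_converge` and the tolerance are hypothetical choices.

```python
import numpy as np

A = np.array([[6.0, -2.0, -3.0],
              [-1.0, 4.0, -2.0],
              [-3.0, -1.0, 5.0]])
k = np.array([5.0, -12.0, 10.0])

def sor_sweep(A, k, x, omega):
    """One SOR sweep: blend the old value with the Gauss-Seidel update."""
    x = np.array(x, dtype=float)
    for i in range(len(k)):
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        gs_value = (k[i] - s) / A[i, i]
        x[i] = (1.0 - omega) * x[i] + omega * gs_value
    return x

def sweeps_to_converge(A, k, omega, tol=1e-10, max_sweeps=1000):
    """Return (x, sweeps) once successive iterates differ by less than tol."""
    x = np.zeros_like(k)
    for m in range(1, max_sweeps + 1):
        x_new = sor_sweep(A, k, x, omega)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new, m
        x = x_new
    raise RuntimeError("no convergence within max_sweeps sweeps")

x_gs, n_gs = sweeps_to_converge(A, k, omega=1.0)    # omega = 1 is Gauss-Seidel
x_sor, n_sor = sweeps_to_converge(A, k, omega=1.1)
print("Gauss-Seidel sweeps:", n_gs)
print("SOR (1.1) sweeps   :", n_sor)
```

Setting ω = 1 recovers the Gauss–Seidel method, so a single routine covers both cases; the value 1.1 is the one used in the text and is not claimed to be optimal.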