Divided differences

In mathematics, divided differences is an algorithm, historically used for computing tables of logarithms and trigonometric functions. Charles Babbage's difference engine, an early mechanical calculator, was designed to use this algorithm in its operation.

Divided differences is a recursive division process. Given a sequence of data points $$(x_0, y_0), \ldots, (x_{n}, y_{n})$$, the method calculates the coefficients of the interpolation polynomial of these points in the Newton form.

Definition
Given n + 1 data points $$(x_0, y_0),\ldots,(x_{n}, y_{n})$$ where the $$x_k$$ are assumed to be pairwise distinct, the forward divided differences are defined as: $$\begin{align} \mathopen[y_k] &:= y_k, && k \in \{ 0,\ldots,n\} \\ \mathopen[y_k,\ldots,y_{k+j}] &:= \frac{[y_{k+1},\ldots, y_{k+j}] - [y_{k},\ldots , y_{k+j-1}]}{x_{k+j}-x_k}, && k\in\{0,\ldots,n-j\},\ j\in\{1,\ldots,n\}. \end{align}$$

To make the recursive process of computation clearer, the divided differences can be put in tabular form, where the columns correspond to the value of j above, and each entry in the table is computed from the difference of the entries to its immediate lower left and to its immediate upper left, divided by a difference of corresponding x-values: $$ \begin{matrix} x_0 & y_0 = [y_0] &          &               & \\ &      & [y_0,y_1] &               & \\ x_1 & y_1 = [y_1] &          & [y_0,y_1,y_2] & \\ &      & [y_1,y_2] &               & [y_0,y_1,y_2,y_3]\\ x_2 & y_2 = [y_2] &          & [y_1,y_2,y_3] & \\ &      & [y_2,y_3] &               & \\ x_3 & y_3 = [y_3] &          &               & \\ \end{matrix} $$

Notation
Note that the divided difference $$[y_k,\ldots,y_{k+j}]$$ depends on the values $$x_k,\ldots,x_{k+j}$$ and $$y_k,\ldots,y_{k+j}$$, but the notation hides the dependency on the x-values. If the data points are given by a function f, $$(x_0, y_0), \ldots, (x_{k}, y_n) =(x_0, f(x_0)), \ldots, (x_n, f(x_n))$$ one sometimes writes the divided difference in the notation $$f[x_k,\ldots,x_{k+j}] \ \stackrel{\text{def}}= \ [f(x_k),\ldots,f(x_{k+j})] = [y_k,\ldots,y_{k+j}].$$Other notations for the divided difference of the function &fnof; on the nodes x0, ..., xn are: $$f[x_k,\ldots,x_{k+j}]=\mathopen[x_0,\ldots,x_n]f= \mathopen[x_0,\ldots,x_n;f]= D[x_0,\ldots,x_n]f.$$

Example
Divided differences for $$k=0$$ and the first few values of $$j$$: $$ \begin{align} \mathopen[y_0] &= y_0 \\ \mathopen[y_0,y_1] &= \frac{y_1-y_0}{x_1-x_0} \\ \mathopen[y_0,y_1,y_2] &= \frac{\mathopen[y_1,y_2]-\mathopen[y_0,y_1]}{x_2-x_0} = \frac{\frac{y_2-y_1}{x_2-x_1}-\frac{y_1-y_0}{x_1-x_0}}{x_2-x_0} = \frac{y_2-y_1}{(x_2-x_1)(x_2-x_0)}-\frac{y_1-y_0}{(x_1-x_0)(x_2-x_0)} \\ \mathopen[y_0,y_1,y_2,y_3] &= \frac{\mathopen[y_1,y_2,y_3]-\mathopen[y_0,y_1,y_2]}{x_3-x_0} \end{align} $$

Thus, the table corresponding to these terms upto two columns has the following form: $$\begin{matrix} x_0 & y_{0} &                                & \\ &       & {y_{1}-y_{0}\over x_1 - x_0}  & \\ x_1 & y_{1} &                                & {{y_{2}-y_{1}\over x_2 - x_1}-{y_{1}-y_{0}\over x_1 - x_0} \over x_2 - x_0} \\ &       & {y_{2}-y_{1}\over x_2 - x_1}  & \\ x_2 & y_{2} &                                & \vdots \\ &       & \vdots                          & \\ \vdots &       &                                 & \vdots \\ &       & \vdots                          & \\ x_n & y_{n} &                                & \\ \end{matrix} $$

Properties
(f+g)[x_0,\dots,x_n] &= f[x_0,\dots,x_n] + g[x_0,\dots,x_n] \\ (\lambda\cdot f)[x_0,\dots,x_n] &= \lambda\cdot f[x_0,\dots,x_n] \end{align}$$
 * Linearity $$\begin{align}
 * Leibniz rule $$(f\cdot g)[x_0,\dots,x_n] = f[x_0]\cdot g[x_0,\dots,x_n] + f[x_0,x_1]\cdot g[x_1,\dots,x_n] + \dots + f[x_0,\dots,x_n]\cdot g[x_n] = \sum_{r=0}^n f[x_0,\ldots,x_r]\cdot g[x_r,\ldots,x_n]$$
 * Divided differences are symmetric: If $$\sigma : \{0, \dots, n\} \to \{0, \dots, n\}$$ is a permutation then $$f[x_0, \dots, x_n] = f[x_{\sigma(0)}, \dots, x_{\sigma(n)}]$$
 * Polynomial interpolation in the Newton form: if $$P$$ is a polynomial function of degree $$\leq n$$, and $$p[x_0, \dots, x_n]$$ is the divided difference, then $$P_{n-1}(x) = p[x_0] + p[x_0,x_1](x-x_0) + p[x_0,x_1,x_2](x-x_0)(x-x_1) + \cdots + p[x_0,\ldots,x_n] (x-x_0)(x-x_1)\cdots(x-x_{n-1})$$
 * If $$p$$ is a polynomial function of degree $$<n$$, then $$p[x_0, \dots, x_n] = 0.$$
 * Mean value theorem for divided differences: if $$f$$ is n times differentiable, then $$f[x_0,\dots,x_n] = \frac{f^{(n)}(\xi)}{n!}$$ for a number $$\xi$$ in the open interval determined by the smallest and largest of the $$x_k$$'s.

Matrix form
The divided difference scheme can be put into an upper triangular matrix: $$T_f(x_0,\dots,x_n)= \begin{pmatrix} f[x_0] & f[x_0,x_1] & f[x_0,x_1,x_2] & \ldots & f[x_0,\dots,x_n] \\ 0     & f[x_1]     & f[x_1,x_2]     & \ldots & f[x_1,\dots,x_n] \\ 0     & 0          & f[x_2]         & \ldots & f[x_2,\dots,x_n] \\ \vdots & \vdots    &                & \ddots & \vdots           \\ 0     & 0          & 0              & \ldots & f[x_n] \end{pmatrix}.$$

Then it holds 1 &: t=\xi, \\ 0 &: \mbox{else}. \end{cases}$$ Obviously $$f\cdot \delta_\xi = f(\xi)\cdot \delta_\xi$$, thus $$\delta_\xi$$ is an eigenfunction of the pointwise function multiplication. That is $$T_{\delta_{x_i}}(x)$$ is somehow an "eigenmatrix" of $$T_f(x)$$: $$ T_f(x) \cdot T_{\delta_{x_i}} (x) = f(x_i) \cdot T_{\delta_{x_i}}(x) $$. However, all columns of $$T_{\delta_{x_i}}(x)$$ are multiples of each other, the matrix rank of $$T_{\delta_{x_i}}(x)$$ is 1. So you can compose the matrix of all eigenvectors of $$T_f(x)$$ from the $$i$$-th column of each $$T_{\delta_{x_i}}(x)$$. Denote the matrix of eigenvectors with $$U(x)$$. Example $$ U(x_0,x_1,x_2,x_3) = \begin{pmatrix} 1 & \frac{1}{(x_1-x_0)} & \frac{1}{(x_2-x_0) (x_2-x_1)} & \frac{1}{(x_3-x_0) (x_3-x_1) (x_3-x_2)} \\ 0 & 1 & \frac{1}{(x_2-x_1)} & \frac{1}{(x_3-x_1) (x_3-x_2)} \\ 0 & 0 & 1 & \frac{1}{(x_3-x_2)} \\ 0 & 0 & 0 & 1 \end{pmatrix} $$ The diagonalization of $$T_f(x)$$ can be written as $$ U(x) \cdot \operatorname{diag}(f(x_0),\dots,f(x_n)) = T_f(x) \cdot U(x) .$$
 * $$T_{f+g}(x) = T_f(x) + T_g(x)$$
 * $$T_{\lambda f}(x) = \lambda T_f(x)$$ if $$\lambda$$ is a scalar
 * $$T_{f\cdot g}(x) = T_f(x) \cdot T_g(x)$$ This follows from the Leibniz rule. It means that multiplication of such matrices is commutative. Summarised, the matrices of divided difference schemes with respect to the same set of nodes x form a commutative ring.
 * Since $$ T_f(x) $$ is a triangular matrix, its eigenvalues are obviously $$ f(x_0), \dots, f(x_n) $$.
 * Let $$\delta_\xi$$ be a Kronecker delta-like function, that is $$\delta_\xi(t) = \begin{cases}

Polynomials and power series
The matrix $$ J = \begin{pmatrix} x_0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & x_1 & 1 & 0 & \cdots & 0 \\ 0 & 0 & x_2 & 1 &       & 0 \\ \vdots & \vdots & & \ddots & \ddots & \\ 0 & 0 & 0 & 0 &   \; \ddots     & 1\\ 0 & 0 & 0 & 0 &         & x_n \end{pmatrix} $$ contains the divided difference scheme for the identity function with respect to the nodes $$x_0,\dots,x_n$$, thus $$J^m$$ contains the divided differences for the power function with exponent $$m$$. Consequently, you can obtain the divided differences for a polynomial function $$p$$ by applying $$p$$ to the matrix $$J$$: If $$p(\xi) = a_0 + a_1 \cdot \xi + \dots + a_m \cdot \xi^m$$ and $$p(J) = a_0 + a_1\cdot J + \dots + a_m\cdot J^m$$ then $$T_p(x) = p(J).$$ This is known as Opitz' formula.

Now consider increasing the degree of $$p$$ to infinity, i.e. turn the Taylor polynomial into a Taylor series. Let $$f$$ be a function which corresponds to a power series. You can compute the divided difference scheme for $$f$$ by applying the corresponding matrix series to $$J$$: If $$f(\xi) = \sum_{k=0}^\infty a_k \xi^k$$ and $$f(J)=\sum_{k=0}^\infty a_k J^k$$ then $$T_f(x)=f(J).$$

Expanded form
$$ \begin{align} f[x_0] &= f(x_0) \\ f[x_0,x_1] &= \frac{f(x_0)}{(x_0-x_1)} + \frac{f(x_1)}{(x_1-x_0)} \\ f[x_0,x_1,x_2] &= \frac{f(x_0)}{(x_0-x_1)\cdot(x_0-x_2)} + \frac{f(x_1)}{(x_1-x_0)\cdot(x_1-x_2)} + \frac{f(x_2)}{(x_2-x_0)\cdot(x_2-x_1)} \\ f[x_0,x_1,x_2,x_3] &= \frac{f(x_0)}{(x_0-x_1)\cdot(x_0-x_2)\cdot(x_0-x_3)} + \frac{f(x_1)}{(x_1-x_0)\cdot(x_1-x_2)\cdot(x_1-x_3)} +\\ &\quad\quad \frac{f(x_2)}{(x_2-x_0)\cdot(x_2-x_1)\cdot(x_2-x_3)} + \frac{f(x_3)}{(x_3-x_0)\cdot(x_3-x_1)\cdot(x_3-x_2)} \\ f[x_0,\dots,x_n] &= \sum_{j=0}^{n} \frac{f(x_j)}{\prod_{k\in\{0,\dots,n\}\setminus\{j\}} (x_j-x_k)} \end{align} $$

With the help of the polynomial function $$\omega(\xi) = (\xi-x_0) \cdots (\xi-x_n)$$ this can be written as $$ f[x_0,\dots,x_n] = \sum_{j=0}^{n} \frac{f(x_j)}{\omega'(x_j)}. $$

Peano form
If $$x_0<x_1<\cdots<x_n$$ and $$n\geq 1$$, the divided differences can be expressed as $$f[x_0,\ldots,x_n] = \frac{1}{(n-1)!} \int_{x_0}^{x_n} f^{(n)}(t)\;B_{n-1}(t) \, dt$$ where $$f^{(n)}$$ is the $$n$$-th derivative of the function $$f$$ and $$B_{n-1}$$ is a certain B-spline of degree $$n-1$$ for the data points $$x_0,\dots,x_n$$, given by the formula $$B_{n-1}(t) = \sum_{k=0}^n \frac{(\max(0,x_k-t))^{n-1}}{\omega'(x_k)}$$

This is a consequence of the Peano kernel theorem; it is called the Peano form of the divided differences and $$B_{n-1}$$ is the Peano kernel for the divided differences, all named after Giuseppe Peano.

Forward and backward differences
When the data points are equidistantly distributed we get the special case called forward differences. They are easier to calculate than the more general divided differences.

Given n+1 data points $$(x_0, y_0), \ldots, (x_n, y_n)$$ with $$x_{k} = x_0 + k h,\ \text{ for } \ k=0,\ldots,n \text{ and fixed } h>0$$ the forward differences are defined as $$\begin{align} \Delta^{(0)} y_k &:= y_k,\qquad k=0,\ldots,n \\ \Delta^{(j)}y_k &:= \Delta^{(j-1)}y_{k+1} - \Delta^{(j-1)}y_k,\qquad k=0,\ldots,n-j,\ j=1,\dots,n. \end{align}$$whereas the backward differences are defined as: $$\begin{align} \nabla^{(0)} y_k &:= y_k,\qquad k=0,\ldots,n \\ \nabla^{(j)}y_k &:= \nabla^{(j-1)}y_{k} - \nabla^{(j-1)}y_{k-1},\qquad k=0,\ldots,n-j,\ j=1,\dots,n. \end{align}$$ Thus the forward difference table is written as:$$ \begin{matrix} y_0 &              &                   &                  \\ & \Delta y_0 &                  &                  \\ y_1 &              & \Delta^2 y_0 &                  \\ & \Delta y_1 &                  & \Delta^3 y_0\\ y_2 &              & \Delta^2 y_1 &                  \\ & \Delta y_2 &                  &                  \\ y_3 &              &                   &                  \\ \end{matrix} $$whereas the backwards difference table is written as:$$ \begin{matrix} y_0 &              &                   &                  \\ & \nabla y_1 &                  &                  \\ y_1 &              & \nabla^2 y_2 &                  \\ & \nabla y_2 &                  & \nabla^3 y_3\\ y_2 &              & \nabla^2 y_3 &                  \\ & \nabla y_3 &                  &                  \\ y_3 &              &                   &                  \\ \end{matrix} $$

The relationship between divided differences and forward differences is $$[y_j, y_{j+1}, \ldots, y_{j+k}] = \frac{1}{k!h^k}\Delta^{(k)}y_j,   $$whereas for backward differences:$$[{y}_{j}, y_{j-1},\ldots,{y}_{j-k}] = \frac{1}{k!h^k}\nabla^{(k)}y_j. $$