Kolmogorov–Arnold representation theorem

In real analysis and approximation theory, the Kolmogorov–Arnold representation theorem (or superposition theorem) states that every multivariate continuous function $$f\colon[0,1]^n\to \R$$ can be represented as a superposition of continuous functions of one variable and the two-argument operation of addition. It solved a more constrained form of Hilbert's thirteenth problem, so the original Hilbert's thirteenth problem is a corollary.

The works of Vladimir Arnold and Andrey Kolmogorov established that if f is a multivariate continuous function, then f can be written as a finite composition of continuous functions of a single variable and the binary operation of addition. More specifically,


 * $$ f(\mathbf x) = f(x_1,\ldots ,x_n) = \sum_{q=0}^{2n} \Phi_{q}\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right) $$.

where $$\phi_{q,p}\colon[0,1]\to \R$$ and $$\Phi_{q}\colon \R \to \R$$ are continuous functions.

Constructive proofs, which give specific constructions of these functions, are also known.

In a sense, Kolmogorov and Arnold showed that the only true multivariate function is the sum, since every other continuous function can be written using univariate functions and summing.
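The shape of the representation can be illustrated with a minimal Python sketch. The helper `superposition` and the names `outer`, `inner` and `product` are hypothetical, introduced only for this illustration. The toy instance uses the elementary identity $$xy = (x+y)^2/4 - (x-y)^2/4$$, which writes the product with univariate functions and addition only. It is a hand-picked example, not the construction used in the proofs: it needs only two outer terms rather than $$2n+1$$, and in the theorem the inner functions can be chosen independently of $$f$$.

```python
from typing import Callable, Sequence

Univariate = Callable[[float], float]


def superposition(outer: Sequence[Univariate],
                  inner: Sequence[Sequence[Univariate]],
                  x: Sequence[float]) -> float:
    """Evaluate sum_q outer[q]( sum_p inner[q][p](x[p]) )."""
    total = 0.0
    for Phi_q, phis_q in zip(outer, inner):
        # Inner step: univariate functions of each coordinate, then addition.
        s = sum(phi(xp) for phi, xp in zip(phis_q, x))
        # Outer step: one more univariate function, then addition over q.
        total += Phi_q(s)
    return total


# Toy instance (an assumption for illustration, not Kolmogorov's construction):
# x * y = (x + y)^2 / 4 - (x - y)^2 / 4
outer = [lambda u: u * u / 4.0, lambda u: -u * u / 4.0]
inner = [
    [lambda x: x, lambda y: y],    # q = 0: inner sum is x + y
    [lambda x: x, lambda y: -y],   # q = 1: inner sum is x - y
]


def product(x: float, y: float) -> float:
    """Multiply two numbers using only univariate functions and addition."""
    return superposition(outer, inner, [x, y])
```

Calling `product(0.5, 0.5)` evaluates the two inner sums `1.0` and `0.0`, applies the outer functions, and adds the results.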

History
The Kolmogorov–Arnold representation theorem is closely related to Hilbert's 13th problem. In his Paris lecture at the International Congress of Mathematicians in 1900, David Hilbert formulated 23 problems which in his opinion were important for the further development of mathematics. The 13th of these problems dealt with the solution of general equations of higher degrees. It is known that for algebraic equations of degree 4 the solution can be computed by formulae that only contain radicals and arithmetic operations. For higher degrees, Galois theory shows that the solutions of algebraic equations cannot in general be expressed in terms of basic algebraic operations. It follows from the so-called Tschirnhaus transformation that the general algebraic equation
 * $$x^{n}+a_{n-1}x^{n-1}+\cdots +a_{0}=0$$

can be translated to the form $$ y^{n}+b_{n-4}y^{n-4}+\cdots +b_{1}y+1=0$$. The Tschirnhaus transformation is given by a formula containing only radicals and arithmetic operations. Therefore, the solution of an algebraic equation of degree $$n$$ can be represented as a superposition of functions of two variables if $$n<7$$ and as a superposition of functions of $$n-4$$ variables if $$n\geq 7$$. For $$n=7$$ the solution is a superposition of arithmetic operations, radicals, and the solution of the equation $$y^{7}+b_{3}y^{3}+b_{2}y^{2}+b_{1}y+1=0$$.

A further simplification with algebraic transformations seems to be impossible, which led to Hilbert's conjecture that "A solution of the general equation of degree 7 cannot be represented as a superposition of continuous functions of two variables". This explains the relation of Hilbert's thirteenth problem to the representation of a higher-dimensional function as a superposition of lower-dimensional functions. In this context, it has stimulated many studies in the theory of functions and other related problems by different authors.

In the field of machine learning, there have been various attempts to use neural networks modeled on the Kolmogorov–Arnold representation. In these works, the Kolmogorov–Arnold theorem plays a role analogous to that of the universal approximation theorem in the study of multilayer perceptrons.

Variants
A variant of Kolmogorov's theorem that reduces the number of outer functions $$\Phi_{q}$$ is due to George Lorentz. He showed in 1962 that the outer functions $$\Phi_{q}$$ can be replaced by a single function $$\Phi$$. More precisely, Lorentz proved the existence of functions $$\phi _{q,p}$$, $$q=0,1,\ldots, 2n$$, $$p=1,\ldots,n,$$ such that


 * $$ f(\mathbf x) = \sum_{q=0}^{2n} \Phi\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right)$$.

David Sprecher replaced the inner functions $$\phi_{q,p}$$ by one single inner function with an appropriate shift in its argument. He proved that there exist real values $$\eta, \lambda_1,\ldots,\lambda_n$$, a continuous function $$\Phi\colon \mathbb{R} \rightarrow \R$$, and a real increasing continuous function $$\phi\colon [0,1] \rightarrow [0,1]$$ with $$\phi \in \operatorname{Lip}(\ln 2/\ln (2N+2))$$, for $$N \geq n \geq 2$$, such that


 * $$ f(\mathbf x) = \sum_{q=0}^{2n} \Phi\!\left(\sum_{p=1}^{n} \lambda_p \phi(x_{p}+\eta q)+q \right)$$.
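The structure of Sprecher's formula can be sketched in Python as follows. The function name `sprecher_form` and its parameter names are hypothetical, and the theorem only asserts that suitable $$\Phi$$, $$\phi$$, $$\lambda_p$$ and $$\eta$$ exist; the sketch shows how the pieces are combined once they are given and does not construct them.

```python
from typing import Callable, Sequence


def sprecher_form(Phi: Callable[[float], float],
                  phi: Callable[[float], float],
                  lambdas: Sequence[float],
                  eta: float,
                  x: Sequence[float]) -> float:
    """Evaluate sum_{q=0}^{2n} Phi( sum_p lambdas[p] * phi(x[p] + eta*q) + q ).

    Phi, phi, lambdas and eta are supplied by the caller (hypothetical inputs);
    phi must accept the shifted arguments x[p] + eta*q.
    """
    n = len(x)
    if len(lambdas) != n:
        raise ValueError("need one lambda per coordinate")
    total = 0.0
    for q in range(2 * n + 1):
        # A single inner function phi is reused; only the shift eta*q and
        # the weights lambdas[p] distinguish the terms.
        s = sum(lam * phi(xp + eta * q) for lam, xp in zip(lambdas, x))
        # A single outer function Phi, with its argument shifted by q.
        total += Phi(s + q)
    return total
```

Compared with the original formula, which involves $$n(2n+1)$$ inner functions and $$2n+1$$ outer functions, this form needs only one function of each kind together with $$n+1$$ real constants.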

Phillip A. Ostrand generalized the Kolmogorov superposition theorem to compact metric spaces. For $$p=1,\ldots,m$$ let $$X_p$$ be compact metric spaces of finite dimension $$n_p$$ and let $$n = \sum_{p=1}^{m} n_p$$. Then there exist continuous functions $$\phi_{q,p}\colon X_p \rightarrow [0,1], q=0,\ldots,2n, p=1,\ldots,m$$ and continuous functions $$G_q\colon [0,1] \rightarrow \R, q=0,\ldots,2n$$ such that any continuous function $$f\colon X_1 \times \dots \times X_m \rightarrow \mathbb{R}$$ is representable in the form


 * $$ f(x_1,\ldots,x_m) = \sum_{q=0}^{2n} G_{q}\!\left(\sum_{p=1}^{m} \phi_{q,p}(x_{p})\right) $$.

Limitations
The theorem does not hold in general for complex multivariate functions. Furthermore, the non-smoothness of the inner functions and their "wild behavior" have limited the practical use of the representation, although there is some debate on this.