User:Brbauwens/sandbox

We now use Young's inequality, which states that

Variants of Kolmogorov complexity

The standard definition of Kolmogorov complexity considers Turing machines mapping strings to strings. When dealing with infinite random sequences or probabilistic oracles, machines are needed that during the computation request more input and/or gradually append output. Changing the type of Turing machines, one obtains variants of Kolmogorov complexity. Many quantitative results such as the Coding Theorem and the chain rule for Kolmogorov complexity can be tightened with appropriate notions of Kolmogorov complexity.

General definition of Kolmogorov complexity
A machine U that associates several outputs to a program, no longer defines a partial function but a relation U on strings interpreted as programs and outputs. Given a machine U of some type X, the corresponding Kolmogorov complexity KX(x) of a bitstring x is given by the smallest length of an input on U that has an associated output x:    $$K_X(x) = \min \{|p|: (p,x) \in U \}$$. For all common types of machines X, there exist machines U for which KX(x) is minimal up to a constant: for any other machine there exist a number c such that its complexity K'(x) satisfies KX ≤ K'X(x) + c. In our definition of KX(x) we assume such an optimal machine.

We compare several restrictions on sets U, each restriction defines a machine of some type.

Plain complexity C(x)
Plain Turing machines define the type of Kolmogorov complexity mostly used in Theoretical computer science and was introduced by Andrey Kolmogorov. In this definition the relation U defines a function. Thus if $$(p_1,x_1) \in U$$ and $$(p_2,x_2) \in U$$ then $$ p_1 = p_2 \;\; \Rightarrow \;\; x_1 = x_2 $$.

The machine model is given by Turing machines with a 3-symbol tape (blanc-zero-one). At the start of the computation the tape contains a binary program and all other cells are blanc. When the computation unit of the Turing machines obtains a halting state, a unique output is associated to the program. Such output is always a (finite) string.

Prefix complexity K(x)
There exist two types of machines that define the same notion of complexity: prefix-free machines and prefix-stable machines. Prefix-free machines

In algorithmic information theory (a subfield of computer science), the Kolmogorov complexity (also known as descriptive complexity, Kolmogorov–Chaitin complexity, algorithmic entropy, or program-size complexity) of an object, such as a piece of text, is a measure of the computational resources needed to specify the object. It is named after Andrey Kolmogorov, who first published on the subject in 1963.

strings

Definition

 * $$\forall s\ |K_1(s) - K_2(s)| \leq c.$$

Proof: By symmetry, it suffices to prove that there is some constant c such that for all bitstrings s


 * $$ K_1(s) \leq K_2(s) + c. $$

Now, suppose there is a program in the language L1 which acts as an interpreter for L2:

function InterpretLanguage(string p)

where p is a program in L2. The interpreter is characterized by the following property:


 * Running InterpretLanguage on input p returns the result of running p.

Thus, if P is a program in L2 which is a minimal description of s, then InterpretLanguage(P) returns the string s. The length of this description of s is the sum of


 * 1) The length of the program InterpretLanguage, which we can take to be the constant c.
 * 2) The length of P which by definition is K2(s).

This proves the desired upper bound.

See also invariance theorem.

History and context
Algorithmic information theory is the area of computer science that studies Kolmogorov complexity and other complexity measures on strings (or other data structures).

The concept and theory of Kolmogorov Complexity is based on a crucial theorem first discovered by Ray Solomonoff, who published it in 1960, describing it in "A Preliminary Report on a General Theory of Inductive Inference" as part of his invention of algorithmic probability. He gave a more complete description in his 1964 publications, "A Formal Theory of Inductive Inference," Part 1 and Part 2 in Information and Control.

Andrey Kolmogorov later independently published this theorem in ''Problems Inform. Transmission, Gregory Chaitin also presents this theorem in J. ACM'' – Chaitin's paper was submitted October 1966 and revised in December 1968, and cites both Solomonoff's and Kolmogorov's papers.

The theorem says that, among algorithms that decode strings from their descriptions (codes), there exists an optimal one. This algorithm, for all strings, allows codes as short as allowed by any other algorithm up to an additive constant that depends on the algorithms, but not on the strings themselves. Solomonoff used this algorithm, and the code lengths it allows, to define a "universal probability" of a string on which inductive inference of the subsequent digits of the string can be based. Kolmogorov used this theorem to define several functions of strings, including complexity, randomness, and information.

When Kolmogorov became aware of Solomonoff's work, he acknowledged Solomonoff's priority. For several years, Solomonoff's work was better known in the Soviet Union than in the Western World. The general consensus in the scientific community, however, was to associate this type of complexity with Kolmogorov, who was concerned with randomness of a sequence, while Algorithmic Probability became associated with Solomonoff, who focused on prediction using his invention of the universal prior probability distribution.

There are several other variants of Kolmogorov complexity or algorithmic information. The most widely used one is based on self-delimiting programs, and is mainly due to Leonid Levin (1974).

An axiomatic approach to Kolmogorov complexity based on Blum axioms (Blum 1967) was introduced by Mark Burgin in the paper presented for publication by Andrey Kolmogorov (Burgin 1982).

Some consider that naming the concept "Kolmogorov complexity" is an example of the Matthew effect.

Theorem: K is not a computable function.

Chain rule for Kolmogorov complexity
The chain rule for Kolmogorov complexity states that


 * $$ K(X,Y) = K(X) + K(Y|X) + O(\log(K(X,Y))).\quad$$

It states that the shortest program that reproduces X and Y is no more than a logarithmic term larger than a program to reproduce X and a program to reproduce Y given X. Using this statement, one can define an analogue of mutual information for Kolmogorov complexity.

Kolmogorov randomness
Kolmogorov randomness – also called algorithmic randomness – defines a string (usually of bits) as being random if and only if it is shorter than any computer program that can produce that string. To make this precise, a universal computer (or universal Turing machine) must be specified, so that "program" means a program for this universal machine. A random string in this sense is "incompressible" in that it is impossible to "compress" the string into a program whose length is shorter than the length of the string itself. A counting argument is used to show that, for any universal computer, there is at least one algorithmically random string of each length. Whether any particular string is random, however, depends on the specific universal computer that is chosen.

This definition can be extended to define a notion of randomness for infinite sequences from a finite alphabet. These algorithmically random sequences can be defined in three equivalent ways. One way uses an effective analogue of measure theory; another uses effective martingales. The third way defines an infinite sequence to be random if the prefix-free Kolmogorov complexity of its initial segments grows quickly enough - there must be a constant c such that the complexity of an initial segment of length n is always at least n&minus;c. This definition, unlike the definition of randomness for a finite string, is not affected by which universal machine is used to define prefix-free Kolmogorov complexity.