User:Thepigdog/Lambda Calculus (old)



Lambda calculus (also written as λ-calculus or called "the lambda calculus") is a formal system in mathematical logic and computer science for expressing computation using variable binding and substitution. First formulated by Alonzo Church, lambda calculus found early successes in the area of computability theory, such as a negative answer to Hilbert's Entscheidungsproblem ("decision problem").

Lambda calculus is a simple yet Turing complete language, which can represent every computable function. Any program may be written in lambda calculus, and its execution can be understood knowing the definition of lambda calculus alone, without reference to any external software environment or runtime library. Lambda calculus is functional, which allows a program written in the lambda calculus to be studied through mathematical methods. In particular, parts of the program may be analyzed separately from other parts. The lambda calculus is not intended as a programming language for everyday use. However, it is valuable as a theoretical construct, and as a basis for understanding other languages.

Because of the importance of the notion of variable binding and substitution, there is not just one system of lambda calculus, and in particular there are typed and untyped variants. Historically, the most important system was the untyped lambda calculus, in which function application has no restrictions (so the notion of the domain of a function is not built into the system). In the Church–Turing Thesis, the untyped lambda calculus is claimed to be capable of computing all effectively calculable functions. The typed lambda calculus is a variety that restricts function application, so that functions can only be applied if they are capable of accepting the given input's "type" of data.

Today, the lambda calculus has applications in many different areas in mathematics, philosophy, linguistics, and computer science. It is still used in the area of computability theory, although Turing machines are also an important model for computation. Lambda calculus has played an important role in the development of the theory of programming languages. Counterparts to lambda calculus in computer science are functional programming languages, which essentially implement the calculus (augmented with some constants and datatypes). Beyond programming languages, the lambda calculus also has many applications in proof theory. A major example of this is the Curry–Howard correspondence, which gives a correspondence between different systems of typed lambda calculus and systems of formal logic.

Lambda calculus in history of mathematics
The lambda calculus was introduced by mathematician Alonzo Church in the 1930s as part of an investigation into the foundations of mathematics. The original system was shown to be logically inconsistent in 1935 when Stephen Kleene and J. B. Rosser developed the Kleene–Rosser paradox.

Subsequently, in 1936 Church isolated and published just the portion relevant to computation, what is now called the untyped lambda calculus. In 1940, he also introduced a computationally weaker, but logically consistent system, known as the simply typed lambda calculus.

Motivation
Computable functions are a fundamental concept within computer science and mathematics. The λ-calculus provides a simple semantics for computation, enabling properties of computation to be studied formally. The λ-calculus incorporates two simplifications that make this semantics simple.

The first simplification is that the λ-calculus treats functions "anonymously", without giving them explicit names. That is, a function can be defined as an expression which may be called. For example, the function
 * $$\operatorname{sqsum}(x, y) = x \times x + y \times y$$

can be rewritten in anonymous form as
 * $$(x, y) \mapsto x \times x + y \times y$$

(read as “the pair of $$x$$ and $$y$$ is mapped to $$x \times x + y \times y$$”). Similarly,
 * $$\operatorname{id}(x) = x$$

can be rewritten in anonymous form as $$x \mapsto x$$, where the input is simply mapped to itself.

Currying
The second simplification is that the λ-calculus only uses functions of a single input. An ordinary function that requires two inputs, for instance the $$\operatorname{sqsum}$$ function, can be reworked into an equivalent function that accepts a single input, and as output returns another function, that in turn accepts a single input. For example,
 * $$(x, y) \mapsto x \times x + y \times y$$

can be reworked into
 * $$x \mapsto (y \mapsto x \times x + y \times y)$$

This method, known as currying, transforms a function that takes multiple arguments into a chain of functions each with a single argument.

Applying the $$\operatorname{sqsum}$$ function to the arguments (5, 2), yields:
 * $$((x, y) \mapsto x \times x + y \times y)(5, 2)$$
 * $$ = 5 \times 5 + 2 \times 2 = 29$$

The curried version yields:
 * $$((x \mapsto (y \mapsto x \times x + y \times y))(5))(2)$$
 * $$ = (y \mapsto 5 \times 5 + y \times y)(2)$$
 * $$ = 5 \times 5 + 2 \times 2 = 29$$

and thus computes the same result.
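The same comparison can be sketched in Python, used here only as an illustration language. The names `sqsum_curried` and `add_to_25` are chosen for this sketch and do not come from the calculus itself.

```python
def sqsum(x, y):
    """Two-argument form: (x, y) -> x*x + y*y."""
    return x * x + y * y

# Curried form: x -> (y -> x*x + y*y), a chain of one-argument functions.
sqsum_curried = lambda x: (lambda y: x * x + y * y)

# Applying the curried form to 5 alone yields the function y -> 5*5 + y*y.
add_to_25 = sqsum_curried(5)

print(sqsum(5, 2))           # 29
print(sqsum_curried(5)(2))   # 29
print(add_to_25(2))          # 29
```

Applying the curried function to its first argument alone gives an intermediate function, which matches the middle step of the curried evaluation shown above.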

The Lambda Calculus
The lambda calculus takes these ideas and transforms them into a language based only on function parameterising and application. A lambda expression looks like,


 * $$f = \lambda xy.x^2+y^2$$

which is the equivalent of writing,


 * $$f(x, y) = x^2 + y^2$$

The lambda notation gives the whole of the function $$f$$ as a single expression. In the same way that we say that arithmetic equations are solved for numeric values, by analogy you may consider the lambda expression to be "solved" for the function variable $$f$$.

Function Application
A key starting point for functional languages, and the lambda calculus, is function application. Function application is considered so important for functional programming that the notation
 * $$k(x)$$ is written as $$k \ x$$

So in calling the function $$f$$, defined above, we write,
 * $$f \ 5 \  6$$

which may be written, $$ (\lambda xy.x^2+y^2) 5 \ 6$$ giving the result, $$ 5^2+6^2 $$ or $$61$$.

You see in this example that the function $$f$$ may be substituted for its function expression $$ \lambda xy.x^2+y^2 $$. This makes the lambda calculus useful for thinking about and understanding the nature of computations.

Syntax Definition
Lambda expressions have a simple syntax. Lambda expressions are composed of
 * variables v1, v2, ..., vn, ...
 * the abstraction symbols λ (lambda) and . (dot)
 * parentheses

The set of lambda expressions, Λ, can be defined inductively:
 * 1) If x is a variable, then x ∈ Λ
 * 2) If x is a variable and M ∈ Λ, then (λx.M) ∈ Λ
 * 3) If M, N ∈ Λ, then (M N) ∈ Λ

Instances of rule 2 are known as abstractions and instances of rule 3 are known as applications.
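The three inductive rules map directly onto a small tree data type. The following is a minimal sketch in Python; the class names `Var`, `Lam` and `App` and the helper `show` are assumptions of this sketch, not standard names.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:            # rule 1: a variable x
    name: str

@dataclass(frozen=True)
class Lam:            # rule 2: an abstraction (λx.M)
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:            # rule 3: an application (M N)
    fn: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def show(term):
    """Print a term fully parenthesised, exactly as the rules build it."""
    if isinstance(term, Var):
        return term.name
    if isinstance(term, Lam):
        return f"(λ{term.param}.{show(term.body)})"
    return f"({show(term.fn)} {show(term.arg)})"

print(show(Lam("x", App(Var("x"), Var("y")))))   # (λx.(x y))
```

Every value built from these three constructors is a member of Λ, and nothing else is.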

Notational conventions
To keep the notation of lambda expressions uncluttered, the following conventions are usually applied.
 * Outermost parentheses are dropped: M N instead of (M N)
 * Applications are left associative: M N P may be written instead of ((M N) P)
 * The body of an abstraction extends as far right as possible: λx.M N means λx.(M N) and not (λx.M) N
 * A sequence of abstractions is contracted: λx.λy.λz.N may be abbreviated as λxyz.N

Definition of Semantics
The execution of a lambda expression proceeds using the following reductions and transformations,
 * 1) alpha conversion - $$\operatorname{alpha-con}(a) \to \operatorname{canonym}[A, P] = \operatorname{canonym}[a[A], P] $$
 * 2) beta reduction - $$\operatorname{beta-redex}[\lambda \operatorname{fp}.b \operatorname{ap}] = b[\operatorname{fp}:=\operatorname{ap}] $$
 * 3) eta reduction - $$x \not \in \operatorname{FV}(f) \to \operatorname{eta-redex}[\lambda x.(f \  x)] = f $$

where,
 * canonym is a renaming of a lambda expression to give the expression standard names, based on the position of the name in the expression.
 * Substitution Operator, $$b[\operatorname{fp}:=\operatorname{ap}] $$ is the substitution of the name $$\operatorname{fp}$$ by the lambda expression $$\operatorname{ap}$$ in lambda expression $$b$$.
 * Free Variable Set $$\operatorname{FV}(f)$$ is the set of variables that do not belong to a lambda abstraction in $$f$$.

Execution is performing beta reductions and eta reductions on sub expressions in the canonym of a lambda expression until the result is a lambda function (abstraction) in the normal form.

All alpha conversions of a lambda expression are considered to be equivalent.

Canonym - Canonical Names
Canonym is a function that takes a lambda expression and renames all names canonically, based on their positions in the expression. This might be implemented as,
 * 1) $$\operatorname{canonym}[L, P] = \operatorname{canonym}[L, O, P] $$
 * 2) $$\operatorname{canonym}[\lambda \operatorname{fp}.b, M, P] = \lambda \operatorname{name}(P).\operatorname{canonym}[b, M[\operatorname{fp}:=P], P+N] $$
 * 3) $$\operatorname{canonym}[X \  Y, M, P] = \operatorname{canonym}[X, M, P+F] \  \operatorname{canonym}[Y, M, P+S] $$
 * 4) $$\operatorname{canonym}[x, M, P] = \operatorname{name}(M[x]) $$

Where, N is the string "N", F is the string "F", S is the string "S", + is concatenation, and "name" converts a string into a name
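The four rules can be sketched as a recursive Python function. This is an illustration under stated assumptions: the classes `Var`, `Lam` and `App` are a hypothetical term representation, the map M is a Python dict, "name" is taken to be the identity on strings, the starting prefix `"p"` is arbitrary, and the argument of an application is assumed to receive the path P+S.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def canonym(term, mapping=None, path="p"):
    """Rename bound variables after the position of their binder."""
    if mapping is None:
        mapping = {}                                   # rule 1: start with the empty map O
    if isinstance(term, Lam):                          # rule 2: binder is named after its path
        inner = {**mapping, term.param: path}
        return Lam(path, canonym(term.body, inner, path + "N"))
    if isinstance(term, App):                          # rule 3: extend the path with F or S
        return App(canonym(term.fn, mapping, path + "F"),
                   canonym(term.arg, mapping, path + "S"))
    return Var(mapping.get(term.name, term.name))      # rule 4, with O[x] = x for free names

# Two alpha-equivalent terms receive the same canonical form.
print(canonym(Lam("x", Var("x"))) == canonym(Lam("y", Var("y"))))   # True
```

A free variable whose name happens to equal a generated path name would clash with it; the sketch does not guard against that case.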

Map Operators
Map from one value to another if the value is in the map. O is the empty map.


 * 1) $$O[x] = x $$
 * 2) $$M[x:=y][x] = y $$
 * 3) $$x \ne z \to M[x:=y][z] = M[z] $$

Substitution Operator
If L is a lambda expression, x is a name, and y is a lambda expression; $$L[x:=y]$$ means substitute x by y in L. The rules are,
 * 1) $$(\lambda \operatorname{fp}.b)[x := y] = \lambda \operatorname{fp}.b[x := y] $$
 * 2) $$(X \  Y)[x := y] = X[x := y] \  Y[x := y] $$
 * 3) $$z = x \to (z)[x := y] = y  $$
 * 4) $$z \ne x \to (z)[x := y] = z  $$

Note that rule 1 must be modified if it is to be used on non canonically renamed lambda expressions. See Changes to the substitution operator.

Free and Bound Variable Sets
The set of free variables of a lambda expression, M, is denoted as FV(M). This is the set of variable names that have instances not bound (used) in a lambda abstraction, within the lambda expression. They are the variable names that may be bound to formal parameter variables from outside the lambda expression.

The set of bound variables of a lambda expression, M, is denoted as BV(M). This is the set of variable names that have instances bound (used) in a lambda abstraction, within the lambda expression.

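The rules for the two sets are,
 * 1) $$\operatorname{FV}(x) = \{x\}$$ and $$\operatorname{BV}(x) = \emptyset$$, where x is a variable
 * 2) $$\operatorname{FV}(\lambda x.M) = \operatorname{FV}(M) \setminus \{x\}$$ and $$\operatorname{BV}(\lambda x.M) = \operatorname{BV}(M) \cup \{x\}$$
 * 3) $$\operatorname{FV}(M \ N) = \operatorname{FV}(M) \cup \operatorname{FV}(N)$$ and $$\operatorname{BV}(M \ N) = \operatorname{BV}(M) \cup \operatorname{BV}(N)$$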

Usage:
 * The Free Variable Set, FV, is used above in the definition of the eta-reduction.
 * The Bound Variable Set, BV, is used in the rule for beta-redex of non canonical lambda expressions.

Rules for general use
This section gives a general discussion for how Lambda Calculus is applied and gives rules that apply when lambda expressions are not canonically renamed. When working manually with lambda expressions it is easier to understand using the variable names that the author gave.

Free and bound variables
The lambda abstraction operator, λ, takes a formal parameter variable and a body expression. When evaluated the formal parameter variable is identified with the value of the actual parameter.

Variables in a lambda expression may either be "bound" or "free". Bound variables are variable names that are already attached to formal parameter variables in the expression.

The formal parameter variable is said to bind the variable name wherever it occurs free in the body. Variable names that have already been matched to a formal parameter variable are said to be bound. All other variables in the expression are called free.

For example, in the following expression y is a bound variable and x is free: $$\lambda y.x \ x \  y$$. Also note that a variable is bound by its "nearest" lambda abstraction. In the following example the single occurrence of x in the expression is bound by the second lambda: $$\lambda x.y \ (\lambda x.z \ x)$$
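The two example expressions can be checked mechanically. The sketch below assumes a hypothetical term representation (`Var`, `Lam`, `App`) and computes the free and bound variable sets by recursion on the structure of the term.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def free_vars(t):
    """Names with at least one occurrence not bound by an enclosing λ."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)

def bound_vars(t):
    """Names used as the formal parameter of some λ inside the term."""
    if isinstance(t, Var):
        return set()
    if isinstance(t, Lam):
        return bound_vars(t.body) | {t.param}
    return bound_vars(t.fn) | bound_vars(t.arg)

# λy.x x y  (application is left associative: (x x) y)
e1 = Lam("y", App(App(Var("x"), Var("x")), Var("y")))
# λx.y (λx.z x)
e2 = Lam("x", App(Var("y"), Lam("x", App(Var("z"), Var("x")))))

print(free_vars(e1), bound_vars(e1))   # {'x'} {'y'}
print(sorted(free_vars(e2)), bound_vars(e2))   # ['y', 'z'] {'x'}
```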

Changes to the substitution operator
In the definition of the Substitution Operator the rule,
 * $$(\lambda \operatorname{fp}.b)[x := y] = \lambda \operatorname{fp}.b[x := y] $$

must be replaced with,
 * 1) $$(\lambda x.b)[x := y] = \lambda x.b $$
 * 2) $$z \ne x\ \to (\lambda z.b)[x := y] = \lambda z.b[x := y] $$

This is to stop bound variables with the same name being substituted. This would not have occurred in a canonically renamed lambda expression.

For example the previous rules would have wrongly translated,
 * $$(\lambda x.x \ z)[x:=y] = (\lambda x.y \  z)$$

The new rules block this substitution so that it remains as,
 * $$(\lambda x.x \ z)[x:=y] = (\lambda x.x \  z)$$
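The difference between the two sets of rules can be seen by implementing both. The following Python sketch assumes a hypothetical term representation (`Var`, `Lam`, `App`); `subst_naive` follows the original rule for abstractions and `subst` follows the replacement rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def subst_naive(t, x, y):
    """Original rules: always substitute inside an abstraction body."""
    if isinstance(t, Var):
        return y if t.name == x else t
    if isinstance(t, App):
        return App(subst_naive(t.fn, x, y), subst_naive(t.arg, x, y))
    return Lam(t.param, subst_naive(t.body, x, y))

def subst(t, x, y):
    """Replacement rules: stop at an abstraction that rebinds x."""
    if isinstance(t, Var):
        return y if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, y), subst(t.arg, x, y))
    if t.param == x:                      # (λx.b)[x := y] = λx.b
        return t
    return Lam(t.param, subst(t.body, x, y))

term = Lam("x", App(Var("x"), Var("z")))            # λx.x z
print(subst_naive(term, "x", Var("y")) == Lam("x", App(Var("y"), Var("z"))))  # True (wrong result)
print(subst(term, "x", Var("y")) == term)                                      # True (unchanged)
```

Note that `subst` still does not rename binders, so it does not by itself avoid the capture of free variables in y.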

Transformation
The meaning of lambda expressions is defined by how expressions can be transformed or reduced.

There are three kinds of transformation:
 * α-conversion: changing bound variables (alpha);
 * β-reduction: applying functions to their arguments (beta), calling functions;
 * η-conversion: which captures a notion of extensionality (eta).

We also speak of the resulting equivalences: two expressions are β-equivalent if they can be β-converted into the same expression, and α/η-equivalence are defined similarly.

The term redex, short for reducible expression, refers to subterms that can be reduced by one of the reduction rules.

Alpha Conversion
Alpha-conversion, sometimes known as alpha-renaming, allows bound variable names to be changed. For example, alpha-conversion of $$\lambda x.x$$ might give $$\lambda y.y$$. Terms that differ only by alpha-conversion are called α-equivalent.

In an alpha conversion, a bound name may be replaced by a new name only if the new name is not free in the body, as otherwise free variables would be captured.


 * $$(y \not \in FV(b) \land a(\lambda x.b) = \lambda y.b[x:=y]) \to \operatorname{alpha-con}(a) $$

Note that the substitution will not recurse into the body of lambda expressions with formal parameter $$x$$ because of the change to the substitution operator described above.

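For example, $$\lambda x.\lambda x.x$$ may be alpha-converted to $$\lambda y.\lambda x.x$$, but not to $$\lambda y.\lambda x.y$$, because the inner $$x$$ is bound by the inner abstraction.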

Beta reduction (capture avoiding)
Beta-reduction captures the idea of function application (also called a function call), and implements the substitution of the actual parameter expression for the formal parameter variable. Beta-reduction is defined in terms of substitution.

If no variable names are free in the actual parameter and bound in the body, beta reduction may be performed on the lambda abstraction without canonical renaming.


 * $$(\forall z: z \not \in FV(y) \lor z \not \in BV(b)) \to \operatorname{beta-redex}[\lambda x.b \ y] = b[x:=y] $$

Alpha renaming may be used on $$b$$ to rename names that are free in $$y$$ but bound in $$b$$, to meet the pre-condition for this transformation.

See example:


 * 1) Without renaming: $$((\lambda x.z \  x)(\lambda y.z \  y))[z := (x \  y)] $$
 * 2) With alpha renaming first: $$((\lambda a.z \  a)(\lambda b.z \  b))[z := (x \  y)] $$

In this example,
 * 1) In the first beta-redex, the free variables of the actual parameter are $$\operatorname{FV}(x \  y) = \{x, y\} $$ and the bound variables of the body are $$\operatorname{BV}((\lambda x.z \  x)(\lambda y.z \  y)) = \{x, y\} $$, so the pre-condition fails.
 * 2) The naive beta-redex changes the meaning of the expression, because x and y from the actual parameter become captured when they are substituted into the inner abstractions.
 * 3) The alpha renaming removes the problem by changing the names of x and y in the inner abstractions so that they are distinct from the names of x and y in the actual parameter.
 * 4) After renaming, the free variables are $$\operatorname{FV}(x \  y) = \{x, y\} $$ and the bound variables are $$\operatorname{BV}((\lambda a.z \  a)(\lambda b.z \  b)) = \{a, b\} $$, so the pre-condition holds.
 * 5) The beta-redex then proceeds with the intended meaning.
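A capture-avoiding beta reduction can be sketched in Python as follows. This is an illustration, not the procedure described above: instead of checking the pre-condition and alpha-renaming beforehand, the sketch renames a binder on demand whenever it would capture a free variable of the actual parameter. The term classes and the fresh-name scheme `v0`, `v1`, ... are assumptions of the sketch.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)

def fresh(avoid):
    """First of v0, v1, ... that is not in `avoid` (hypothetical naming scheme)."""
    return next(f"v{i}" for i in count() if f"v{i}" not in avoid)

def subst(t, x, s):
    """t[x := s], renaming binders that would capture free variables of s."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, s), subst(t.arg, x, s))
    if t.param == x:                       # x is rebound here: nothing to substitute
        return t
    if t.param in free_vars(s):            # binder would capture: alpha-rename it first
        new = fresh(free_vars(s) | free_vars(t.body) | {x})
        renamed = subst(t.body, t.param, Var(new))
        return Lam(new, subst(renamed, x, s))
    return Lam(t.param, subst(t.body, x, s))

def beta(redex):
    """Contract (λx.b) y to b[x := y]; other terms are returned unchanged."""
    if isinstance(redex, App) and isinstance(redex.fn, Lam):
        return subst(redex.fn.body, redex.fn.param, redex.arg)
    return redex

# (λz.(λx.z x)(λy.z y)) (x y)
body = App(Lam("x", App(Var("z"), Var("x"))), Lam("y", App(Var("z"), Var("y"))))
result = beta(App(Lam("z", body), App(Var("x"), Var("y"))))
print(sorted(free_vars(result)))   # ['x', 'y']
```

The inner binders are renamed (here both to `v0`, playing the role of a and b in the example), so x and y from the actual parameter stay free in the result.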

Eta reduction
Eta-conversion expresses the idea of extensionality, which in this context is that two functions are the same if and only if they give the same result for all arguments.

Eta reduction may be used without change on lambda expressions that are not canonically renamed.


 * $$x \not \in \operatorname{FV}(f) \to \operatorname{eta-redex}[\lambda x.(f x)] = f $$

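The problem with using an eta-redex when $$x$$ is free in $$f$$ is shown in this example. Take $$f = \lambda y.y \  x$$ and consider $$(\lambda x.(\lambda y.y \  x) \  x) \  a$$. Beta reduction gives $$(\lambda y.y \  a) \  a$$, but eta-reducing the abstraction first gives $$(\lambda y.y \  x) \  a$$.

This improper use of eta-reduction changes the meaning by leaving $$x$$ in $$\lambda y.y \  x $$ unsubstituted.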
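The side condition on eta reduction is straightforward to enforce in code. The sketch below assumes a hypothetical term representation (`Var`, `Lam`, `App`) and only reduces when the formal parameter is not free in the function part.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)

def eta_reduce(t):
    """Rewrite λx.(f x) to f when x is not free in f; otherwise return t."""
    if (isinstance(t, Lam)
            and isinstance(t.body, App)
            and t.body.arg == Var(t.param)
            and t.param not in free_vars(t.body.fn)):
        return t.body.fn
    return t

ok = Lam("x", App(Var("f"), Var("x")))                               # λx.f x
blocked = Lam("x", App(Lam("y", App(Var("y"), Var("x"))), Var("x")))  # λx.(λy.y x) x

print(eta_reduce(ok) == Var("f"))        # True
print(eta_reduce(blocked) == blocked)    # True: x is free in λy.y x
```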

Normalization
The purpose of beta-reduction is to calculate a value. A value in Lambda Calculus is a function. So beta-reduction continues until the expression looks like a function abstraction.

A lambda expression that cannot be reduced further, by either beta-redex or eta-redex, is in normal form. Note that alpha-conversion may still rename the bound variables of a normal form. All normal forms that can be converted into each other by alpha conversion are defined to be equal. See the main article on Beta normal form for details.

Encoding datatypes
Although lambda calculus expressions only evaluate to functions, it is possible to interpret some functions as Boolean or numeric values. This may be achieved because there are lambda expressions that behave like the basic types. These types are described below.


 * Arithmetic in lambda calculus
 * Logic and predicates
 * Pairs

Using these encodings it is possible to write programs about logic or arithmetic, without using any external libraries. All may be constructed out of the machinery of lambda calculus.
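As an illustration, the usual Church encodings of Booleans, natural numbers and pairs can be written with Python lambdas. The upper-case names are conventional labels chosen for this sketch, and `to_int` and `to_bool` are helper functions that step outside the calculus only to display results.

```python
# Booleans: a Boolean selects one of two arguments.
TRUE = lambda a: lambda b: a
FALSE = lambda a: lambda b: b
AND = lambda p: lambda q: p(q)(p)

# Church numerals: the numeral n applies f to x exactly n times.
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ADD = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Pairs: a pair hands both components to a selector.
PAIR = lambda a: lambda b: lambda s: s(a)(b)
FIRST = lambda p: p(TRUE)
SECOND = lambda p: p(FALSE)

def to_int(n):
    """Decode a Church numeral by counting applications."""
    return n(lambda k: k + 1)(0)

def to_bool(b):
    """Decode a Church Boolean."""
    return b(True)(False)

TWO = SUCC(SUCC(ZERO))
THREE = SUCC(TWO)

print(to_int(ADD(TWO)(THREE)))      # 5
print(to_bool(AND(TRUE)(FALSE)))    # False
print(FIRST(PAIR(1)(2)))            # 1
```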

Combinatory logic
Combinatory logic is based on (or implemented in) Lambda calculus. It was introduced by Moses Schönfinkel and Haskell Curry to implement the idea that the names of the variables are unimportant. Only the variable identity is important. Combinatory logic uses some closed lambda expressions (called combinators) to implement this idea.

The basic idea is that the combinators act like plumbing parts, that may be put together to redirect the variable value where it needs to go. Combinators are used in the study of the structure of programs.

The fixed point combinator is an interesting function that takes a function and computes a fixed point of it, that is, a value that the function maps to itself. This is also known as Curry's Paradoxical Combinator because it may be used to implement a form of Curry's Paradox.
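A fixed point combinator can be demonstrated in Python, with one caveat: Python evaluates arguments eagerly, so the sketch uses the Z combinator, a variant that delays the self-application inside an extra lambda, rather than Curry's combinator itself, which would recurse without end under eager evaluation. The names `Z` and `fact` are chosen for this sketch.

```python
# Z = λf.(λx.f (λv.x x v)) (λx.f (λv.x x v))
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A non-recursive "step" function: `self` stands for the function being defined.
fact_step = lambda self: lambda n: 1 if n == 0 else n * self(n - 1)

# The fixed point of fact_step behaves as the factorial function.
fact = Z(fact_step)

print(fact(5))   # 120
```

Here `fact` satisfies `fact_step(fact)(n) == fact(n)` for the tested inputs, which is the fixed point property expressed on values.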

Lambda Lifting
Lambda lifting may be used to extract function definitions from a lambda expression, converting the lambda expression into a functional form with no lambda abstractions.

This demonstrates the equivalence of Lambda Calculus and equations written in a functional form. Lambda Lifting may also be considered as an alternate definition for Lambda Calculus.

Each lift may be considered as a refactoring step, in that it extracts a sub expression, encapsulates it, and safely moves it into a separate function.
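A single lift can be illustrated in Python. In this sketch (all function names are hypothetical), a nested lambda with the free variable `n` is moved to the top level, and the free variable becomes an explicit first parameter that is supplied by partial application.

```python
from functools import partial

def shift_all(xs, n):
    """Before lifting: the inner lambda refers to the free variable n."""
    shift = lambda x: x + n
    return [shift(x) for x in xs]

def shift_lifted(n, x):
    """The lifted function: n is now an ordinary parameter."""
    return x + n

def shift_all_lifted(xs, n):
    """After lifting: no lambda remains, only a partial application."""
    shift = partial(shift_lifted, n)
    return [shift(x) for x in xs]

print(shift_all([1, 2, 3], 10))          # [11, 12, 13]
print(shift_all_lifted([1, 2, 3], 10))   # [11, 12, 13]
```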

Some older style compilers use Lambda Lifting as a compilation step in implementing a functional language.

Influence on other languages
Lambda Calculus has directly influenced functional languages such as ML and Haskell. It has also influenced many imperative languages, by providing the motivation for features. For example:
 * Closures
 * Anonymous functions
 * Reified functions. Functions as first-class citizens.

Denotational Semantics
The fact that lambda calculus terms act as functions on other lambda calculus terms, and even on themselves, led to questions about the semantics of the lambda calculus. Could a sensible meaning be assigned to lambda calculus terms? The natural semantics was to find a set D isomorphic to the function space D → D, of functions on itself. However, no nontrivial such D can exist, by cardinality constraints because the set of all functions from D into D has greater cardinality than D.

In the 1970s, Dana Scott showed that, if only continuous functions were considered, a set or domain D with the required property could be found, thus providing a model for the lambda calculus.

This work also formed the basis for the denotational semantics of programming languages.

Typed lambda calculus
The Lambda calculus defined in this article has no type declarations. All lambda expressions represent functions, but the type of the function is not explicitly declared. Various type systems may be added to Lambda Calculus, to both constrain it and give it more expressive power. Lambda Calculus with a type system is the basis for functional languages, such as ML and Haskell.

Typed lambda calculus is the study of type systems for Lambda Calculus. The Lambda cube is a framework for studying type systems, starting with the Simply typed lambda calculus.

Normal forms and confluence
The evaluation of Lambda calculus is defined by Beta Reductions, which may be considered as rewrite rules. That is, the execution of the Lambda calculus proceeds by a series of Beta Reduction steps to transform the program into the calculated value.

A rewrite system may be,
 * Strongly normalizing; every sequence of rewrites eventually terminates with a term in normal form.
 * Weakly normalizing; for each term, there exists at least one particular sequence of rewrites that eventually yields a normal form.

Considered as a rewriting rule, β-reduction is neither strongly normalising nor weakly normalising.

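β-reduction is confluent, which means that if a term can be reduced in two different ways, the two results can each be reduced further to a common term. Consequently a term has at most one normal form. Here terms are compared up to α-conversion: two normal forms are considered equal if one can be α-converted into the other.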

Lambda Calculus is Turing Complete
Lambda Calculus is Turing complete without any extensions or types being added to the language. All normal forms (values) in Lambda Calculus are, however, functions. Church showed that some functions may be used to represent other values, such as Booleans or integers. With such encodings every computable function can be expressed, which is what makes Lambda Calculus Turing complete. See the Church–Turing thesis for a discussion of other approaches and their equivalence.

Undecidability of equivalence
There is no algorithm that can determine whether or not two arbitrary lambda expressions are equivalent. This was historically the first problem for which undecidability could be proven. As is common for a proof of undecidability, the proof shows that no computable function can decide the equivalence. Church's thesis is then invoked to show that no algorithm can do so.

Church's proof reduces the problem to determining whether a given lambda expression has a normal form. A normal form is an equivalent expression that cannot be reduced any further under the rules imposed by the form.

Then he assumes that this predicate is computable, and can hence be expressed in lambda calculus. Building on earlier work by Kleene and constructing a Gödel numbering for lambda expressions, he constructs a lambda expression e that closely follows the proof of Gödel's first incompleteness theorem. If e is applied to its own Gödel number, a contradiction results.

Reduction strategies
Whether a term is normalizing or not, and how much work needs to be done in normalizing it if it is, depends to a large extent on the reduction strategy used. The distinction between reduction strategies relates to the distinction in functional programming languages between eager evaluation and lazy evaluation.

Strict/eager evaluation strategies
Most programming languages (including Lisp, ML and imperative languages like C and Java) are described as "strict", meaning that functions applied to non-normalising arguments are non-normalising. This is done essentially using applicative order, call by value reduction (see below), but usually called "eager evaluation".

Applicative order
The rightmost, innermost redex is always reduced first. Intuitively this means a function's arguments are always reduced before the function itself. Applicative order always attempts to apply functions to normal forms, even when this is not possible.

Applicative order is not a normalizing strategy. The usual counterexample is as follows: define Ω = ωω where ω = λx.xx. This entire expression contains only one redex, namely the whole expression; its beta-reduction is again Ω. Since this is the only available reduction, Ω has no normal form (under any evaluation strategy). Using applicative order, the expression KIΩ = (λx.λy.x) (λx.x)Ω is reduced by first reducing Ω to normal form (since it is the rightmost redex), but since <tt>Ω</tt> has no normal form, applicative order fails to find a normal form for <tt>KIΩ</tt>.

The positive trade-off of using applicative order is that it does not cause unnecessary computation, if all arguments are used, because it never substitutes arguments containing redexes and hence never needs to copy them (which would duplicate work). For example, in applicative order <tt>(λx.xx) ((λx.x)y)</tt> reduces first to <tt>(λx.xx)y</tt> and then to the normal form <tt>yy</tt>, taking two steps instead of the three needed under normal order.

Call by value
Only the outermost redexes are reduced: a redex is reduced only when its right hand side has reduced to a value (variable or lambda abstraction).

Lazy evaluation strategies
Most purely functional programming languages (notably Miranda and its descendants, including Haskell), and the proof languages of theorem provers, use lazy evaluation, which is essentially the same as call by need.

Normal order
The leftmost, outermost redex is always reduced first. Whenever possible the arguments are substituted into the body of an abstraction before the arguments are reduced.

Normal order is so called because it always finds a normalising reduction, if one exists. In the above example, <tt>KIΩ</tt> reduces under normal order to I, a normal form. A drawback is that redexes in the arguments may be copied, resulting in duplicated computation (for example, <tt>(λx.xx) ((λx.x)y)</tt> reduces to <tt>((λx.x)y) ((λx.x)y)</tt> using this strategy; now there are two redexes, so full evaluation needs two more steps, but if the argument had been reduced first, there would now be none).
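The behaviour of the two orders on these examples can be reproduced with a small step-by-step reducer. The Python sketch below makes several assumptions: the term classes are hypothetical, substitution does not rename binders (which is safe for these particular examples, where no capture can occur), and non-termination is approximated by a step limit called `fuel`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def subst(t, x, s):
    """t[x := s] without renaming; adequate only when no capture can occur."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, s), subst(t.arg, x, s))
    return t if t.param == x else Lam(t.param, subst(t.body, x, s))

def step_normal(t):
    """One leftmost, outermost step, or None if t is in normal form."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):
            return subst(t.fn.body, t.fn.param, t.arg)
        r = step_normal(t.fn)
        if r is not None:
            return App(r, t.arg)
        r = step_normal(t.arg)
        return None if r is None else App(t.fn, r)
    if isinstance(t, Lam):
        r = step_normal(t.body)
        return None if r is None else Lam(t.param, r)
    return None

def step_applicative(t):
    """One rightmost, innermost step: arguments are reduced before the call."""
    if isinstance(t, App):
        r = step_applicative(t.arg)
        if r is not None:
            return App(t.fn, r)
        r = step_applicative(t.fn)
        if r is not None:
            return App(r, t.arg)
        if isinstance(t.fn, Lam):
            return subst(t.fn.body, t.fn.param, t.arg)
        return None
    if isinstance(t, Lam):
        r = step_applicative(t.body)
        return None if r is None else Lam(t.param, r)
    return None

def normalize(t, step, fuel=100):
    """Return (normal_form, steps_taken), or None if fuel runs out."""
    for n in range(fuel + 1):
        nxt = step(t)
        if nxt is None:
            return t, n
        t = nxt
    return None

I = Lam("x", Var("x"))
K = Lam("x", Lam("y", Var("x")))
omega = Lam("x", App(Var("x"), Var("x")))
KIO = App(App(K, I), App(omega, omega))       # K I Ω
dup = App(omega, App(I, Var("y")))            # (λx.xx) ((λx.x)y)

print(normalize(KIO, step_normal))            # finds I in 2 steps
print(normalize(KIO, step_applicative))       # None: fuel exhausted
print(normalize(dup, step_normal)[1], normalize(dup, step_applicative)[1])  # 3 2
```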

Call by name
As normal order, but no reductions are performed inside the body of abstractions. For example <tt>λx.(λx.x)x</tt> is in normal form according to this strategy, although it contains the redex <tt>(λx.x)x</tt>.

Call by need
As normal order, but function applications that would duplicate terms instead name the argument, which is then reduced only "when it is needed". In practical contexts this is called "lazy evaluation". In implementations this "name" takes the form of a pointer, with the redex represented by a thunk.

For example, <tt>(λx.xx) ((λx.x)y)</tt> reduces to <tt>((λx.x)y) ((λx.x)y)</tt>, which has two redexes, but in call by need they are represented using the same object rather than copied, so when one is reduced the other is too.
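The sharing described here can be sketched with a memoising thunk. In the Python illustration below, the class `Thunk` and the function `expensive` are hypothetical; the point is only that an argument used twice is computed once.

```python
class Thunk:
    """A delayed computation whose result is cached after the first use."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()   # evaluate the redex once
            self._done = True
            self._compute = None            # the computation is no longer needed
        return self._value

calls = []

def expensive():
    calls.append("evaluated")               # record each real evaluation
    return 42

arg = Thunk(expensive)                      # the argument is named, not yet reduced
result = arg.force() + arg.force()          # the body uses the argument twice

print(result)        # 84
print(len(calls))    # 1
```

If the thunk is never forced, `expensive` is never called, which corresponds to an argument that is not needed.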

Parallel evaluation strategies
The Church-Rosser property of the lambda calculus means that evaluation (β-reduction) can be carried out in any order, even in parallel. This means that various parallel evaluation strategies are possible.

Unordered beta reductions
Any redex can be reduced at any time. This means essentially the lack of any particular reduction strategy.

Parallelism and concurrency
The following strategies are related to Lambda Calculus.
 * Futures
 * process calculi

Director Strings Optimization
While the idea of beta reduction seems simple enough, it is computationally expensive. The interpreter must find the location of all of the occurrences of the bound variable <tt>V</tt> in the expression <tt>E</tt>. As an optimization the interpreter may track the locations of free variables, which has a space cost.

A naïve search for the locations of <tt>V</tt> in <tt>E</tt> is O(n) in the length n of <tt>E</tt>. This has led to the study of systems that use explicit substitution. Sinot's director strings offer a way of tracking the locations of free variables in expressions.