Beta normal form

In the lambda calculus, a term is in beta normal form if no beta reduction is possible. A term is in beta-eta normal form if neither a beta reduction nor an eta reduction is possible. A term is in head normal form if there is no beta-redex in head position. The normal form of a term, if one exists, is unique (as a corollary of the Church–Rosser theorem). However, a term may have more than one head normal form.

Beta reduction
In the lambda calculus, a beta redex is a term of the form:


 * $$ (\lambda x . A) M $$.

A redex $$r$$ is in head position in a term $$t$$ if $$t$$ has the following shape (note that application binds more tightly than abstraction, and that the formula below is a lambda abstraction, not an application):


 * $$ \lambda x_1 \ldots \lambda x_n . \underbrace{(\lambda x . A) M_1}_{\text{the redex }r} M_2 \ldots M_m $$, where $$n \geq 0$$ and $$m \geq 1$$.

A beta reduction is an application of the following rewrite rule to a beta redex contained in a term:


 * $$ (\lambda x . A) M \longrightarrow A[x := M] $$

where $$A[x := M]$$ is the result of substituting the term $$M$$ for the variable $$x$$ in the term $$A$$.
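The rewrite rule above can be sketched in code. The following is a minimal illustration (not part of the original text) using Python dataclasses for the term syntax; to keep it short, the substitution assumes bound variable names never clash with the free variables of the substituted term, so it performs no capture-avoiding renaming.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def subst(term: Term, x: str, m: Term) -> Term:
    """Compute term[x := m]. Simplifying assumption: bound names never
    clash with the free names of m, so no renaming is needed."""
    if isinstance(term, Var):
        return m if term.name == x else term
    if isinstance(term, Lam):
        if term.param == x:           # x is shadowed inside this lambda
            return term
        return Lam(term.param, subst(term.body, x, m))
    return App(subst(term.fun, x, m), subst(term.arg, x, m))

def beta_step(term: Term) -> Term:
    """Contract a top-level beta redex (lambda x. A) M to A[x := M]."""
    if isinstance(term, App) and isinstance(term.fun, Lam):
        return subst(term.fun.body, term.fun.param, term.arg)
    return term

# (lambda x. x) y  beta-reduces to  y
print(beta_step(App(Lam("x", Var("x")), Var("y"))))  # Var(name='y')
```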

A head beta reduction is a beta reduction applied in head position, that is, of the following form:


 * $$ \lambda x_1 \ldots \lambda x_n . (\lambda x . A) M_1 M_2 \ldots M_m \longrightarrow \lambda x_1 \ldots \lambda x_n . A[x := M_1] M_2 \ldots M_m $$, where $$n \geq 0$$ and $$m \geq 1$$.

Any other reduction is an internal beta reduction.
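To make the notion of head reduction concrete, here is an illustrative sketch (an assumption of this text, not a standard implementation) that performs one head beta reduction step. Terms are encoded as nested tuples `("var", x)`, `("lam", x, body)`, `("app", f, a)`, and substitution again assumes distinct bound names so no renaming is needed.

```python
def subst(t, x, m):
    """t[x := m] on tuple-encoded terms; assumes distinct bound names."""
    if t[0] == "var":
        return m if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, m))
    return ("app", subst(t[1], x, m), subst(t[2], x, m))

def head_step(t):
    """One head beta reduction step, or None if no redex is in head position."""
    if t[0] == "lam":                     # descend under leading lambdas
        s = head_step(t[2])
        return None if s is None else ("lam", t[1], s)
    args = []                             # unwind the application spine
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    if t[0] != "lam" or not args:
        return None                       # head is a variable: no head redex
    args.reverse()                        # t applied to M1 M2 ... Mm, t a lambda
    r = subst(t[2], t[1], args[0])        # contract (lambda x. A) M1
    for a in args[1:]:
        r = ("app", r, a)
    return r

# (lambda x. x) a b  head-reduces to  a b
print(head_step(("app", ("app", ("lam", "x", ("var", "x")), ("var", "a")),
                 ("var", "b"))))  # ('app', ('var', 'a'), ('var', 'b'))
```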

A normal form is a term that contains no beta redex, i.e. one that cannot be further reduced. A head normal form is a term that contains no beta redex in head position, i.e. one that cannot be further reduced by a head reduction. In the pure lambda calculus (i.e. without constants or function symbols meant to be reduced by additional delta rules), the head normal forms are exactly the terms of the following shape:


 * $$ \lambda x_1 \ldots \lambda x_n . x M_1 M_2 \ldots M_m $$, where $$x$$ is a variable, $$n \geq 0$$ and $$m \geq 0$$.

A head normal form is not always a normal form, because the applied arguments $$M_j$$ need not be normal. However, the converse is true: any normal form is also a head normal form. In fact, the normal forms are exactly the head normal forms in which the subterms $$M_j$$ are themselves normal forms. This gives an inductive syntactic description of normal forms.
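The inductive description above translates directly into code. The following is an illustrative sketch (the tuple encoding and function names are assumptions of this text): a term is in head normal form iff its head is a variable, and in normal form iff additionally every argument on the spine is itself in normal form.

```python
# Terms as nested tuples: ("var", x), ("lam", x, body), ("app", f, a).

def spine(t):
    """Strip leading lambdas, then unwind applications to (head, args)."""
    while t[0] == "lam":
        t = t[2]
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, list(reversed(args))

def is_hnf(t):
    """Head normal form: lambda x1 ... xn . x M1 ... Mm with x a variable."""
    head, _ = spine(t)
    return head[0] == "var"

def is_nf(t):
    """Normal form: a head normal form whose arguments are normal forms."""
    head, args = spine(t)
    return head[0] == "var" and all(is_nf(a) for a in args)

# lambda x. x ((lambda y. y) z)  is a head normal form but not a normal form:
t = ("lam", "x", ("app", ("var", "x"),
                  ("app", ("lam", "y", ("var", "y")), ("var", "z"))))
print(is_hnf(t), is_nf(t))  # True False
```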

There is also the notion of weak head normal form: a term is in weak head normal form if it is either a term in head normal form or a lambda abstraction. In particular, a redex may appear inside the body of a lambda abstraction without preventing the term from being in weak head normal form.
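Continuing the illustrative tuple encoding from above (an assumption of this text), the weak head normal form check stops at the first lambda:

```python
def is_whnf(t):
    """Terms as tuples ("var", x), ("lam", x, body), ("app", f, a)."""
    if t[0] == "lam":                 # any abstraction is a WHNF
        return True
    while t[0] == "app":              # otherwise the head must be a variable
        t = t[1]
    return t[0] == "var"

redex = ("app", ("lam", "y", ("var", "y")), ("var", "z"))
print(is_whnf(("lam", "x", redex)))   # True: the redex sits under a lambda
print(is_whnf(redex))                 # False: the redex is in head position
```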

Reduction strategies
In general, a given term can contain several redexes, hence several different beta reductions could be applied. We may specify a strategy to choose which redex to reduce.


 * Normal-order reduction is the strategy in which one repeatedly applies head beta reduction until no more such reductions are possible; at that point, the resulting term is in head normal form. One then continues by applying head reduction in the subterms $$M_j$$, from left to right. Stated otherwise, normal-order reduction is the strategy that always reduces the leftmost outermost redex first.
 * By contrast, in applicative-order reduction, one applies internal reductions first, and applies a head reduction only when no more internal reductions are possible.

Normal-order reduction is complete, in the sense that if a term has a head normal form, then normal-order reduction will eventually reach it. By the syntactic description of normal forms above, the same statement holds for full normal forms (this is the standardization theorem). By contrast, applicative-order reduction may fail to terminate even when the term has a normal form. For example, using applicative-order reduction, the following sequence of reductions is possible:


 * $$\begin{align} &(\lambda x . z) ((\lambda w. w w w) (\lambda w. w w w)) \\ \rightarrow\ &(\lambda x . z) ((\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w)) \\ \rightarrow\ &(\lambda x . z) ((\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w)) \\ \rightarrow\ &(\lambda x . z) ((\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w) (\lambda w. w w w)) \\ &\ldots \end{align}$$

But using normal-order reduction, the same starting point reduces to its normal form in a single step:


 * $$ (\lambda x . z) ((\lambda w. w w w) (\lambda w. w w w)) $$
 * $$ \rightarrow z $$
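The contrast can be reproduced with a small normal-order normalizer, sketched here under the same assumptions as the earlier snippets (tuple-encoded terms, substitution without capture-avoiding renaming): the outermost redex $$(\lambda x . z)\,\Omega'$$ is contracted first, so the divergent argument is simply discarded.

```python
def subst(t, x, m):
    """t[x := m] on tuple-encoded terms; assumes distinct bound names."""
    if t[0] == "var":
        return m if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, m))
    return ("app", subst(t[1], x, m), subst(t[2], x, m))

def step(t):
    """One leftmost-outermost beta step; None if t is already normal."""
    if t[0] == "app":
        if t[1][0] == "lam":                 # outermost redex: contract it
            return subst(t[1][2], t[1][1], t[2])
        s = step(t[1])
        if s is not None:
            return ("app", s, t[2])
        s = step(t[2])
        return None if s is None else ("app", t[1], s)
    if t[0] == "lam":
        s = step(t[2])
        return None if s is None else ("lam", t[1], s)
    return None

def normalize(t, fuel=100):
    """Iterate normal-order steps; fuel guards against divergent terms."""
    for _ in range(fuel):
        s = step(t)
        if s is None:
            return t
        t = s
    raise RuntimeError("no normal form reached within fuel limit")

www = ("app", ("app", ("var", "w"), ("var", "w")), ("var", "w"))
W = ("lam", "w", www)                        # lambda w. w w w
term = ("app", ("lam", "x", ("var", "z")), ("app", W, W))
print(normalize(term))                       # ('var', 'z')
```

Applicative-order reduction on the same `term` would keep contracting the argument `("app", W, W)`, which only grows, so `normalize` with that strategy would exhaust any fuel bound.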

Sinot's director strings are one method by which the computational complexity of beta reduction can be optimized.