Unifying Theories of Programming

Unifying Theories of Programming (UTP) is a framework in computer science for the semantics of programs. It shows how denotational, operational and algebraic semantics can be combined in a single framework for the formal specification, design and implementation of programs and computer systems.

The book of this title by C.A.R. Hoare and He Jifeng was published in the Prentice Hall International Series in Computer Science in 1998 and is now freely available on the web.

Theories
The semantic foundation of the UTP is the first-order predicate calculus, augmented with fixed point constructs from second-order logic. Following the tradition of Eric Hehner, programs are predicates in the UTP, and there is no distinction between programs and specifications at the semantic level. In the words of Hoare:

"A computer program is identified with the strongest predicate describing every relevant observation that can be made of the behaviour of a computer executing that program."

In UTP parlance, a theory is a model of a particular programming paradigm. A UTP theory is composed of three ingredients:


 * an alphabet, which is a set of variable names denoting the attributes of the paradigm that can be observed by an external entity;
 * a signature, which is the set of programming language constructs intrinsic to the paradigm; and
 * a collection of healthiness conditions, which define the space of programs that fit within the paradigm. These healthiness conditions are typically expressed as monotonic idempotent predicate transformers.
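
As a minimal sketch of what "monotonic idempotent predicate transformer" means, the Python below models predicates as finite sets of observations. The alphabet (`ok`, `x`), the function names and the condition `H(P) = (ok => P)` are all illustrative assumptions, not part of the text above. A predicate counts as healthy when it is a fixed point of `H`.

```python
from itertools import product

# Hypothetical alphabet: a boolean `ok` and a small integer `x`.
# An observation is a pair (ok, x); a predicate is the set of observations it admits.
OBSERVATIONS = frozenset(product((False, True), range(3)))

def healthy_form(p):
    """Hypothetical healthiness condition H(P) = (ok => P)."""
    return frozenset(obs for obs in OBSERVATIONS if not obs[0] or obs in p)

def is_healthy(p):
    """P is healthy exactly when it is a fixed point of H."""
    return healthy_form(p) == p

def all_predicates():
    """Enumerate every predicate over the (finite) alphabet."""
    obs = sorted(OBSERVATIONS)
    for bits in product((False, True), repeat=len(obs)):
        yield frozenset(o for o, keep in zip(obs, bits) if keep)

def is_idempotent(h):
    """h(h(P)) = h(P) for every predicate P."""
    return all(h(h(p)) == h(p) for p in all_predicates())

def is_monotonic(h):
    """If Q implies P (Q is a subset of P), then h(Q) implies h(P)."""
    preds = list(all_predicates())
    return all(h(q) <= h(p) for p in preds for q in preds if q <= p)
```

Under these assumptions, the healthy predicates are those that admit every observation in which `ok` is false; filtering `all_predicates()` with `is_healthy` enumerates that space.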

Program refinement is an important concept in the UTP. A program $$P_1$$ is refined by $$P_2$$ if and only if every observation that can be made of $$P_2$$ is also an observation of $$P_1$$. The definition of refinement is common across UTP theories:

$$P_1 \sqsubseteq P_2 \quad\text{if and only if}\quad \left[ P_2 \Rightarrow P_1 \right]$$

where $$\left[ X \right]$$ denotes the universal closure of $$X$$, that is, $$X$$ universally quantified over all variables in the alphabet.
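
For a finite alphabet, the refinement check can be carried out by brute force. The sketch below assumes a hypothetical alphabet of two variables `x` and `y`, each ranging over {0, 1, 2}, and represents a predicate as a Python function from a valuation to a boolean; the names `closure` and `refined_by` are invented for the illustration.

```python
from itertools import product

# Hypothetical alphabet: two observable variables, each ranging over {0, 1, 2}.
ALPHABET = ("x", "y")
VALUES = range(3)

def valuations():
    """Enumerate every assignment of values to the alphabet."""
    for vals in product(VALUES, repeat=len(ALPHABET)):
        yield dict(zip(ALPHABET, vals))

def closure(pred):
    """Universal closure [pred]: pred holds for every valuation."""
    return all(pred(obs) for obs in valuations())

def refined_by(p1, p2):
    """P1 is refined by P2 if and only if [P2 => P1]."""
    return closure(lambda obs: (not p2(obs)) or p1(obs))

# A loose specification and a more deterministic candidate.
spec = lambda obs: obs["y"] >= obs["x"]
impl = lambda obs: obs["y"] == obs["x"]

print(refined_by(spec, impl))   # every observation of impl is one of spec
print(refined_by(impl, spec))   # but not the other way round
```

Here `impl` refines `spec` because equality implies the inequality, whereas `spec` admits observations such as `x = 0, y = 1` that `impl` rules out.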

Relations
The most basic UTP theory is the alphabetised predicate calculus, which has no alphabet restrictions or healthiness conditions. The theory of relations is slightly more specialised, since a relation's alphabet may consist of only:


 * undecorated variables ($$v$$), modelling an observation of the program at the start of its execution; and
 * primed variables ($$v'$$), modelling an observation of the program at a later stage of its execution.

Some common language constructs can be defined in the theory of relations as follows:


 * The skip statement, which does not alter the program state in any way, is modelled as the relational identity:

$$\mathbf{skip} \equiv v' = v$$


 * The assignment of value $$E$$ to a variable $$a$$ is modelled as setting $$a'$$ to $$E$$ and keeping all other variables (denoted by $$u$$) constant:

$$a := E \equiv  a' = E \land u' = u$$


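 * The sequential composition of two programs is relational composition: the final state of the first program becomes the initial state of the second, and this intermediate state ($$v_0$$) is hidden by existential quantification: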

$$P_1 ; P_2 \equiv  \exists v_0 \bullet P_1 [ v_0 / v' ]  \land  P_2 [ v_0 / v ]$$


 * Non-deterministic choice between programs is their greatest lower bound with respect to the refinement ordering:

$$P_1 \sqcap P_2 \equiv  P_1 \lor P_2$$


 * Conditional choice between programs, which behaves as $$P_1$$ if the condition $$C$$ holds and as $$P_2$$ otherwise, is written using infix notation:

$$P_1 \triangleleft C \triangleright P_2 \equiv  ( C \land P_1 ) \lor ( \lnot C \land P_2 )$$


 * A semantics for recursion is given by the least fixed point $$\mu \mathbf{F}$$ of a monotonic predicate transformer $$\mathbf{F}$$, taken with respect to the refinement ordering:

$$\mu X \bullet \mathbf{F}(X) \equiv  \sqcap \left\{ X  \mid  \mathbf{F}(X) \sqsubseteq X \right\}$$
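
The constructs above can be explored on a small finite model. The following sketch assumes two program variables `a` and `b` ranging over {0, 1, 2} and represents a relation as the set of (before, after) state pairs it admits; all function names are invented for the illustration. Disjunction becomes set union, refinement becomes reverse set inclusion, and the least fixed point is reached by iterating from the bottom of the refinement order (the relation `true`), which terminates here only because the state space is finite.

```python
from itertools import product

# Assumed finite model: a state is a tuple (a, b) with values in {0, 1, 2}.
VARS = ("a", "b")
VALUES = range(3)
STATES = list(product(VALUES, repeat=len(VARS)))

def relation(pred):
    """The set of (before, after) pairs satisfying pred(v, v')."""
    return frozenset((s, t) for s in STATES for t in STATES if pred(s, t))

TRUE = relation(lambda s, t: True)    # bottom of the refinement order
SKIP = relation(lambda s, t: t == s)  # v' = v

def assign(var, expr):
    """var := expr, i.e. var' = expr(v) with all other variables unchanged.
    If expr(v) falls outside VALUES, no after-state exists for that v."""
    i = VARS.index(var)
    return relation(lambda s, t: t == s[:i] + (expr(s),) + s[i + 1:])

def seq(p1, p2):
    """P1 ; P2: some intermediate state v0 links the two relations."""
    return frozenset((s, t) for (s, m) in p1 for (m2, t) in p2 if m == m2)

def choice(p1, p2):
    """Non-deterministic choice: P1 or P2."""
    return p1 | p2

def cond(p1, c, p2):
    """P1 <| C |> P2, where C is a condition on the before-state."""
    return (frozenset(pair for pair in p1 if c(pair[0]))
            | frozenset(pair for pair in p2 if not c(pair[0])))

def refined_by(p1, p2):
    """P1 is refined by P2 iff every observation of P2 is one of P1."""
    return p2 <= p1

def mu(f):
    """Least fixed point of a monotonic f in the refinement order,
    computed by iterating f from TRUE until the result is stable."""
    x = TRUE
    while f(x) != x:
        x = f(x)
    return x

# Example: while a != 0 do a := a - 1,
# written as mu X . (a := a - 1 ; X) <| a != 0 |> skip
decrement = assign("a", lambda s: s[0] - 1)
loop = mu(lambda x: cond(seq(decrement, x), lambda s: s[0] != 0, SKIP))
```

In this model `loop` coincides with `assign("a", lambda s: 0)`, while `mu(lambda x: x)`, the recursion that never makes progress, yields `TRUE`, the least refined relation.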