Retiming

Retiming is the technique of moving the structural location of latches or registers in a digital circuit to improve its performance, area, and/or power characteristics while preserving its functional behavior at its outputs. Retiming was first described by Charles E. Leiserson and James B. Saxe in 1983.

The technique uses a directed graph where the vertices represent asynchronous combinational blocks and the directed edges represent a series of registers or latches (the number of registers or latches can be zero). Each vertex has a value corresponding to the delay through the combinational circuit it represents. With the circuit modeled this way, one can attempt to optimize it by pushing registers from output to input and vice versa, much like bubble pushing. Two operations can be used: deleting a register from each input of a vertex while adding a register to all of its outputs, and conversely adding a register to each input of a vertex while deleting a register from all of its outputs. In all cases, if the rules are followed, the circuit will have the same functional behavior as it did before retiming.
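The two register moves can be sketched in a few lines of Python. This is a minimal illustration, not a synthesis tool: the three-block ring, the dictionary layout, and the function name `move_registers` are all assumptions made for the example, and the sketch assumes the graph has no self-loops.

```python
# Hypothetical circuit: edge (u, v) -> number of registers on that edge.
registers = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}


def move_registers(registers, vertex, to_inputs=True):
    """Return a new edge->register map after one retiming move at `vertex`.

    to_inputs=True : delete one register from every output, add one to every input.
    to_inputs=False: delete one register from every input, add one to every output.
    """
    incoming = [e for e in registers if e[1] == vertex]
    outgoing = [e for e in registers if e[0] == vertex]
    take_from, give_to = (outgoing, incoming) if to_inputs else (incoming, outgoing)
    # The move is only allowed if every edge we take from actually holds a register.
    if any(registers[e] == 0 for e in take_from):
        raise ValueError(f"cannot move: an edge at {vertex!r} has no register")
    result = dict(registers)
    for e in take_from:
        result[e] -= 1
    for e in give_to:
        result[e] += 1
    return result


after = move_registers(registers, "c", to_inputs=True)
print(after)  # {('a', 'b'): 0, ('b', 'c'): 1, ('c', 'a'): 1}
```

Applying the opposite move at the same vertex restores the original placement, and a move that would need a register from an empty edge is rejected.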

Formal description
The initial formulation of the retiming problem as described by Leiserson and Saxe is as follows. Given a directed graph $$G:=(V,E)$$ whose vertices represent logic gates or combinational delay elements in a circuit, assume there is a directed edge $$e:=(u,v)$$ from $$u$$ to $$v$$ whenever the output of $$u$$ feeds $$v$$ directly or through one or more registers. Let the weight of each edge $$w(e)$$ be the number of registers present along edge $$e$$ in the initial circuit. Let $$d(v)$$ be the propagation delay through vertex $$v$$. The goal in retiming is to compute an integer lag value $$r(v)$$ for each vertex such that the retimed weight $$w_r(e):=w(e)+r(v)-r(u)$$ of every edge is non-negative. Leiserson and Saxe prove that any retiming satisfying this condition preserves the circuit's behavior at its outputs.
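The retimed-weight formula translates directly into code. The sketch below applies a lag function to every edge and checks the non-negativity condition; the example graph, the lag values, and the names `retime` and `is_legal` are illustrative assumptions.

```python
def retime(weights, lag):
    """Apply w_r(u, v) = w(u, v) + r(v) - r(u) to every edge."""
    return {(u, v): w + lag[v] - lag[u] for (u, v), w in weights.items()}


def is_legal(retimed_weights):
    """A retiming is legal when no edge ends up with a negative register count."""
    return all(w >= 0 for w in retimed_weights.values())


# Hypothetical three-vertex ring with two registers on the edge c -> a.
weights = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}

good = retime(weights, {"a": 0, "b": 0, "c": 1})
bad = retime(weights, {"a": 1, "b": 0, "c": 0})

print(good, is_legal(good))  # {('a', 'b'): 0, ('b', 'c'): 1, ('c', 'a'): 1} True
print(bad, is_legal(bad))    # {('a', 'b'): -1, ('b', 'c'): 0, ('c', 'a'): 3} False
```

Note that the lag terms cancel around any cycle, so the total number of registers on the ring stays at two in both cases; only their positions change.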

Minimizing the clock period with network flow
The most common use of retiming is to minimize the clock period. A simple technique to optimize the clock period is to search for the minimum feasible period (e.g. using binary search).
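The quantity being minimized can be computed directly from the graph: under the usual model, the clock period is the largest total delay of any path that crosses no register. The sketch below assumes that model, assumes the circuit has no register-free cycle, and uses a hypothetical function name `clock_period` and example data.

```python
from functools import lru_cache


def clock_period(delay, weights):
    """Largest total vertex delay over any path made only of zero-register edges."""
    zero_succ = {v: [] for v in delay}
    for (u, v), w in weights.items():
        if w == 0:
            zero_succ[u].append(v)

    @lru_cache(maxsize=None)
    def longest_from(v):
        # Delay of v plus the longest register-free continuation, if any.
        return delay[v] + max((longest_from(s) for s in zero_succ[v]), default=0)

    return max(longest_from(v) for v in delay)


delay = {"a": 1, "b": 2, "c": 3}
before = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
after = {("a", "b"): 0, ("b", "c"): 1, ("c", "a"): 1}

print(clock_period(delay, before))  # 6
print(clock_period(delay, after))   # 3
```

In this example, moving one register from the output of `c` to its input halves the clock period without changing the number of registers.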

The feasibility of a clock period $$T$$ can be checked in one of several ways. One is a linear program over the lags $$r(v)$$. Let $$W(u,v)$$ be the minimum number of registers along any path from $$u$$ to $$v$$ (if such a path exists), and let $$D(u,v)$$ be the maximum delay along any path from $$u$$ to $$v$$ with $$W(u,v)$$ registers. The period $$T$$ is feasible if and only if there is an integer lag function satisfying $$r(u)-r(v)\le w(e)$$ for every edge $$e=(u,v)$$ and $$r(u)-r(v)\le W(u,v)-1$$ for every pair $$(u,v)$$ with $$D(u,v)>T$$. The dual of this program is a minimum cost circulation problem, which can be solved efficiently as a network problem. The limitations of this approach arise from the enumeration and size of the $$W$$ and $$D$$ matrices.
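The following sketch puts the pieces together for a small graph. It is one possible realization under stated assumptions, not the network-flow method mentioned above: $$W$$ and $$D$$ are computed with a Floyd-Warshall pass over lexicographically compared pairs, feasibility of a period is tested by running Bellman-Ford relaxation on the difference constraints $$r(u)-r(v)\le w(e)$$ and $$r(u)-r(v)\le W(u,v)-1$$ (the latter whenever $$D(u,v)>T$$), and the minimum period is found by binary search over the distinct values of $$D$$. All function names and the example circuit are hypothetical, and every cycle is assumed to contain at least one register.

```python
import itertools


def wd_matrices(delay, weights):
    """Return W (min registers) and D (max delay among min-register paths)."""
    # best[u, v] = (registers, -delay of every path vertex except v); comparing
    # pairs lexicographically minimizes registers first, then maximizes delay.
    best = {(u, u): (0, 0) for u in delay}
    for (u, v), w in weights.items():
        cand = (w, -delay[u])
        if (u, v) not in best or cand < best[u, v]:
            best[u, v] = cand
    for k, i, j in itertools.product(list(delay), repeat=3):  # k is outermost
        if (i, k) in best and (k, j) in best:
            cand = (best[i, k][0] + best[k, j][0], best[i, k][1] + best[k, j][1])
            if (i, j) not in best or cand < best[i, j]:
                best[i, j] = cand
    W = {pair: regs for pair, (regs, _) in best.items()}
    D = {(u, v): delay[v] - neg for (u, v), (_, neg) in best.items()}
    return W, D


def feasible_lags(delay, weights, W, D, period):
    """Return lags r satisfying all constraints for `period`, or None."""
    cons = [(u, v, w) for (u, v), w in weights.items()]           # r(u)-r(v) <= w
    cons += [(u, v, W[u, v] - 1) for (u, v) in D if D[u, v] > period]
    r = {v: 0 for v in delay}  # distances from an implicit source vertex
    for _ in range(len(delay)):
        changed = False
        for u, v, c in cons:
            if r[v] + c < r[u]:
                r[u] = r[v] + c
                changed = True
        if not changed:
            return r
    return None  # still relaxing after |V| passes: constraints are inconsistent


def min_period(delay, weights):
    """Binary search the sorted distinct D values for the smallest feasible one."""
    W, D = wd_matrices(delay, weights)
    candidates = sorted(set(D.values()))
    lo, hi = 0, len(candidates) - 1  # the largest candidate is feasible with r = 0
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible_lags(delay, weights, W, D, candidates[mid]) is not None:
            hi = mid
        else:
            lo = mid + 1
    period = candidates[lo]
    return period, feasible_lags(delay, weights, W, D, period)


delay = {"a": 1, "b": 2, "c": 3}
weights = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
period, lags = min_period(delay, weights)
print(period)  # 3
```

The cubic-time computation and quadratic storage of $$W$$ and $$D$$ in this sketch are exactly the cost that limits the approach on large circuits.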

Minimizing the clock period with MILP
Alternatively, feasibility of a clock period $$T$$ can be expressed as a mixed-integer linear program (MILP). A solution will exist and a valid lag function $$r(v)$$ will be returned if and only if the period is feasible.
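One way to write such a program, following the form given by Leiserson and Saxe, pairs each integer lag $$r(v)$$ with a real-valued auxiliary variable, written $$R(v)$$ here as a notational assumption. The difference $$R(v)-r(v)$$ can be read as the signal arrival time at the output of $$v$$ expressed as a fraction of $$T$$:

$$
\begin{aligned}
r(u) - r(v) &\le w(e) && \text{for every edge } e=(u,v),\\
R(v) - r(v) &\ge d(v)/T && \text{for every vertex } v,\\
R(v) - r(v) &\le 1 && \text{for every vertex } v,\\
R(u) - R(v) &\le w(e) - d(v)/T && \text{for every edge } e=(u,v).
\end{aligned}
$$

The first line is the usual legality condition on the retimed weights. The remaining lines force the arrival time at each vertex to be at least its own delay, at most one period, and, across any edge left without a register, at least the arrival time at its predecessor plus $$d(v)$$.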

Other formulations and extensions
Alternate formulations allow the minimization of the register count and the minimization of the register count under a delay constraint. The initial paper includes extensions that allow the consideration of fan-out sharing and a more general delay model. Subsequent work has addressed the inclusion of register delays, load-dependent delay models, and hold constraints.
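For the register-count objective, summing the retimed weights from the formal description over all edges shows why the problem stays linear in the lags. Each edge $$(u,v)$$ contributes $$+r(v)$$ and $$-r(u)$$, so, writing $$\mathrm{in}(v)$$ and $$\mathrm{out}(v)$$ for the number of edges entering and leaving $$v$$ (notation introduced here for illustration):

$$
\sum_{e \in E} w_r(e) = \sum_{e \in E} w(e) + \sum_{v \in V} r(v)\,\bigl(\mathrm{in}(v) - \mathrm{out}(v)\bigr).
$$

Since the first sum on the right is fixed by the initial circuit, minimizing the register count amounts to minimizing $$\sum_{v} r(v)\,(\mathrm{in}(v)-\mathrm{out}(v))$$ subject to $$r(u)-r(v)\le w(e)$$ for every edge $$e=(u,v)$$. This simple count ignores fan-out sharing.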

Problems
Retiming has found industrial use, albeit sporadic. Its primary drawback is that the state encoding of the circuit is destroyed, making debugging, testing, and verification substantially more difficult. Some retimings may also require complicated initialization logic to have the circuit start in an identical initial state. Finally, the changes in the circuit's topology have consequences in other logical and physical synthesis steps that make design closure difficult.

Alternatives
Clock skew scheduling is a related technique for optimizing sequential circuits. Whereas retiming relocates the structural position of the registers, clock skew scheduling moves their temporal position by scheduling the arrival time of the clock signals. The lower bound of the achievable minimum clock period of both techniques is the maximum mean cycle time (i.e. the largest value, over all cycles in the circuit, of the total combinational delay around the cycle divided by the number of registers on it).
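For small graphs this bound can be computed by brute force. The sketch below enumerates every simple cycle once and returns the largest delay-to-register ratio. It assumes integer delays and at least one register on every cycle; the function name `max_cycle_ratio` and the example circuits are hypothetical. Enumeration is exponential in general, so this is only an illustration of the definition.

```python
from fractions import Fraction


def max_cycle_ratio(delay, weights):
    """Largest (total delay / total registers) over all simple cycles."""
    succ = {v: [] for v in delay}
    for (u, v), w in weights.items():
        succ[u].append((v, w))
    order = {v: i for i, v in enumerate(delay)}
    best = None

    def walk(start, v, path_delay, path_regs, visited):
        nonlocal best
        for nxt, w in succ[v]:
            if nxt == start:
                # Closed a cycle: compare its ratio with the best seen so far.
                ratio = Fraction(path_delay, path_regs + w)
                best = ratio if best is None else max(best, ratio)
            elif nxt not in visited and order[nxt] > order[start]:
                # Only visit later vertices so each cycle is found exactly once.
                walk(start, nxt, path_delay + delay[nxt], path_regs + w, visited | {nxt})

    for s in delay:
        walk(s, s, delay[s], 0, {s})
    return best


ring = max_cycle_ratio(
    {"a": 1, "b": 2, "c": 3},
    {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2},
)
print(ring)  # 3
```

For the ring, the total delay of 6 spread over 2 registers gives a bound of 3, which coincides with the largest single vertex delay in that circuit.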