Synchronous Relaxation for Parallel Discrete Event Simulations

Synchronous Relaxation (SR) is a method for performing a discrete event simulation on a parallel computer. SR is general in that it makes no use of the specifics of the simulated system, and it is computationally efficient in that it typically delivers a speedup of the order of $$N / \log(N)$$, where $$N$$ is the number of processors in the parallel computer.

When the SR is applicable
A model of a multi-component dynamic system with discrete events may be suitable for an SR simulation. Time is continuous in such a model. The system state changes only instantaneously, and only by changing the state of a component. Such a change is identified as a discrete event that occurs in that component. The event may affect future events in this and other components. Examples of such models are numerous: telephone network models, models of Ising spins, and models of collision detection among particles, to name a few.

In a circuit-switched telephone network, an event is a change of the occupancy levels of trunks (channels) along a path or a network of trunks (this path or network is called a circuit), when a phone call is allocated (placed) to or de-allocated (removed) from the circuit. The occupancy level of a trunk affects future allocations and de-allocations of this and other trunks via the logic of the allocation algorithm. The comparative behavior of various allocation algorithms is the subject of study in this simulation, where both the SR method of parallel simulation and the term "Synchronous Relaxation" itself, in its application to parallel simulations, were originally introduced.
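An allocation event of this kind can be sketched as follows. The all-trunks-must-have-spare-capacity rule below is a toy assumption for illustration; the article compares various (unspecified) allocation algorithms, and real ones differ.

```python
def try_allocate(occupancy, capacity, circuit):
    """Allocation event for a call on `circuit` (a list of trunk ids):
    the call is placed only if every trunk on the circuit has spare
    capacity; placing it raises each trunk's occupancy level.
    This rule is a toy assumption; real allocation algorithms vary."""
    if all(occupancy[t] < capacity[t] for t in circuit):
        for t in circuit:
            occupancy[t] += 1
        return True
    return False

capacity = {'A': 2, 'B': 1}
occupancy = {'A': 0, 'B': 0}
print(try_allocate(occupancy, capacity, ['A', 'B']))  # → True
print(try_allocate(occupancy, capacity, ['A', 'B']))  # → False: trunk B is full
```

The second call fails because trunk B reached its capacity, which illustrates how one trunk's occupancy affects future allocation events on other trunks sharing a circuit with it.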

A system of Ising spins is a set of atoms in space in which each atom has a few neighbors. Each atom has a state, called its "spin". Atoms change spins at asynchronous random times, which the model assumes to be independent of the spins. When an atom changes its spin, the new spin is a function of the existing spins of its neighbors and itself.
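A single spin-change event might look like the sketch below. The majority rule is an illustrative assumption only; the article does not fix a particular update function, and SR works for any local rule of this shape.

```python
def update_spin(spins, neighbors, atom):
    """Event at `atom`: its new spin is a function of its own spin and
    its neighbors' spins. The majority rule used here is an illustrative
    assumption; any local update function would do.
    `spins` maps atom -> +1/-1; `neighbors` maps atom -> list of atoms."""
    total = spins[atom] + sum(spins[n] for n in neighbors[atom])
    return 1 if total > 0 else -1

# Three atoms on a line; the middle atom flips to agree with its neighbors.
spins = {0: 1, 1: -1, 2: 1}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
print(update_spin(spins, neighbors, 1))  # → 1
```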

In a collision detection model, for example, a model of chaotically colliding billiard balls, the motion of each ball is independent of the other balls as long as their paths do not cross. Occupancy conflicts between pairs of balls are resolved through collisions. At a collision, which the model assumes to be instantaneous, the two participating balls change the directions and speeds of their motions, and then each continues with the next independent segment of its trajectory.
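Detecting whether and when two such path segments cross is a small geometric computation; a sketch, assuming straight-line segments between collisions and equal-radius balls, is:

```python
import math

def collision_time(p1, v1, p2, v2, r):
    """Earliest future time at which two balls of radius r, moving with
    constant velocities v1 and v2 from positions p1 and p2, come into
    contact; None if their paths never cross. Solves
    |(p2 - p1) + (v2 - v1) t| = 2 r for the smallest t >= 0."""
    dp = (p2[0] - p1[0], p2[1] - p1[1])   # relative position
    dv = (v2[0] - v1[0], v2[1] - v1[1])   # relative velocity
    a = dv[0] ** 2 + dv[1] ** 2
    b = 2 * (dp[0] * dv[0] + dp[1] * dv[1])
    c = dp[0] ** 2 + dp[1] ** 2 - (2 * r) ** 2
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:                # parallel motion or no contact
        return None
    t = (-b - math.sqrt(disc)) / (2 * a)  # earlier root = first contact
    return t if t >= 0 else None

# Two unit-radius balls approaching head-on along the x axis:
print(collision_time((0, 0), (1, 0), (10, 0), (-1, 0), 1))  # → 4.0
```

At the returned time the centers are exactly two radii apart, which is the instant the model treats as the (instantaneous) collision event.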

How the SR works
Given a parallel computer with $$N$$ processing elements (PEs), and assuming the number of system components is not smaller than $$N$$, each component is hosted by a PE, and each PE is responsible for producing the history of the events of the components it hosts.

Each PE keeps track of a simulated time $$t$$ such that on the interval $$[0, t]$$ all histories are known; this $$t$$ is called the committed time. The committed time increases in steps, in synchrony among all PEs; its value is common to all PEs. Each step consists of iterations, also performed in synchrony among all PEs, in the sense that no PE starts an iteration before all PEs finish the previous one.

At each iteration, each PE produces a tentative history of the components it hosts beyond the current $$t$$. The PE extends its local history until its local time reaches $$t + \Delta t$$, where $$\Delta t$$ is the step size of the committed time increases. Since the components, in general, depend on each other, a PE needs to know the correct local histories of the other PEs in order to produce a correct local history of its own. But these are not known, because the other PEs are in the same quandary.

Successive iterations resolve the quandary. During the first iteration, each PE makes the simplest assumption about the histories of the other PEs; for example, it may assume that the other PEs have no events on the interval $$(t, t + \Delta t]$$. This assumption enables it to produce its own history on $$(t, t + \Delta t]$$. During subsequent iterations, if a PE needs to know the local history of another PE, it uses the history that PE generated in the previous iteration. The goal of producing correct histories on $$(t, t + \Delta t]$$ is achieved at an iteration at which the history generated by every PE is the same as the one it generated at the previous iteration. Once this happens, the new committed time for all PEs becomes $$t + \Delta t$$.
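The step-and-iterate scheme above can be sketched as follows. The toy model (PE 0 self-generates one event; every event at PE $$i-1$$ triggers one at PE $$i$$ a fixed delay later) and the names `simulate_local` and `sr_step` are assumptions for illustration, not part of SR itself; a real implementation would run the PEs concurrently with a barrier between iterations rather than in a Python loop.

```python
def simulate_local(pe, t, dt, assumed):
    """Produce PE `pe`'s tentative event times on (t, t + dt], given the
    histories `assumed` of all PEs from the previous iteration.
    Toy model (an assumption): PE 0 has a self-generated event at t + 0.3;
    every event at PE i-1 at time s triggers an event at PE i at s + 0.25."""
    events = []
    if pe == 0:
        events.append(round(t + 0.3, 10))
    else:
        for s in assumed[pe - 1]:
            s2 = round(s + 0.25, 10)
            if t < s2 <= t + dt:
                events.append(s2)
    return sorted(events)

def sr_step(n_pes, t, dt):
    """One committed-time step of Synchronous Relaxation: iterate all PEs
    in synchrony until every PE reproduces the history it produced at the
    previous iteration (a fixed point), then commit t + dt."""
    histories = [[] for _ in range(n_pes)]  # first-iteration assumption: no events
    iterations = 0
    while True:
        iterations += 1
        new = [simulate_local(pe, t, dt, histories) for pe in range(n_pes)]
        if new == histories:                # converged: histories are correct
            return histories, t + dt, iterations
        histories = new

hist, t_committed, iters = sr_step(n_pes=4, t=0.0, dt=1.0)
print(hist)               # → [[0.3], [0.55], [0.8], []]
print(t_committed, iters) # → 1.0 4
```

Each iteration propagates correct history one PE further down the dependency chain; PE 3's triggered event falls outside $$(0, 1]$$, so its history on this step is empty, and the step converges on the fourth iteration.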

At each iteration of the SR, at least one additional event is correctly determined: the earliest event whose history is not yet correct can depend only on earlier events, which are already correct, so it is produced correctly at the next iteration. Therefore, assuming there is a finite number of events to simulate over the simulated time interval $$(t, t + \Delta t]$$, the SR iterations converge for this interval.

Why the SR is usually efficient
The causal diagram consisting of the events with timestamps restricted to the interval $$(t, t + \Delta t]$$ and of the cause→effect links among them is a directed acyclic graph (DAG). The set of events of the DAG is finite, and it can be split into a finite sequence of, say, $$n$$ layers:

- layer 1 consists of events not pointed to by any link;

- layer 2 consists of events that must be pointed to by links starting from events of layer 1 and must not be pointed to by any other link;

- layer 3 consists of events that must be pointed to by links starting from events of layer 2, that may also be pointed to by links starting from events of layers 1 and 2, and that must not be pointed to by any other link;

...

- layer $$n$$ consists of events that must be pointed to by links starting from events of layer $$n-1$$, that may also be pointed to by links starting from events of layers $$1, 2, \dots, n-1$$, and that must not be pointed to by any other link.
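The layering above assigns each event to layer $$1$$ plus the largest layer among its direct causes, which can be computed as a longest-path depth over the DAG. A minimal sketch, assuming events are given as ids and links as (cause, effect) pairs:

```python
from collections import defaultdict

def dag_layers(events, links):
    """Split a causal DAG into layers as described above: an event's layer
    is 1 + the largest layer among its direct causes (sources get layer 1).
    `events` is an iterable of event ids; `links` is a list of
    (cause, effect) pairs. Assumes the graph is acyclic."""
    preds = defaultdict(list)
    for cause, effect in links:
        preds[effect].append(cause)

    layer = {}
    def depth(e):                      # memoized longest-path depth of event e
        if e not in layer:
            layer[e] = 1 + max((depth(p) for p in preds[e]), default=0)
        return layer[e]

    for e in events:
        depth(e)
    n = max(layer.values(), default=0)
    return [sorted(e for e in events if layer[e] == k) for k in range(1, n + 1)]

# A small hypothetical causal diagram: a and b are root events,
# a causes c, and both b and c cause d.
print(dag_layers(['a', 'b', 'c', 'd'],
                 [('a', 'c'), ('b', 'd'), ('c', 'd')]))
# → [['a', 'b'], ['c'], ['d']]
```

Note that `d` lands in layer 3 even though it is also pointed to by the layer-1 event `b`: consistent with the definitions above, only the latest-layer cause determines an event's layer.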