User:Vkadimi/sandbox

Introduction:
In distributed shared memory (DSM) multiprocessor systems, which do not rely on a shared bus, an outstanding transaction buffer stores the requests that a processor or node has sent to the home directory but that have not yet received a response. (The node whose memory a block maps to is called the home node of that block.) In DSM systems, the directory state may be out of sync with the cache state. In addition, because the scheme is not bus-based, messages belonging to requests from multiple processors can overlap. To handle the resulting race conditions and ensure coherence, an outstanding transaction buffer is kept at the processor.

Outstanding transaction buffers can be found in SGI Origin2000 hardware, with a buffer that allows four outstanding requests for each of its two processors.

=== Need and the cases of use: ===

Races due to Out-of-Sync Directory State in DSM systems (case 1) :

 * The home directory is not updated every time the cache changes. If a block is silently evicted from the cache (evicted without notifying the directory), the directory state becomes out of sync with the cache.
 * For example, suppose the directory records a block as exclusive-modified (EM) at a node, but the block is no longer cached there because of a silent eviction. If that node then sends a Read request or a ReadX request (a read-exclusive request, issued for a write by a processor that does not already have the block) to the directory, the directory cannot simply reply with data, because its copy may be stale if the evicted block was dirty. It cannot wait for a flushed block either, because the block may have been clean and evicted silently by the requestor, in which case no flush will ever arrive.
 * This problem cannot be solved by changes at the directory alone. The coherence controller at the processor must help, and this is where the outstanding transaction buffer is needed to ensure correct behavior.

Races due to Non-instantaneous processing of a request (case 2) :

 * In a real system, a request cannot be processed instantaneously because of communication delays, so messages belonging to two different requests can overlap.
 * For example, suppose two nodes, A and B, make requests to the same block, and the home node H holds the directory information. Initially the block is in the shared state at the directory. Node A sends a read request (1 in Fig. 1); the directory replies with data and updates the sharing bit vector to record A as a sharer. The ReplyD message (the home's reply to the requestor carrying the data of the memory block; 2 in Fig. 1) from the directory to node A is delayed. Meanwhile, node B sends a ReadX request (3 in Fig. 1) to the home. Because the home now believes that node A is a sharer, it sends an invalidation (Inv) message (4 in Fig. 1) to A. In networks with multiple paths between a pair of nodes, the later Inv message can reach node A before the earlier ReplyD message.
 * Node A may therefore process the messages in a different order from the one in which the home node sent them. The invalidation is processed first and the data from the home node is installed afterwards, so A ends up holding a copy that the directory considers invalidated. This is called the “early invalidation race”.
 * The race can be resolved in two ways. In the home-centric scheme, the home decides that a request has been fully processed only after receiving an acknowledgement (ACK) message from the requestor. The drawback is high latency, because the home must wait for the ACK messages. In the requestor-assisted scheme, the requestor keeps an outstanding transaction buffer at its coherence controller to enforce the ordering of request processing.
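The early invalidation race can be illustrated with a minimal sketch. It assumes a deliberately naive model of node A that has no outstanding transaction buffer and simply applies each message as it arrives; the function name and the state letters ("I" for invalid, "S" for shared) are hypothetical and chosen only for this illustration.

```python
def naive_final_state(messages):
    """Apply messages in arrival order and return node A's final cache state.

    Hypothetical model: no outstanding transaction buffer, so every message
    is processed immediately, whatever the requestor is waiting for.
    """
    state = "I"  # node A starts without the block
    for msg in messages:
        if msg == "ReplyD":
            state = "S"  # data arrives: the block is cached as shared
        elif msg == "Inv":
            state = "I"  # invalidation: the block is dropped
    return state


# Order in which the home sent the messages: data first, then invalidation.
print(naive_final_state(["ReplyD", "Inv"]))  # I
# Order seen by node A in the race: invalidation overtakes the data.
print(naive_final_state(["Inv", "ReplyD"]))  # S
```

In the reordered case node A ends in the shared state although the home has already invalidated it on behalf of node B, which is the incoherent outcome described above.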

Handling races in DSM using Outstanding Transaction Buffer:
The processor keeps an outstanding transaction buffer at its coherence controller to record the outstanding read, upgrade, and read-exclusive requests it has issued. For an upgrade request, the home node sends the requestor the number of invalidation acknowledgements (InvACK) to expect. The requestor stores this count in the outstanding transaction buffer and decrements it each time an InvACK arrives.
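The InvACK bookkeeping for an upgrade request could look roughly like the following sketch. The class and method names are hypothetical, and a real coherence controller implements this in hardware rather than software.

```python
class UpgradeEntry:
    """Hypothetical buffer entry for one outstanding upgrade request."""

    def __init__(self, expected_acks):
        # Count supplied by the home node in its reply to the upgrade request.
        self.remaining = expected_acks

    @property
    def complete(self):
        """True once every expected InvACK has been received."""
        return self.remaining == 0

    def on_inv_ack(self):
        """Record one InvACK; return True when the upgrade can complete."""
        if self.remaining == 0:
            raise RuntimeError("unexpected InvACK: none outstanding")
        self.remaining -= 1
        return self.complete


entry = UpgradeEntry(expected_acks=2)
print(entry.on_inv_ack())  # False
print(entry.on_inv_ack())  # True
```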
 * In case 1 above, when the processor evicts a dirty block and flushes it to main memory at the home node, it tracks whether the write-back has completed. The requestor records the flush in the outstanding transaction buffer and removes the entry when it receives an acknowledgement from the home. Until that ACK arrives, the processor stalls any Read or ReadX request to the block being flushed. This guarantees that the directory never sees a Read or ReadX request for a block from a node that is still flushing it. The directory can therefore safely reply with data to such a requestor, because the earlier eviction must have been a clean one.
 * In case 2 above, the processor's outstanding transaction buffer holds the requests that have been sent to the home but have not yet received a response, so the initial Read request is stored in the buffer while the requestor waits. The directory responds with data if it is not busy with another request; if it is already processing another request, it sends a NACK message. The requestor must therefore receive either data or a NACK. If it receives any other message first (Inv in this example), it concludes that a protocol race has occurred and does not process that message until the data or NACK arrives. Messages that cannot be processed immediately can be held in a separate buffer and processed later.
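The two cases above can be combined into one toy requestor-side model. This is only a sketch under stated assumptions: the class, its method names, the "FlushAck" message string, and the state letters ("I", "S", "M") are all hypothetical, and retrying after a NACK is left to the caller.

```python
class OutstandingTransactionBuffer:
    """Toy requestor-side buffer; all names here are illustrative only."""

    def __init__(self):
        self.pending = {}      # block address -> request kind awaiting a reply
        self.deferred = {}     # block address -> messages held back in a race
        self.flushing = set()  # blocks whose write-back is not yet acknowledged
        self.state = {}        # block address -> cache state ("I", "S" or "M")

    def start_flush(self, addr):
        """Evict a dirty block and remember that its write-back is in flight."""
        self.flushing.add(addr)
        self.state[addr] = "I"

    def issue(self, addr, kind):
        """Try to send a Read or ReadX; return False if it must stall."""
        if addr in self.flushing or addr in self.pending:
            return False
        self.pending[addr] = kind
        return True

    def receive(self, addr, msg):
        """Handle one incoming message for a block."""
        if msg == "FlushAck":
            self.flushing.discard(addr)  # write-back done: requests may go out
        elif addr not in self.pending:
            self._apply(addr, msg)       # nothing outstanding: process directly
        elif msg in ("ReplyD", "NACK"):
            kind = self.pending.pop(addr)
            if msg == "ReplyD":
                self.state[addr] = "S" if kind == "Read" else "M"
            # The expected reply has arrived, so replay held-back messages.
            for held in self.deferred.pop(addr, []):
                self._apply(addr, held)
        else:
            # Unexpected message while waiting: treat it as a protocol race.
            self.deferred.setdefault(addr, []).append(msg)

    def _apply(self, addr, msg):
        if msg == "Inv":
            self.state[addr] = "I"


otb = OutstandingTransactionBuffer()
otb.issue(0x40, "Read")
otb.receive(0x40, "Inv")     # arrives early, so it is deferred
otb.receive(0x40, "ReplyD")  # data is installed, then the Inv is replayed
print(otb.state[0x40])       # I
```

With the buffer in place, node A ends in the invalid state, which matches the order in which the home sent the messages. The same object also covers case 1: after `start_flush(addr)`, `issue(addr, ...)` returns `False` until `receive(addr, "FlushAck")` has been called.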