Persistent memory

In computer science, persistent memory is any method or apparatus for efficiently storing data structures such that they can continue to be accessed using memory instructions or memory APIs even after the end of the process that created or last modified them.

Often confused with non-volatile random-access memory (NVRAM), persistent memory is instead more closely linked to the concept of persistence in its emphasis on program state that exists outside the fault zone of the process that created it. (A process is a program under execution. The fault zone of a process is that subset of program state which could be corrupted by the process continuing to execute after incurring a fault, for instance due to an unreliable component used in the computer executing the program.)

Efficient, memory-like access is the defining characteristic of persistent memory. It can be provided using microprocessor memory instructions, such as load and store. It can also be provided using APIs that implement remote direct memory access (RDMA) actions, such as RDMA read and RDMA write. Other low-latency methods that allow byte-grain access to data also qualify.
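As a rough illustration of byte-grain, memory-like access to data that outlives a mapping, the following sketch uses Python's standard `mmap` module over an ordinary file. The file name `pmem_region.bin`, the region size, and the `store` and `load` helpers are assumptions made for this example. A regular file on a conventional file system only approximates persistent memory; it does not provide the latency of real persistent-memory hardware.

```python
import mmap
import os
import tempfile

# Hypothetical backing file standing in for a persistent-memory region.
path = os.path.join(tempfile.mkdtemp(), "pmem_region.bin")
SIZE = 4096

# Create the region and size it (new bytes read back as zeros).
with open(path, "wb") as f:
    f.truncate(SIZE)

def store(offset: int, data: bytes) -> None:
    """Map the region, write bytes at an offset, and flush the mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), SIZE) as region:
            region[offset:offset + len(data)] = data  # byte-grain store
            region.flush()  # ask the OS to write the mapped pages back

def load(offset: int, length: int) -> bytes:
    """Map the region again (a fresh mapping) and read bytes at an offset."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), SIZE) as region:
            return bytes(region[offset:offset + length])

store(128, b"hello")
value = load(128, 5)  # read through a new mapping, after the first was closed
```

Each call opens a new mapping, so the read in `load` does not depend on any state held by the mapping that performed the write, which loosely mirrors access after the writing process has ended.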

Persistent memory capabilities extend beyond non-volatility of stored bits. For instance, the loss of key metadata, such as page table entries or other constructs that translate virtual addresses to physical addresses, may render durable bits non-persistent. In this respect, persistent memory resembles more abstract forms of computer storage, such as file systems. In fact, almost all existing persistent memory technologies implement at least a basic file system that can be used for associating names or identifiers with stored extents, and at a minimum provide file system methods that can be used for naming and allocating such extents.
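The dependence on metadata can be sketched with a toy model in which a region holds durable bytes plus a directory that maps names to extents. The `Region` class, its fields, and the name `config` are hypothetical and exist only for this illustration; real systems keep such metadata in page tables or file system structures.

```python
from typing import Dict, Optional, Tuple

class Region:
    """Toy model: durable bytes plus a name -> (offset, length) directory."""

    def __init__(self, size: int) -> None:
        self.bits = bytearray(size)                       # the durable bits
        self.directory: Dict[str, Tuple[int, int]] = {}   # the metadata
        self.next_free = 0

    def allocate(self, name: str, data: bytes) -> None:
        """Store data in the next free extent and record it under a name."""
        offset = self.next_free
        if offset + len(data) > len(self.bits):
            raise MemoryError("region is full")
        self.bits[offset:offset + len(data)] = data
        self.directory[name] = (offset, len(data))
        self.next_free += len(data)

    def lookup(self, name: str) -> Optional[bytes]:
        """Return the named extent, or None if its metadata is missing."""
        entry = self.directory.get(name)
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.bits[offset:offset + length])

region = Region(64)
region.allocate("config", b"v=1")
before = region.lookup("config")

region.directory.clear()                    # simulate loss of metadata only
after = region.lookup("config")
still_there = b"v=1" in bytes(region.bits)  # the bits themselves survive
```

After the directory is cleared, the stored bytes are still present in `region.bits`, but `lookup` can no longer find them, which is the sense in which durable bits become non-persistent.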

The read-of-non-persistent-write problem
The read-of-non-persistent-write problem arises in lock-free programs that run on persistent memory. A compare-and-swap (CAS) operation does not by itself persist the value it writes, so the cache coherence protocol can make the modified data visible to a concurrent observer before a crash observer can see it in persistent memory. If a power failure happens after the write has become visible but before it has become persistent, other threads may already have acted on a value that does not survive the crash, causing potential crash inconsistencies.

To illustrate the problem, consider a singly linked lock-free list. A first producer inserts a new node after the head node: the next pointer of the head node is atomically switched (CAS) to point to the new node, but this CAS is not persisted. A second producer then inserts another node after the first new node, which is possible because the CAS that published the first new node is already visible to all concurrent threads. This second CAS atomically switches the next pointer of the first new node to point to the second new node, and this CAS is persisted. If a power failure happens at this point, the application that uses the linked list is left in an inconsistent state, with both new nodes lost, because the next pointer from the head node to the first new node was never persisted. Because the second new node was published but cannot be accessed after a reboot, and other persisted data may be reachable only through it or depend on it, all subsequent accesses to such data fail, causing data loss.
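The linked-list scenario above can be replayed in a small single-threaded simulation. This is a minimal sketch under stated assumptions: the `PCell` class, the node labels `n1` and `n2`, and the `reachable` helper are hypothetical names chosen for the example. Each next pointer is modeled as a word with a cache-visible value and a separately persisted value, and a crash discards whatever was visible but not persisted.

```python
from typing import Dict, List, Optional

class PCell:
    """A pointer-sized word with a cache-visible and a persisted value."""

    def __init__(self, value: Optional[str] = None) -> None:
        self.visible = value
        self.persisted = value

    def cas(self, expected: Optional[str], new: Optional[str]) -> bool:
        """Compare-and-swap on the visible value only; nothing is persisted."""
        if self.visible != expected:
            return False
        self.visible = new
        return True

    def persist(self) -> None:
        """Model an explicit flush of this word to persistent memory."""
        self.persisted = self.visible

    def crash(self) -> None:
        """Model a power failure: only the persisted value survives."""
        self.visible = self.persisted

# Next pointers of the head node and two new nodes (labels are hypothetical).
next_ptr: Dict[str, PCell] = {"head": PCell(), "n1": PCell(), "n2": PCell()}

def reachable() -> List[str]:
    """Walk the list from the head node using the currently visible pointers."""
    out: List[str] = []
    cur = next_ptr["head"].visible
    while cur is not None:
        out.append(cur)
        cur = next_ptr[cur].visible
    return out

# First producer links n1 after head; the CAS is visible but not persisted.
first_ok = next_ptr["head"].cas(None, "n1")

# Second producer sees n1, links n2 after it, and persists that CAS.
second_ok = next_ptr["n1"].cas(None, "n2")
next_ptr["n1"].persist()

before_crash = reachable()      # both nodes are visible to running threads

for cell in next_ptr.values():  # power failure
    cell.crash()

after_crash = reachable()       # the head pointer was lost, so nothing is reachable
```

After the simulated crash, the persisted link from `n1` to `n2` still exists, but neither node can be reached from the head, matching the data loss described above.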

The read-of-non-persistent-write problem is not limited to lock-free linked lists. It can occur in any lock-free data structure in which there can be a gap between the moment a write becomes visible to concurrent threads and the moment it becomes persistent. For instance, a similar problem can occur with persistent circular buffers.