
In a computer operating system that uses paging for virtual memory management, page replacement algorithms decide which memory pages to page out (swap out, write to disk) when a page of memory needs to be allocated. Paging happens when a page fault occurs and a free page cannot be used to satisfy the allocation, either because there are none, or because the number of free pages is lower than some threshold.

When the page that was selected for replacement and paged out is referenced again it has to be paged in (read in from disk), and this involves waiting for I/O completion. This determines the quality of the page replacement algorithm: the less time waiting for page-ins, the better the algorithm. A page replacement algorithm looks at the limited information about accesses to the pages provided by hardware, and tries to guess which pages should be replaced to minimize the total number of page misses, while balancing this with the costs (primary storage and processor time) of the algorithm itself.

From the perspective of competitive analysis, the page replacement problem is a typical online problem, in the sense that the optimal deterministic algorithm is known.

History
Page replacement algorithms were a hot topic of research and debate in the 1960s and 1970s. That mostly ended with the development of sophisticated LRU approximations and working set algorithms. Since then, some basic assumptions made by the traditional page replacement algorithms were invalidated, resulting in a revival of research. In particular, the following trends in the behavior of underlying hardware and user-level software have affected the performance of page replacement algorithms:


 * Size of primary storage has increased by multiple orders of magnitude. With several gigabytes of primary memory, algorithms that require a periodic check of each and every memory frame are becoming less and less practical.
 * Memory hierarchies have grown taller. The cost of a CPU cache miss is far higher than it once was, which exacerbates the previous problem.
 * Locality of reference of user software has weakened. This is mostly attributed to the spread of object-oriented programming techniques that favor large numbers of small functions, use of sophisticated data structures like trees and hash tables that tend to result in chaotic memory reference patterns, and the advent of garbage collection that drastically changed memory access behavior of applications.

Requirements for page replacement algorithms have changed due to differences in operating system kernel architectures. In particular, most modern OS kernels have unified virtual memory and file system caches, requiring the page replacement algorithm to select a page from among the pages of both user program virtual address spaces and cached files. The latter pages have specific properties. For example, they can be locked, or can have write ordering requirements imposed by journaling. Moreover, as the goal of page replacement is to minimize total time waiting for memory, it has to take into account memory requirements imposed by other kernel sub-systems that allocate memory. As a result, page replacement in modern kernels (Linux, FreeBSD, and Solaris) tends to work at the level of a general purpose kernel memory allocator, rather than at the higher level of a virtual memory subsystem.

The theoretically optimal page replacement algorithm
The theoretically optimal page replacement algorithm (also known as OPT, the clairvoyant replacement algorithm, or Bélády's optimal page replacement policy) works as follows: when a page needs to be swapped in, the operating system swaps out the page whose next use will occur farthest in the future. For example, a page that is not going to be used for the next 6 seconds will be swapped out over a page that is going to be used within the next 0.4 seconds. This algorithm cannot be implemented in a general-purpose operating system because it is impossible to compute reliably how long it will be before a page is going to be used. But due to its optimality, it is a good baseline for measuring the efficiency of an online algorithm by comparing it against the optimal one.
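Because OPT needs the entire reference string in advance, it can only be run offline, for example in a simulator. The following is a minimal Python sketch (the function name and sample reference string are illustrative, not from any particular system) that counts page faults for OPT:

```python
def opt_faults(refs, frames):
    """Count page faults for Belady's OPT on a fully known reference string."""
    cache, faults = set(), 0
    for i, page in enumerate(refs):
        if page in cache:
            continue                      # hit: nothing to do
        faults += 1
        if len(cache) < frames:
            cache.add(page)
            continue
        # Evict the resident page whose next use lies farthest in the
        # future; pages that are never used again are evicted first.
        def next_use(p):
            try:
                return refs.index(p, i + 1)
            except ValueError:
                return float('inf')
        cache.remove(max(cache, key=next_use))
        cache.add(page)
    return faults

print(opt_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # 7 faults
```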

The (h,k)-Paging problem
The (h,k)-paging problem is a generalization of the paging problem: let h and k be positive integers such that $$h \leq k$$. We measure the performance of an online algorithm with a cache of size $$k$$ relative to the theoretically optimal page replacement algorithm with a cache of size $$h$$. If $$h<k$$, we provide the optimal algorithm with strictly fewer resources.

The (h,k)-paging problem is a way to measure how an online algorithm performs by comparing it with the performance of the optimal algorithm, specifically by parameterizing the cache sizes of the online algorithm and the optimal algorithm separately. The competitive ratio is used to describe the comparison.

Analysis of randomized paging algorithms
For randomized paging algorithms, it is necessary to define the "worst case", just as for deterministic algorithms. The adversary model is useful here. Specifically, when considering randomized paging algorithms, there are three types of adversaries:


 * OBL (oblivious): must construct the request sequence in advance and pays optimally.
 * ADON (adaptive-online): serves the current request online and then chooses the next request based on the online algorithm's actions.
 * ADOF (adaptive-offline): chooses the next request based on the online algorithm's actions thus far, but pays the optimal offline cost to service the resulting sequence.

These three adversaries make precise the sense in which a randomized algorithm can be compared with the offline optimal algorithm.

Some general classes of algorithms
It is convenient to analyze the performance of paging algorithms by classifying them into general classes and proving bounds for those classes.

Marking algorithms
Marking algorithms are a general class of paging algorithms. Each page is associated with a bit called its mark. Initially, all pages are unmarked. During a stage of page requests, a page is marked when it is first requested in that stage. A marking algorithm never pages out a marked page.

If ALG is a marking algorithm with a cache of size $$k$$, and OPT is the optimal algorithm with a cache of size $$h \leq k$$, then ALG is $$ \dfrac{k}{k-h+1}$$-competitive, so every marking algorithm attains the $$ \dfrac{k}{k-h+1}$$ competitive ratio. For example, when $$h = k$$ the ratio is simply $$k$$.

LRU and CLOCK are marking algorithms while FIFO is not a marking algorithm.

Conservative algorithms
An algorithm is conservative if, on any consecutive request sequence containing k or fewer distinct page references, it incurs k or fewer page faults.

If ALG is a conservative algorithm with a cache of size $$k$$, and OPT is the optimal algorithm with a cache of size $$h \leq k$$, then ALG is $$ \dfrac{k}{k-h+1}$$-competitive, so every conservative algorithm attains the $$ \dfrac{k}{k-h+1}$$ competitive ratio.

LRU, FIFO and CLOCK are conservative algorithms.

Page replacement algorithms
There are a variety of page replacement algorithms:

Not recently used
Replace a page that has not been referenced or modified recently. The not recently used (NRU) page replacement algorithm favours keeping pages in memory that have been recently used. It works on the following principle: when a page is referenced, a referenced bit is set for that page, marking it as referenced. Similarly, when a page is modified (written to), a modified bit is set. The setting of the bits is usually done by the hardware, although it is possible to do so at the software level as well.

At a certain fixed time interval, the clock interrupt triggers and clears the referenced bit of all the pages, so only pages referenced within the current clock interval are marked with a referenced bit. When a page needs to be replaced, the operating system divides the pages into four classes:

 * Class 0: not referenced, not modified
 * Class 1: not referenced, modified
 * Class 2: referenced, not modified
 * Class 3: referenced, modified

Although it does not seem possible for a page to be modified yet not referenced, this happens when a class 3 page has its referenced bit cleared by the clock interrupt. The NRU algorithm picks a random page from the lowest non-empty class for removal. Note that this algorithm implies that a page that was modified but not referenced (within the current clock interval) is less important than a not modified page that is intensely referenced.
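The victim-selection step can be sketched minimally in Python, assuming each page is represented by a hypothetical object with boolean referenced and modified attributes:

```python
import random

def nru_select_victim(pages):
    """Pick a random page from the lowest non-empty NRU class.

    `pages` is assumed to be a list of objects with boolean
    `referenced` and `modified` attributes (an illustrative structure).
    """
    classes = {0: [], 1: [], 2: [], 3: []}
    for page in pages:
        # Class = 2*R + M: class 0 is (not referenced, not modified),
        # class 3 is (referenced, modified).
        classes[2 * page.referenced + page.modified].append(page)
    for c in range(4):
        if classes[c]:
            return random.choice(classes[c])
```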

NRU is a marking algorithm, so it is $$ \dfrac{k}{k-h+1}$$-competitive.

First-in, first-out
Replace the page that has stayed in the cache the longest. The simplest page replacement algorithm is the first-in, first-out (FIFO) algorithm, a low-overhead algorithm that requires little book-keeping on the part of the operating system. The idea is obvious from the name: the operating system keeps all the pages in memory in a queue, with the most recent arrival at the back and the earliest arrival at the front. When a page needs to be replaced, the page at the front of the queue (the oldest page) is selected. While FIFO is cheap and intuitive, it performs poorly in practice, so it is rarely used in its unmodified form. This algorithm exhibits Bélády's anomaly, as illustrated by the sketch below.
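The following minimal Python sketch counts FIFO faults; the sample reference string is the classic one demonstrating Bélády's anomaly, where adding a frame increases the number of faults:

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults for FIFO with a fixed number of frames."""
    queue, resident, faults = deque(), set(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(queue) == frames:          # evict the oldest page
            resident.discard(queue.popleft())
        queue.append(page)
        resident.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9 faults
print(fifo_faults(refs, 4))  # 10 faults: more frames, yet more faults
```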

The FIFO page replacement algorithm is used by the VAX/VMS operating system, with some modifications. Partial second chance is provided by skipping a limited number of entries with valid translation table references, and additionally, pages are displaced from the process working set to a systemwide pool from which they can be recovered if not already re-used.

FIFO is a conservative algorithm, so it is $$ \dfrac{k}{k-h+1}$$-competitive.

Second-chance
Replace the oldest page in the cache whose referenced bit is clear. A modified form of the FIFO page replacement algorithm, known as the second-chance page replacement algorithm, fares relatively better than FIFO at little extra cost. It works by looking at the front of the queue as FIFO does, but instead of immediately paging that page out, it checks whether its referenced bit is set. If the bit is not set, the page is swapped out. Otherwise, the referenced bit is cleared, the page is inserted at the back of the queue (as if it were a new page), and the process is repeated. This can also be thought of as a circular queue. If all the pages have their referenced bit set, then on the second encounter of the first page in the list that page will be swapped out, as its referenced bit has by then been cleared; in that case second chance degenerates into pure FIFO.

As its name suggests, Second-chance gives every page a "second-chance" - an old page that has been referenced is probably in use, and should not be swapped out over a new page that has not been referenced.
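The eviction step can be sketched as follows, assuming the queue holds (page, referenced bit) pairs; this is an illustration, not any particular kernel's implementation:

```python
from collections import deque

def second_chance_evict(queue):
    """Pop pages from the front of a FIFO queue of (page, referenced) pairs.

    A page whose referenced bit is set gets a second chance: the bit is
    cleared and the page is requeued at the back as if it had just arrived.
    The first page found with a clear bit is the victim.
    """
    while True:
        page, referenced = queue.popleft()
        if not referenced:
            return page
        queue.append((page, False))   # clear the bit and requeue

# Example: the oldest page (1) was referenced, so page 2 is evicted instead.
q = deque([(1, True), (2, False), (3, True)])
print(second_chance_evict(q))  # 2
```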

Clock
Clock is a more efficient version of FIFO than second-chance because pages don't have to be constantly pushed to the back of the list, but it performs the same general function as second-chance. The clock algorithm keeps a circular list of pages in memory, with the "hand" (iterator) pointing to the last examined page frame in the list. When a page fault occurs and no empty frames exist, then the R (referenced) bit is inspected at the hand's location. If R is 0, the new page is put in place of the page the "hand" points to, otherwise the R bit is cleared. Then, the clock hand is incremented and the process is repeated until a page is replaced.
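One sweep of the hand can be sketched as below, assuming the referenced bits of the page frames are held in a list indexed by frame number (an illustrative representation):

```python
def clock_evict(ref_bits, hand):
    """Advance the hand until a frame with a clear referenced bit is found.

    `ref_bits` holds one referenced bit per page frame in the circular
    list; `hand` is the current hand position. Returns the index of the
    frame to replace and the new hand position.
    """
    n = len(ref_bits)
    while ref_bits[hand]:
        ref_bits[hand] = 0        # clear the bit in place: a second chance
        hand = (hand + 1) % n
    victim = hand                 # R == 0: replace the page in this frame
    return victim, (hand + 1) % n
```

Unlike second-chance, no page is ever moved in the list; only the hand advances, which is why clock is cheaper.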

CLOCK is a conservative algorithm, so it is $$ \dfrac{k}{k-h+1}$$-competitive.

Variants on Clock

 * GCLOCK: Generalized clock page replacement algorithm.
 * Clock-Pro keeps a circular list of information about recently-referenced pages, including all M pages in memory as well as the most recent M pages that have been paged out. This extra information on paged-out pages, like the similar information maintained by ARC, helps it work better than LRU on large loops and one-time scans.
 * WSClock. The "aging" algorithm and the "WSClock" algorithm are probably the most important page replacement algorithms in practice.
 * Clock with Adaptive Replacement (CAR) is a page replacement algorithm that has performance comparable to ARC, and substantially outperforms both LRU and CLOCK. The CAR algorithm is self-tuning and requires no user-specified magic parameters.

Least recently used
Replace the page whose most recent request was earliest. The least recently used (LRU) page replacement algorithm, though similar in name to NRU, differs in that LRU keeps track of page usage over a short period of time, while NRU just looks at the usage in the last clock interval. LRU works on the idea that pages that have been most heavily used in the past few instructions are most likely to be used heavily in the next few instructions too. While LRU can provide near-optimal performance in theory (almost as good as Adaptive Replacement Cache), it is rather expensive to implement in practice. There are a few implementation methods for this algorithm that try to reduce the cost while keeping as much of the performance as possible.

The most expensive method is the linked list method, which uses a linked list containing all the pages in memory. At the back of this list is the least recently used page, and at the front is the most recently used page. The cost of this implementation lies in the fact that items in the list have to be moved on every memory reference, which is a very time-consuming process.
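In Python, the linked-list method can be sketched with collections.OrderedDict, which maintains a doubly linked list internally; this is an illustrative sketch, not a production implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Linked-list style LRU: most recently used page at one end."""

    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()          # doubly linked list under the hood

    def access(self, page):
        """Touch a page; return True on a page fault."""
        if page in self.pages:
            self.pages.move_to_end(page)    # becomes most recently used
            return False
        if len(self.pages) == self.frames:
            self.pages.popitem(last=False)  # evict least recently used
        self.pages[page] = True
        return True

cache = LRUCache(3)
faults = sum(cache.access(p) for p in [1, 2, 3, 1, 4, 5])
print(faults)  # 5: each distinct page faults once; the second access to 1 hits
```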

Another method requires hardware support: suppose the hardware has a 64-bit counter that is incremented at every instruction. Whenever a page is accessed, the current counter value is stored in that page's entry. Whenever a page needs to be replaced, the operating system selects the page with the lowest counter value and swaps it out. With present hardware, this is not feasible because the OS would need to examine the counter of every page in memory.

Because of implementation costs, one may consider algorithms (like those that follow) that are similar to LRU, but which offer cheaper implementations.

One important advantage of the LRU algorithm is that it is amenable to full statistical analysis. It has been proven, for example, that LRU can never result in more than N times more page faults than the OPT algorithm, where N is proportional to the number of pages in the managed pool.

On the other hand, LRU's weakness is that its performance tends to degenerate under many quite common reference patterns. For example, if there are N pages in the LRU pool, an application executing a loop over an array of N + 1 pages will cause a page fault on every access. As loops over large arrays are common, much effort has been put into modifying LRU to work better in such situations. Many of the proposed LRU modifications try to detect looping reference patterns and switch to a suitable replacement algorithm, such as Most Recently Used (MRU).

LRU is a marking algorithm, so it is $$ \dfrac{k}{k-h+1}$$-competitive.

Variants on LRU

 * LRU-K improves greatly on LRU with regard to locality in time. It is also known as LRU-2 for the case K = 2; LRU-1 (i.e. K = 1) is the same as normal LRU.
 * The ARC algorithm extends LRU by maintaining a history of recently evicted pages and uses this to change preference to recent or frequent access. It is particularly resistant to sequential scans.

A comparison of ARC with other algorithms (LRU, MQ, 2Q, LRU-2, LRFU, LIRS) can be found in Megiddo & Modha.

Not frequently used
Replace the page that was used the fewest times while in the cache.

The not frequently used (NFU) page replacement algorithm requires a counter, and every page has one counter of its own which is initially set to 0. At each clock interval, all pages that have been referenced within that interval will have their counter incremented by 1. In effect, the counters keep track of how frequently a page has been used. Thus, the page with the lowest counter can be swapped out when necessary.
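A minimal sketch of the two NFU operations, assuming the counters are kept in a plain dictionary keyed by page (an illustrative representation):

```python
def nfu_tick(counters, referenced):
    """At each clock interval, bump the counter of every referenced page."""
    for page in referenced:
        counters[page] = counters.get(page, 0) + 1

def nfu_select_victim(counters, resident):
    """Evict the resident page with the lowest use counter."""
    return min(resident, key=lambda page: counters.get(page, 0))

counters = {}
nfu_tick(counters, referenced={"A", "B"})   # clock interval 1
nfu_tick(counters, referenced={"A"})        # clock interval 2
print(nfu_select_victim(counters, resident={"A", "B", "C"}))  # C, never referenced
```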

The main problem with NFU is that it keeps track of the frequency of use without regard to the time span of use. Thus, in a multi-pass compiler, pages that were heavily used during the first pass but are not needed in the second pass will be favoured over pages that are comparably lightly used in the second pass, as the former have higher frequency counters. This results in poor performance. Other common scenarios exist where NFU performs similarly, such as an OS boot-up. Fortunately, a similar and better algorithm exists: aging.

The not frequently used page-replacement algorithm generates fewer page faults than the least recently used page replacement algorithm when the page table contains null pointer values.

Random
Replace a random page in memory. The random replacement algorithm eliminates the overhead cost of tracking page references. It usually fares better than FIFO, and for looping memory references it is better than LRU, although generally LRU performs better in practice. OS/390 uses a global LRU approximation and falls back to random replacement when LRU performance degenerates, and the Intel i860 processor used a random replacement policy (Rhodehamel 1989).
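In code, the whole policy reduces to a single uniform choice, as this minimal sketch shows:

```python
import random

def random_evict(resident):
    """Pick the victim uniformly at random; no reference bits or queues."""
    return random.choice(list(resident))
```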

The competitive ratio of the random algorithm is $$ \dfrac{k}{k-h+1}$$ against an adaptive-online adversary, and it is at least $$ \dfrac{k}{k-h+1}$$-competitive against an oblivious adversary.

Randomized marking algorithm
The randomized marking algorithm is a marking algorithm. Initially, all pages are unmarked. When a request arrives for a page p that is in the cache but not marked, p is marked. Otherwise, if p is not in the cache, the algorithm randomly chooses an unmarked page in the cache to swap out, and p is swapped in and marked; if all pages in the cache are marked, they are all unmarked first. The analysis of the randomized marking algorithm is more involved: it is $$2H_k$$-competitive against an oblivious adversary, where $$H_k$$ is the kth harmonic number. Specifically, if the total number of pages is $$n$$ and $$n=k+1$$, the algorithm is $$H_k$$-competitive against an oblivious adversary.
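The request-handling step described above can be made runnable as the following Python sketch (class and method names are illustrative):

```python
import random

class RandomizedMarking:
    """Sketch of the randomized marking algorithm for a cache of size k."""

    def __init__(self, k):
        self.k = k
        self.cache = set()
        self.marked = set()

    def request(self, page):
        """Serve one page request; return True on a page fault."""
        if page in self.cache:
            self.marked.add(page)          # hit: mark the page
            return False
        if len(self.cache) == self.k:
            if self.marked == self.cache:  # all marked: a new stage begins
                self.marked.clear()
            victim = random.choice(list(self.cache - self.marked))
            self.cache.remove(victim)      # evict a random unmarked page
        self.cache.add(page)
        self.marked.add(page)              # the new page enters marked
        return True
```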

Issues in operating systems
In practice, besides the algorithms themselves, some issues are usually considered from the system perspective.

To increase performance, paging systems may employ various extra strategies on top of the basic paging algorithms, such as pre-cleaning and anticipatory paging; see paging for reference. In addition, the working set model is a common rationale that prevents thrashing while keeping the degree of multiprogramming as high as possible.