Intel 5-level paging

Intel 5-level paging, referred to simply as 5-level paging in Intel documents, is a processor extension for the x86-64 line of processors. It extends the size of virtual addresses from 48 bits to 57 bits by adding one more level to x86-64's multilevel page tables, increasing the addressable virtual memory from 256 TiB to 128 PiB. The extension was first implemented in the Ice Lake processors.

Technology
In the 4-level paging scheme (previously known as IA-32e paging), the translated portion of the 64-bit virtual memory address is divided into five parts. The lowest 12 bits contain the offset within the 4 KiB memory page, and the following 36 bits are evenly divided into four 9-bit indices, one for each of the four paging levels, each selecting a 64-bit page table entry in a 512-entry page table. This makes it possible to use bits 0 through 47 of the virtual address, for a total of 256 TiB.

5-level paging adds another 9-bit page table index, making it possible to use bits 0 through 56. This multiplies the address space by 512 and increases the limit to 128 PiB.
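The bit layout described above can be illustrated with a short Python sketch. The function names `split_virtual_address` and `address_space_bytes` are made up for this illustration and do not come from any Intel document or operating system; the sketch assumes 4 KiB pages only.

```python
OFFSET_BITS = 12  # offset within a 4 KiB page
INDEX_BITS = 9    # one index per paging level, 512 entries per table


def split_virtual_address(vaddr, levels):
    """Split a virtual address into per-level table indices and a page offset.

    The indices are returned from the top-level table down to the last level.
    Bits above the translated range are ignored here.
    """
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    indices = []
    for level in reversed(range(levels)):
        shift = OFFSET_BITS + INDEX_BITS * level
        indices.append((vaddr >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset


def address_space_bytes(levels):
    """Size of the virtual address space covered by the given number of levels."""
    return 1 << (OFFSET_BITS + INDEX_BITS * levels)
```

With `levels=4` the address space comes out as 2^48 bytes (256 TiB), and with `levels=5` as 2^57 bytes (128 PiB), a factor of 512 apart.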

With 5-level paging enabled, bits 57 through 63 must be copies of bit 56. This is the same rule as with 4-level paging, where the high-order bits of a virtual address that do not participate in address translation must be the same as the most significant implemented bit. 5-level paging is enabled by setting bit 12 of the CR4 register (known as LA57). The bit takes effect only when the processor is operating in 64-bit mode, and it may be modified only when the processor is not in that mode. If the bit is not set, or if the 5-level paging feature is not supported, the processor uses the 4-level page table structure when operating in 64-bit mode. This is similar to Physical Address Extension (PAE), where a third level of page tables, allowing 36-bit addressing, was enabled by setting a bit in the CR4 register.
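The rule on the high-order bits can be expressed as a small check. The following Python sketch is illustrative only; `is_canonical` and `sign_extend` are hypothetical helper names, and `va_bits` is 48 for 4-level paging or 57 for 5-level paging.

```python
def is_canonical(vaddr, va_bits):
    """Return True if bits va_bits..63 are all copies of bit va_bits - 1."""
    # Keep the most significant implemented bit and everything above it.
    upper = vaddr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


def sign_extend(vaddr, va_bits):
    """Copy bit va_bits - 1 into all higher bits of a 64-bit address."""
    low = vaddr & ((1 << va_bits) - 1)
    if low >> (va_bits - 1):
        low |= ((1 << 64) - 1) ^ ((1 << va_bits) - 1)
    return low
```

For example, the address `0x0000800000000000` has bit 47 set and all higher bits clear, so it fails the check for `va_bits=48` but passes it for `va_bits=57`.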

Future processors may allow a full 64-bit virtual address space by extending the size of page table descriptors to 12 bits (4096 page table entries) and the memory offset to 16 bits (64 KiB page size) in the 4-level paging scheme or 21 bits (2 MiB page size) in the 5-level scheme. Extending the page table entry size from 64 to 128 bits would allow arbitrary page sizes, as additional hardware flags would change the size and operation of descriptors on lower paging levels.

Drawbacks
Adding another level of indirection makes page table "walks" longer. A page table walk occurs when either the processor's memory management unit or the memory management code in the operating system navigates the tree of page tables to find the page table entry corresponding to a virtual address. With 5-level paging, in the worst case, the processor or the memory manager has to access physical memory six times for a single virtual memory access (five table lookups plus the access itself), rather than five times as on earlier x86-64 processors. This results in slightly reduced memory access speed. In practice this cost is greatly mitigated by caches such as the translation lookaside buffer (TLB). Future extensions may reduce page walks by limiting the virtual address space per application, with dedicated hardware flags in an extended 128-bit page table entry, and by allowing larger 64 KiB or 2 MiB page sizes while keeping backward compatibility with 4 KiB page operations.
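The worst-case access count can be reproduced with a toy model. The Python sketch below is a simplification made up for illustration, not a model of real hardware: each page table entry holds only the address of the next table or frame, with no flag bits, no large pages and no TLB. The names `ToyMemory`, `build_mapping` and `load` are hypothetical.

```python
import itertools


class ToyMemory:
    """Dictionary-backed memory that counts every read."""

    def __init__(self):
        self.cells = {}
        self.reads = 0

    def read(self, addr):
        self.reads += 1
        return self.cells[addr]


def table_index(vaddr, level):
    """9-bit index for the given level (level 0 is the last table walked)."""
    return (vaddr >> (12 + 9 * level)) & 0x1FF


def build_mapping(levels, vaddr, value):
    """Create just enough tables to map vaddr, and store value at that address."""
    mem = ToyMemory()
    frames = itertools.count(0x1000, 0x1000)  # hand out fresh 4 KiB frames
    root = next(frames)
    table = root
    for level in reversed(range(levels)):
        slot = table + 8 * table_index(vaddr, level)  # entries are 8 bytes wide
        table = next(frames)  # next table, or the data frame after the last level
        mem.cells[slot] = table
    mem.cells[table + (vaddr & 0xFFF)] = value
    return mem, root


def load(mem, root, vaddr, levels):
    """Walk the tables from the root, then read the data itself."""
    table = root
    for level in reversed(range(levels)):
        table = mem.read(table + 8 * table_index(vaddr, level))
    return mem.read(table + (vaddr & 0xFFF))
```

After `mem, root = build_mapping(5, 0x7FFF12345678, 42)`, a call to `load(mem, root, 0x7FFF12345678, 5)` returns 42 and leaves `mem.reads` at 6; the same experiment with four levels performs 5 reads.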

Implementation
5-level paging is implemented by the Ice Lake microarchitecture, by AMD EPYC 9004 and 8004 series processors, and by the Storm Peak Ryzen Threadripper PRO 7900WX series.

Support for the extension was submitted as a set of patches to the Linux kernel on 8 December 2016, and the 4.14 Linux kernel added support for it. As was reported on the Linux kernel mailing list, the work consisted of extending the Linux memory model to use five levels rather than four. This is because, although Linux abstracts the details of the page tables, it still depends on having a fixed number of levels in its own representation. When an architecture supports fewer levels, Linux emulates extra levels that do nothing. A similar change was previously made to extend from three levels to four.
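On a Linux system, one way to see whether the processor advertises the feature is to look for the `la57` flag in `/proc/cpuinfo`. The Python sketch below parses the text of that file; the function name `cpu_advertises_la57` is made up for this example, and the flag indicates only that the CPU reports the capability, not that the running kernel has enabled 5-level paging.

```python
def cpu_advertises_la57(cpuinfo_text):
    """Return True if the first 'flags' line of /proc/cpuinfo text lists la57."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "la57" in value.split()
    return False  # no flags line found (for example, on a non-x86 system)
```

The function would typically be called with the result of reading `/proc/cpuinfo`, for instance `cpu_advertises_la57(open("/proc/cpuinfo").read())`.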

Windows 10 and Windows 11, along with their server counterparts, also support this extension in recent updates, where it is provided by a separate kernel image called ntkrla57.exe.