Lazy FP state restore

Lazy FPU state leak, also referred to as Lazy FP State Restore or LazyFP, is a security vulnerability affecting Intel Core CPUs. The vulnerability is caused by a combination of flaws in the speculative execution technology present within the affected CPUs and how certain operating systems handle context switching on the floating point unit (FPU). By exploiting this vulnerability, a local process can leak the content of the FPU registers that belong to another process. This vulnerability is related to the Spectre and Meltdown vulnerabilities that were publicly disclosed in January 2018.

It was announced by Intel on 13 June 2018, after being discovered by employees at Amazon, Cyberus Technology and SYSGO.

Besides being used for floating point arithmetic, the FPU registers are also used for other purposes, including for storing cryptographic data when using the AES instruction set, present in many Intel CPUs. This means that this vulnerability may allow for key material to be compromised.

Mechanism
The floating point and SIMD registers are large, and not used by every task (or thread) in the system. To make context switching faster, most common microprocessors support lazy state switching. Rather than storing the full state during a context switch, the operating system can simply mark the FPU "not available" in the hopes that the switched-to task will not need it. If the operating system has guessed correctly, time is saved. If the guess is wrong, the first FPU or SIMD instruction will cause a trap to the operating system, which can then save the state to the previous task and load the correct state for the current task.
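The lazy switching scheme described above can be illustrated with a small simulation. This is only a sketch: the class and method names (`Task`, `LazyFpuCpu`, `context_switch`, and so on) are hypothetical, the register file is reduced to four integers, and the "FPU not available" trap is modelled as an ordinary method call rather than a hardware exception.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    # Saved register image for this task (hypothetical 4-register FPU).
    saved_fpu: list = field(default_factory=lambda: [0] * 4)


class LazyFpuCpu:
    """Toy model of lazy FPU state switching; not real kernel code."""

    def __init__(self):
        self.regs = [0] * 4        # physical FPU registers
        self.fpu_owner = None      # task whose state currently sits in regs
        self.current = None        # task currently running
        self.fpu_available = False
        self.traps = 0             # number of "FPU not available" traps taken

    def context_switch(self, task):
        # Lazy: leave the registers alone and mark the FPU unavailable,
        # unless the incoming task already owns the register contents.
        self.current = task
        self.fpu_available = self.fpu_owner is task

    def _trap_fpu_not_available(self):
        # Trap handler: save the old owner's state, load the current task's.
        self.traps += 1
        if self.fpu_owner is not None:
            self.fpu_owner.saved_fpu = list(self.regs)
        self.regs = list(self.current.saved_fpu)
        self.fpu_owner = self.current
        self.fpu_available = True

    def fpu_write(self, index, value):
        if not self.fpu_available:
            self._trap_fpu_not_available()
        self.regs[index] = value

    def fpu_read(self, index):
        if not self.fpu_available:
            self._trap_fpu_not_available()
        return self.regs[index]


cpu = LazyFpuCpu()
a, b = Task("a"), Task("b")
cpu.context_switch(a)
cpu.fpu_write(0, 111)      # first FPU use by task a: one trap
cpu.context_switch(b)      # task b never touches the FPU: nothing is saved
cpu.context_switch(a)
value = cpu.fpu_read(0)    # task a still owns the registers: no further trap
```

In this model, the round trip through task b costs no save or restore at all, which is the time saving the optimization aims for. Note that while task b runs, the physical registers still hold task a's values.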

In out-of-order CPUs, the "FPU not available" condition is not detected immediately. (In fact, it can hardly be detected immediately: several fault-causing instructions may be executing simultaneously, and the processor must take the first fault in program order to preserve the illusion of in-order execution. Which fault comes first is not known until the in-order retire stage.) The processor speculatively executes the instruction using the previous task's register contents, along with some following instructions, and only later detects the "FPU not available" condition. Although all architectural state is reverted to the beginning of the faulting instruction, it is possible to use part of the FPU state as the address in a memory load, triggering a load into the processor's cache. Exploitation then follows the same pattern as all Spectre-family vulnerabilities: because the cache state is not architectural state (the cache only affects speed, not correctness), the cache load is not undone, and the address, which includes part of the previous task's register state, can later be recovered by measuring the time taken to access different memory addresses.
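The leak through the cache can be sketched with a toy simulation. It does not exploit any real hardware: the cache is modelled as a set of line numbers, a "fast" access is simply a hit in that set, and all names (`ToyCpu`, `transient_fpu_read_and_load`, `recover_byte`) as well as the 64-byte line size are assumptions made for illustration.

```python
LINE = 64  # assumed cache line size in bytes


class ToyCpu:
    """Toy model: the register holds the previous task's data and the FPU
    is marked unavailable for the current task."""

    def __init__(self, stale_register_byte):
        self.fpu_reg = stale_register_byte   # left over from the previous task
        self.fpu_available = False
        self.cached_lines = set()            # microarchitectural state

    def load(self, address):
        # Return True on a cache hit (fast) or False on a miss (slow),
        # then fill the line either way.
        line = address // LINE
        hit = line in self.cached_lines
        self.cached_lines.add(line)
        return hit

    def flush(self):
        self.cached_lines.clear()

    def transient_fpu_read_and_load(self, probe_base):
        # Speculative phase: the stale register value selects a probe line.
        self.load(probe_base + self.fpu_reg * LINE)
        # Retirement: the fault is detected and architectural effects are
        # discarded, but the cache fill above is not undone.
        if not self.fpu_available:
            return None   # nothing reaches the attacker architecturally
        return self.fpu_reg


def recover_byte(cpu, probe_base=0):
    cpu.flush()
    cpu.transient_fpu_read_and_load(probe_base)
    # Probe all 256 candidate lines; the single fast one reveals the byte.
    hits = [v for v in range(256) if cpu.load(probe_base + v * LINE)]
    return hits[0] if len(hits) == 1 else None


cpu = ToyCpu(stale_register_byte=0xA7)
leaked = recover_byte(cpu)   # 0xA7, recovered only from cache hit/miss
```

The attacker never receives the register value as a return value. It is inferred solely from which probe line was already cached, standing in for the timing measurement used on real hardware.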

It is possible to exploit this bug without actually triggering any operating system traps. By placing the FPU access in the shadow of a forced branch misprediction (e.g. using a retpoline) the processor will still speculatively execute the code, but will rewind to the mispredicted branch and never actually execute the operating system trap. This allows the attack to be rapidly repeated, quickly reading out the entire FPU and SIMD register state.

Mitigation
The vulnerability can be mitigated at the operating system and hypervisor levels by always restoring the FPU state when switching process contexts. With such a fix, no firmware upgrade is required. Some operating systems already did not restore the FPU registers lazily by default, which protected them on affected hardware platforms even though the underlying hardware issue existed. On Linux systems running kernel 3.7 or later, the kernel can be forced to restore the FPU registers eagerly by means of a kernel parameter. In addition, many system software vendors and projects, including Linux distributions, OpenBSD, and Xen, have released patches to address the vulnerability.
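The difference between lazy and eager restoring can be sketched as follows. This is a simplified model, not actual operating system code: the function `switch_task`, the task names, and the four-element register list are hypothetical.

```python
def switch_task(regs, saved, prev, nxt, eager):
    """Return (register contents, fpu_available) after switching prev -> nxt.

    regs  : list holding the physical register contents
    saved : dict mapping a task name to its saved register image
    """
    if eager:
        saved[prev] = list(regs)          # save the outgoing state immediately
        return list(saved[nxt]), True     # load the incoming state immediately
    # Lazy: leave the outgoing task's values in place and mark the FPU
    # unavailable, deferring the save/restore to the first FPU instruction.
    return regs, False


saved = {"victim": [0] * 4, "attacker": [0] * 4}
victim_regs = [0xDE, 0xAD, 0xBE, 0xEF]    # stand-in for sensitive register data

lazy_regs, lazy_avail = switch_task(
    list(victim_regs), saved, "victim", "attacker", eager=False)
eager_regs, eager_avail = switch_task(
    list(victim_regs), saved, "victim", "attacker", eager=True)
```

After the lazy switch, the physical registers still contain the victim's values, so a speculative read would observe them. After the eager switch, they contain only the incoming task's own state, leaving nothing of the previous task to leak.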