Steamroller (microarchitecture)

AMD Steamroller Family 15h is a microarchitecture developed by AMD for AMD APUs, which succeeded Piledriver at the beginning of 2014 as the third-generation Bulldozer-based microarchitecture. Steamroller APUs continue to use two-core modules like their predecessors, while aiming for greater parallelism.

Microarchitecture
Steamroller retains the two-core modules of the Bulldozer and Piledriver designs, an approach called clustered multi-threading (CMT), in which each module is marketed as a dual-core processor. The focus of Steamroller is greater parallelism. Improvements center on independent instruction decoders for each core within a module, a 25% greater maximum dispatch width per thread, better instruction schedulers, an improved perceptron branch predictor, larger and smarter caches, up to 30% fewer instruction cache misses, a branch misprediction rate reduced by 20%, a dynamically resizable L2 cache, a micro-operations queue, more internal register resources and an improved memory controller.

AMD estimated that these improvements would increase instructions per cycle (IPC) by up to 30% compared to the first-generation Bulldozer core, while maintaining Piledriver's high clock rates at decreased power consumption. The final result was a 9% single-threaded IPC improvement and an 18% multi-threaded IPC improvement over Piledriver.

Steamroller, the microarchitecture for CPUs, and Graphics Core Next, the microarchitecture for GPUs, are paired in the APU lines to support features specified in the Heterogeneous System Architecture.

History
In 2011, AMD announced a third-generation Bulldozer-based line of processors for 2013, with Next Generation Bulldozer as the working title, using the 28 nm manufacturing process.

On 21 September 2011, leaked AMD slides indicated that this third generation of Bulldozer core was codenamed Steamroller.

In January 2014, the first Kaveri APUs became available.

From May 2015 to March 2016, new APUs were launched as Kaveri-refresh (codenamed Godavari).

APU lines
In 2015 and 2016, new models with two to four enhanced Steamroller B cores were released as Kaveri-refresh / Godavari. The A10-7890K, the new top-of-the-line model, features an increased core frequency of 4.1 GHz and an 866 MHz GPU.
 * Kaveri A-series APU
   * Desktop budget and mainstream markets (FM2+): The Trinity / Richland APU line was replaced in January 2014 by the Kaveri APU line, as the third generation of the A10, A8, A6 and A4 series for the desktop market. The top-of-the-line model in 2014 was the quad-core A10-7850K APU, with a 3.7 GHz core frequency and 4 MB of L2 cache, incorporating a 720 MHz GPU with 512 stream processors and over 856 GFLOPS of total processing power.
   * Two or four CPU cores based on the Steamroller microarchitecture
   * Socket FM2+ only (Socket FM2 is not supported); support for PCIe 3.0
   * DDR3 dual-channel (2x64-bit) memory controller
   * AMD Heterogeneous System Architecture (HSA) 2.0
   * SIP blocks: Unified Video Decoder, Video Coding Engine, TrueAudio
   * Three to eight Compute Units (CUs) based on the revised GCN 2nd gen microarchitecture; 1 Compute Unit (CU) consists of 64 Unified Shader Processors : 4 Texture Mapping Units (TMUs) : 1 Render Output Unit (ROP)
   * AMD Eyefinity with up to 4 monitors, 4K Ultra HD support, DisplayPort 1.2 support
   * Select models support AMD Hybrid Graphics when paired with a Radeon R7 240 or R7 250 discrete graphics card.
   * Integrated custom ARM Cortex-A5 co-processor with TrustZone Security Extensions
 * Berlin APU - canceled
   * Announced by AMD in 2013, the Berlin APUs were targeted at the enterprise and server markets, featuring four Steamroller cores, up to 512 stream processors and support for ECC memory.
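
The Kaveri figures listed above (64 shader processors per Compute Unit, 512 stream processors at 720 MHz, four CPU cores at 3.7 GHz) can be related to the quoted total of roughly 856 GFLOPS with a short calculation. The following Python sketch is illustrative only: the per-cycle FLOP counts (2 per shader, 8 per CPU core) are assumptions not stated in the text, and the function names are hypothetical.

```python
# Figure taken from the list above.
SHADERS_PER_CU = 64

# Assumptions (not stated in the article): peak FLOPs issued per clock cycle.
GPU_FLOPS_PER_SHADER_PER_CYCLE = 2  # one fused multiply-add counted as 2 FLOPs
CPU_FLOPS_PER_CORE_PER_CYCLE = 8    # assumed single-precision peak per core


def shader_count(compute_units: int) -> int:
    """Number of shader processors for a given number of Compute Units."""
    return compute_units * SHADERS_PER_CU


def gpu_gflops(shaders: int, gpu_clock_mhz: float) -> float:
    """Theoretical peak GPU throughput in GFLOPS."""
    return shaders * GPU_FLOPS_PER_SHADER_PER_CYCLE * gpu_clock_mhz / 1000.0


def cpu_gflops(cores: int, cpu_clock_ghz: float) -> float:
    """Theoretical peak CPU throughput in GFLOPS."""
    return cores * CPU_FLOPS_PER_CORE_PER_CYCLE * cpu_clock_ghz


if __name__ == "__main__":
    # A10-7850K values as quoted in the list: 8 CUs, 720 MHz GPU, 4 cores at 3.7 GHz.
    shaders = shader_count(8)
    gpu = gpu_gflops(shaders, 720)
    cpu = cpu_gflops(4, 3.7)
    print(f"shaders: {shaders}")
    print(f"GPU peak: {gpu:.2f} GFLOPS")
    print(f"CPU peak: {cpu:.2f} GFLOPS")
    print(f"total:    {gpu + cpu:.2f} GFLOPS")
```

Under these assumptions, eight Compute Units give the 512 stream processors quoted for the A10-7850K, and the combined peak lands close to the quoted total. The same functions with three Compute Units give 192 shader processors for the smallest configuration.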

FX lines (discontinued)
In November 2013, AMD confirmed that it would not update the FX series in 2014: the Socket AM3+ version would not be refreshed, nor would the series receive a Steamroller version with a new socket.

AMD did, however, release a Kaveri-based FX-770K for desktop and an FX-7600P for mobile, which are essentially APUs with their integrated graphics disabled, similar to the Athlon X4 FM2+ line. These were released for OEMs only.

Server lines (canceled)
AMD's server roadmaps for 2014 showed:
 * Berlin APU - quad-core x86 Steamroller architecture (as described above) for 1 Processor (1P) compute and media clusters
 * Berlin CPU - quad-core x86 Steamroller architecture for 1P web and enterprise services clusters
 * Seattle CPU - 4/8 core AArch64 Cortex-A57 architecture (Opteron A1100) for 1P web and enterprise services clusters
 * Warsaw CPU - up to 16 core x86 Piledriver (2nd gen Bulldozer) architecture (Opteron 6338P and 6370P) for 2P/4P servers

However, plans for Steamroller Opteron products were canceled, likely due to the poor energy efficiency achieved in this generation of the Bulldozer architecture. Energy efficiency was greatly increased in the following generation, Excavator, which exceeded Jaguar in performance per watt and approximately doubled performance per watt over Steamroller (for example, 20.74 pt/W vs 10.85 pt/W when comparing similar mobile APUs using rough, arbitrary metrics).
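
The "approximately doubled" claim can be checked against the two quoted scores. The sketch below simply divides one figure by the other; the function name is hypothetical, and the inputs are the rough pt/W values from the text, not independent measurements.

```python
def perf_per_watt_ratio(new_pts_per_watt: float, old_pts_per_watt: float) -> float:
    """Ratio of two performance-per-watt scores (new generation over old)."""
    if old_pts_per_watt <= 0:
        raise ValueError("old_pts_per_watt must be positive")
    return new_pts_per_watt / old_pts_per_watt


if __name__ == "__main__":
    # Values quoted in the text: Excavator 20.74 pt/W, Steamroller 10.85 pt/W.
    ratio = perf_per_watt_ratio(20.74, 10.85)
    print(f"Excavator vs Steamroller: {ratio:.2f}x")
```

With the quoted figures the ratio comes to roughly 1.9, which is consistent with "approximately doubled".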