UltraSPARC III

The UltraSPARC III, code-named "Cheetah", is a microprocessor that implements the SPARC V9 instruction set architecture (ISA) developed by Sun Microsystems and fabricated by Texas Instruments. It was introduced in 2001 and operates at 600 to 900 MHz. It was succeeded by the UltraSPARC IV in 2004. Gary Lauterbach was the chief architect.

History
When presented at the '97 Microprocessor Forum, the probable introduction date for the UltraSPARC III was 1999, and it would have competed with Digital Equipment Corporation's Alpha 21264 and Intel's Itanium (Merced). This was not to be the case as it was delayed until 2001. Despite being late, it was awarded the Analysts' Choice Award for Best Server/Workstation Processor of 2001 by Microprocessor Report for its multiprocessing features.

Description
The UltraSPARC III is an in-order superscalar microprocessor. The UltraSPARC III was designed for shared memory multiprocessing performance, and it has several features that aid in achieving that goal: an integrated memory controller and a dedicated multiprocessing bus.

It fetches up to four instructions per cycle from the instruction cache. Decoded instructions are sent to a dispatch unit at up to six at a time. The dispatch unit issues the instructions to the appropriate execution units depending on operand and resource availability. The execution resources consisted of two arithmetic logic units (ALUs), a load and store unit and two floating-point units. One of the ALUs can only execute simple integer instructions and loads. The two floating point units are also not equal. One can only execute simple instructions such as adds while the other executes multiplies, divides and square roots.

Cache
The UltraSPARC III has split primary instruction and data caches. The instruction cache has a capacity of 32 KB. The data cache has a capacity of 64 KB and is four-way set-associative with a 32-byte cache line. The external L2 cache has a maximum capacity of 8 MB. It is accessed via a dedicated 256-bit bus operating at up 200 MHz for a peak bandwidth of 6.4 GB/s. The cache is built synchronous static random access memory clocked at frequencies up to 200 MHz. The L2 cache tags are located on-die to enable it be clocked at the microprocessor's clock frequency. This increases bandwidth for accessing the cache tags, enabling the UltraSPARC to scale to higher clock frequencies easily. Part of the increased bandwidth to the cache tags is used by cache coherency traffic, which is required in the multiprocessor systems the UltraSPARC III is designed to be used in. As the maximum capacity of L2 cache is 8 MB, the L2 cache tags is 90 KB in size.

External interface
The external interface consists of a 128-bit data bus and a 43-bit address bus operating at 150 MHz. The data bus is not used to access memory, but the memory of other microprocessors and the shared I/O devices.

Memory controller
The UltraSPARC has an integrated memory controller and implements a dedicated 128-bit bus operating at 150 MHz to access up to 4 GB of "local" memory. The integrated memory controller is used to reduce latency and thus improve performance, unlike some other UltraSPARC microprocessors that use the feature to reduce cost.

Physical
The UltraSPARC III consisted of 16 million transistors, of which 75% are contained in the caches and tags. It was initially fabricated by Texas Instruments in their C07a process, a complementary metal–oxide–semiconductor (CMOS) process with a 0.18 μm feature size and six-levels of aluminium interconnect. In 2001, it was fabricated in a 0.13 μm process with aluminium interconnects. This enabled it to operate at 750 to 900 MHz. The die is packaged using the Controlled Collapse Chip Connection method and is the first Sun microprocessor to do so. Unlike most other microprocessors bonded in such a way, the majority of the solder bumps are placed in a peripheral ring instead of being distributed across the die. It was packaged in a 1368-pad land grid array (LGA) package.

UltraSPARC III Cu
The UltraSPARC III Cu, code-named "Cheetah+", is a further development of the original UltraSPARC III that operated at higher clock frequencies of 1002 to 1200 MHz. It has a die size of 232 mm2 and was fabricated in a 0.13 μm, 7-layer copper metallization, CMOS process by Texas Instruments. It was packaged in a 1,368-pad ceramic LGA package.

UltraSPARC IIIi
The UltraSPARC IIIi, code named "Jalapeño", is a derivative of the UltraSPARC III for workstations and low-end (one to four processor) servers introduced in 2003. It operates at 1064 to 1593 MHz, has an on-die L2 cache and an integrated memory controller, and is capable of four-way multiprocessing with a glue-less system bus optimized for the function. It contains 87.5 million transistors and has a 178.5 mm2 die. It was fabricated by Texas Instruments in a 0.13 μm, seven-layer metal (copper) CMOS process with low-k dielectric.

The UltraSPARC IIIi has a unified 1 MB L2 cache that operates at half of the microprocessor's clock frequency. As such, it has a six-cycle latency and a two-cycle throughput. The load to use latency is 15 cycles. The tag store is protected by parity and the data by ECC. For every 64-byte cache line, there are 36 ECC bits, enabling the correction of one-bit errors and the detection of any error within a four bits. The cache is four-way set-associative, has a 64-byte line size and is physically indexed and tagged. It uses a 2.76 μm2 SRAM cell and consists of 63 million transistors.

The on-die memory controller supports 256 MB to 16 GB of 133 MHz DDR-I SDRAM. The memory is accessed via a 137-bit memory bus, of which 128 bits are for data and 9 are for ECC. The memory bus has a peak bandwidth of 4.2 GB/s. The microprocessor was designed to support four-way multiprocessing. Jbus is used to connect up to four microprocessors. It is a 128-bit address and data multiplexed bus that operates at one half or one third of the microprocessor's clock frequency.

UltraSPARC IIIi+
The UltraSPARC IIIi+, code-named "Serrano", was a further development of the UltraSPARC IIIi. It was scheduled for introduction in the second half of 2005, but was cancelled in the same year in favor of the UltraSPARC IV+, UltraSPARC T1 and UltraSPARC T2. Its cancellation was not known until 31 August 2006. Improvements were higher clock frequencies in the range of 2GHz, a larger (4MB) on-die L2 cache, support for DDR-333 SDRAM, and a new 90nm process.

Successors
The UltraSPARC III family of processors was succeeded by the UltraSPARC IV series.

The UltraSPARC IV combined two UltraSPARC III cores onto a single piece of silicon and offered increased clock rates. The CPU's packaging was nearly identical, offering the difference of a single pin, simplifying board manufacturing and system design. Some systems which used UltraSPARC III processors could accept UltraSPARC IV CPU board upgrades.