HAL SPARC64

SPARC64 is a microprocessor developed by HAL Computer Systems and fabricated by Fujitsu. It implements the SPARC V9 instruction set architecture (ISA), the first microprocessor to do so. SPARC64 was HAL's first microprocessor and was the first in the SPARC64 brand. It operates at 101 and 118 MHz. The SPARC64 was used exclusively by Fujitsu in their systems; the first systems, the Fujitsu HALstation Model 330 and Model 350 workstations, were formally announced in September 1995 and were introduced in October 1995, two years late. It was succeeded by the SPARC64 II (previously known as the SPARC64+) in 1996.

Description
The SPARC64 is a superscalar microprocessor that issues four instructions per cycle and executes them out of order. It is a multichip design, consisting of seven dies: a CPU die, MMU die, four CACHE dies and a CLOCK die.

CPU die
The CPU die contains the majority of the logic, all of the execution units and a level 0 (L0) instruction cache. The execution units consist of two integer units, address units, floating-point units (FPUs) and memory units. The FPU hardware consists of a fused multiply-add (FMA) unit and a divide unit. However, the FMA instructions are truly fused (that is, performed with a single rounding) only as of the SPARC64 VI. The FMA unit is pipelined and has a four-cycle latency and a one-cycle throughput. The divide unit is not pipelined and has significantly longer latencies. The L0 instruction cache has a capacity of 4 KB, is direct-mapped and has a one-cycle latency.
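The difference between latency and throughput in a pipelined unit can be illustrated with a small timing model. The following is a minimal sketch, not a description of HAL's implementation: the function names are hypothetical, and it assumes an idealized pipeline with no stalls, using only the four-cycle latency and one-cycle throughput quoted above.

```python
def independent_fma_cycles(n_ops: int, latency: int = 4, interval: int = 1) -> int:
    """Cycles until the last of n independent FMAs completes.

    Assumes an ideal pipeline: a new operation can start every
    `interval` cycles and each one takes `latency` cycles to finish.
    """
    if n_ops <= 0:
        return 0
    return latency + (n_ops - 1) * interval


def dependent_fma_cycles(n_ops: int, latency: int = 4) -> int:
    """Cycles for a chain in which each FMA needs the previous result.

    No overlap is possible, so the full latency is paid every time.
    """
    return max(n_ops, 0) * latency


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, independent_fma_cycles(n), dependent_fma_cycles(n))
```

Under these assumptions, sixteen independent operations finish in 19 cycles, while a chain of sixteen dependent operations needs 64.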

The CPU die is connected to the CACHE and MMU dies by ten 64-bit buses. Four address buses carrying virtual addresses lead out to each cache die. Two data buses write data from the register file to the two CACHE dies that implement the data cache. Four buses, one from each CACHE die, deliver data or instructions to the CPU.

The CPU die contains 2.7 million transistors, measures 17.53 mm by 16.92 mm for an area of 297 mm2, and has 817 signal bumps and 1,695 power bumps.

MMU die
The MMU die contains the memory management unit, cache controller and the external interfaces. The SPARC64 has separate interfaces for memory and input/output (I/O). The bus used to access the memory is 128 bits wide. The system interface is the HAL I/O (HIO) bus, a 64-bit asynchronous bus. The MMU has a die area of 163 mm2.

Cache dies
Four dies implement the level 1 (L1) instruction and data caches, which require two dies each. Both caches have a capacity of 128 KB. The latency for both caches is three cycles, and the caches are four-way set associative. The data cache is protected by error correcting code (ECC) and parity. It uses a 128-byte line size. Each CACHE die implements 64 KB of the cache and a portion of the cache tags.
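The stated data cache parameters (128 KB, four-way set associative, 128-byte lines) determine how many lines and sets it holds. The sketch below works this out; the function name is hypothetical, and it assumes a conventional set-indexed organization with power-of-two sizes, which the text above does not spell out.

```python
def cache_geometry(capacity_bytes: int, line_bytes: int, ways: int) -> dict:
    """Derive line count, set count and address-field widths.

    Assumes power-of-two sizes, so bit widths can be taken from
    bit_length() without floating-point logarithms.
    """
    lines = capacity_bytes // line_bytes
    sets = lines // ways
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # selects a byte within a line
        "index_bits": sets.bit_length() - 1,         # selects a set
    }


if __name__ == "__main__":
    # Figures quoted for the SPARC64 L1 data cache.
    print(cache_geometry(128 * 1024, 128, 4))
```

With the quoted figures this gives 1,024 lines in 256 sets, with 7 offset bits and 8 index bits.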

Each CACHE die contains 4.3 million transistors and measures 14.0 mm by 10.11 mm for a die area of 142 mm2. It has 1,854 solder bumps, of which 446 are for signals and 1,408 are for power.
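The quoted die areas and bump totals can be cross-checked against the dimensions and per-type bump counts given for the CPU and CACHE dies. The following is a small arithmetic sketch; the dictionary layout and function name are illustrative only.

```python
# Figures as quoted in the text for the CPU and CACHE dies.
DIES = {
    "CPU":   {"w_mm": 17.53, "h_mm": 16.92, "signal": 817, "power": 1695},
    "CACHE": {"w_mm": 14.0,  "h_mm": 10.11, "signal": 446, "power": 1408},
}


def summarize(die: dict) -> tuple:
    """Return (area rounded to the nearest mm2, total bump count)."""
    area = die["w_mm"] * die["h_mm"]
    return round(area), die["signal"] + die["power"]


if __name__ == "__main__":
    for name, die in DIES.items():
        area, bumps = summarize(die)
        print(f"{name}: {area} mm2, {bumps} bumps")
```

The rounded areas come out to 297 mm2 and 142 mm2, matching the quoted values, and the CACHE die's signal and power bumps sum to the stated 1,854.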

Physical
The SPARC64 consisted of 21.9 million transistors. It was fabricated by Fujitsu in their CS-55 process, a 0.40 μm, four-layer metal complementary metal–oxide–semiconductor (CMOS) process. The seven dies are packaged in a rectangular ceramic multi-chip module (MCM), connected to the underside of the MCM with solder bumps. The MCM has 565 pins, of which 286 are signal pins and 218 are power pins, organized as a pin grid array (PGA). The MCM has wide buses which connect the seven dies.

SPARC64 II
The SPARC64 II (SPARC64+), a further development of the SPARC64, is the second-generation SPARC64 microprocessor. It operated at 141 and 161 MHz. It was used by Fujitsu in their HALstation Model 375 (141 MHz) and Model 385 (161 MHz) workstations, which were introduced in November 1996 and December 1996, respectively. The SPARC64 II was succeeded by the SPARC64 III in 1998.

The SPARC64 II has higher performance due to higher clock frequencies, enabled by the new process and circuit tweaks, and a higher instructions per cycle (IPC) count due to the following microarchitecture improvements:
 * The capacity of the level 0 (L0) instruction cache was doubled to 8 KB.
 * The number of physical registers was increased to 128 from 116 and the number of register files to five from four.
 * The number of branch history table entries was doubled to 2,048.

It was fabricated by Fujitsu in their CS-60 process, a 0.35 μm, five-layer metal CMOS process. The new process reduced the area of the dies, with the CPU die measuring 202 mm2, the MMU die 103 mm2, and the CACHE die 84 mm2.
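The reduction in die area can be compared with what a purely linear shrink from 0.40 μm to 0.35 μm would predict. The sketch below is a rough comparison only: it assumes area scales with the square of the feature size, which ignores layout changes, the extra metal layer and the added cache and registers, and the names used are illustrative.

```python
# (SPARC64 area, SPARC64 II area) in mm2, as quoted in the text.
AREAS = {"CPU": (297, 202), "MMU": (163, 103), "CACHE": (142, 84)}


def ideal_area_scale(old_um: float, new_um: float) -> float:
    """Area ratio expected if every dimension shrank with the feature size."""
    return (new_um / old_um) ** 2


def actual_area_scale(old_mm2: float, new_mm2: float) -> float:
    """Ratio of the new die area to the old one."""
    return new_mm2 / old_mm2


if __name__ == "__main__":
    print(f"ideal: {ideal_area_scale(0.40, 0.35):.3f}")
    for name, (old, new) in AREAS.items():
        print(f"{name}: {actual_area_scale(old, new):.3f}")
```

Under this simple model the ideal ratio is about 0.77, whereas the quoted areas give ratios of roughly 0.68, 0.63 and 0.59 for the CPU, MMU and CACHE dies, so the dies shrank by more than feature-size scaling alone would suggest.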

SPARC64 GP
The SPARC64 GP is a series of related microprocessors developed by HAL and Fujitsu and used in the Fujitsu GP7000F and PrimePower servers. The first SPARC64 GP was a further development of the SPARC64 II. It was a third-generation SPARC64 microprocessor and was known as the SPARC64 III before it was introduced. The SPARC64 GP operated at clock frequencies of 225, 250 and 275 MHz. It was the first microprocessor from HAL to support multiprocessing. The main competitors were the HP PA-8500, IBM POWER3 and Sun UltraSPARC II. The SPARC64 GP was taped out in July 1997. It was announced on 11 April 1998, and the 225 and 250 MHz versions were introduced in December 1998. A 275 MHz version was introduced in March 1999.

It was a single-die implementation of the SPARC64 II that integrated, with modifications, the CPU die and two of the four CACHE dies. Numerous modifications and improvements were made to the microarchitecture, such as the replacement of the MMU and a new system interface using the Ultra Port Architecture.

It had improved branch prediction, an extra pipeline stage to improve clock frequencies and a second FPU which could execute add and subtract instructions. An FPU with less functionality was added instead of a duplicate of the first to save die area; the second FPU is half the size of the first. The second FPU has a three-cycle latency for all instructions. The complex SPARC64 II memory management unit (MMU) was replaced with a simpler one that is compatible with the Solaris operating system. Previously, SPARC64 systems ran SPARC64/OS, a derivative of Solaris developed by HAL that supported the SPARC64.

The L1 caches were halved in capacity to 64 KB from 128 KB to reduce die area (which is why only two of the four CACHE dies from the SPARC64 II were integrated). The associated performance loss was mitigated by the provision of a large external L2 cache with a capacity of 1 to 16 MB. The L2 cache is accessed with a dedicated 128-bit data bus that operates at the same clock frequency as the microprocessor or at half of it. The L2 cache is inclusive, that is, it is a superset of the L1 caches. Both the L1 and L2 caches have their data protected by ECC and their tags by parity.
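The width and clock ratio of the L2 data bus give a rough upper bound on its bandwidth. The following sketch assumes one 128-bit transfer per bus clock and decimal megabytes; neither assumption is stated in the text, and the function name is hypothetical.

```python
BUS_WIDTH_BITS = 128  # L2 data bus width quoted in the text


def l2_peak_mb_per_s(cpu_mhz: float, clock_divisor: int) -> float:
    """Theoretical peak L2 bandwidth in MB/s (10**6 bytes per second).

    Assumption: exactly one full-width transfer per bus clock, with the
    bus running at cpu_mhz / clock_divisor (divisor 1 or 2 per the text).
    """
    bytes_per_transfer = BUS_WIDTH_BITS // 8
    return bytes_per_transfer * cpu_mhz / clock_divisor


if __name__ == "__main__":
    for divisor in (1, 2):
        print(divisor, l2_peak_mb_per_s(275, divisor))
```

For the 275 MHz part this bound is 4,400 MB/s with the bus at full speed and 2,200 MB/s at half speed; sustained figures would be lower.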

The SPARC64 II's proprietary system interface was replaced by one compatible with the Ultra Port Architecture. This enabled the SPARC64 III to use chipsets from Sun Microelectronics. The system bus operates at a half, a third, a quarter or a fifth of the frequency of the microprocessor, up to a maximum of 150 MHz.
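The permitted bus ratios translate into concrete system bus frequencies for each clock speed of the first SPARC64 GP. The sketch below simply tabulates them; the names are illustrative, and it assumes the 150 MHz ceiling acts as a plain upper limit on the divided clock.

```python
CPU_MHZ = (225, 250, 275)   # clock frequencies of the first SPARC64 GP
DIVISORS = (2, 3, 4, 5)     # half, third, quarter, fifth
MAX_BUS_MHZ = 150           # stated maximum system bus frequency


def bus_options(cpu_mhz: float) -> dict:
    """Map each allowed divisor to the resulting bus frequency in MHz,
    dropping any combination that would exceed the stated maximum."""
    return {
        d: cpu_mhz / d
        for d in DIVISORS
        if cpu_mhz / d <= MAX_BUS_MHZ
    }


if __name__ == "__main__":
    for mhz in CPU_MHZ:
        options = ", ".join(f"1/{d}: {f:.1f}" for d, f in bus_options(mhz).items())
        print(f"{mhz} MHz -> {options}")
```

At 275 MHz the fastest option is 137.5 MHz, so all four ratios stay within the limit for every clock speed listed.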

It contained 17.6 million transistors, of which 6 million are for logic and 11.6 million are contained in the caches and TLBs. The die has an area of 210 mm2. It was fabricated by Fujitsu in their CS-70 process, a 0.24 μm, five-layer metal, CMOS process. It is packaged in a 957-pad flip-chip land grid array (LGA) package with dimensions of 42.5 mm by 42.5 mm. Of the 957 pads, 552 are for signals and 405 are for power and ground.

The internal voltage is 2.5 V and the I/O voltage is 3.3 V. Peak power consumption is 60 W at 275 MHz. The Ultra Port Architecture (UPA) signals are compatible with 3.3 V Low Voltage Transistor Transistor Logic (LVTTL) levels, with the exception of the differential clock signals, which are compatible with 3.3 V pseudo emitter coupled logic (PECL) levels.

Later versions
The second and third SPARC64 GPs are fourth-generation SPARC64 microprocessors. The second SPARC64 GP was a further development of the first and operated at 400 to 563 MHz. The first versions, operating at 400 and 450 MHz, were introduced on 1 August 2000. It had larger L1 instruction and data caches, doubled in capacity to 128 KB each; better branch prediction as the result of a larger BHT consisting of 16,384 entries; support for the Visual Instruction Set (VIS); and an L2 cache built from double data rate (DDR) SRAM. It contained 30 million transistors and was fabricated by Fujitsu in their CS80 process, a 0.18 μm CMOS process with six levels of copper interconnect. It used a 1.8 V internal power supply and a 2.5 or 3.3 V power supply for I/O. It was packaged in a 1,206-contact ball grid array (BGA) measuring 37.5 mm by 37.5 mm. Of the 1,206 contacts, 552 are signals and 405 are power or ground.

The third SPARC64 GP was identical to the second in terms of microarchitecture. It operated at 600 to 810 MHz. The first versions were introduced in 2001, and 700, 788 and 810 MHz versions were introduced on 17 July 2002. It was fabricated by Fujitsu in their 0.15 μm CS85 process with six levels of copper interconnect. It used a 1.5 V internal power supply and a 1.8 or 2.5 V power supply for I/O.