Intel 8231/8232

The Intel 8231 and 8232 were early floating-point maths coprocessors (FPUs), marketed by Intel for use with its i8080 line of CPUs. They were licensed versions of AMD's Am9511 and Am9512 FPUs, from 1977 and 1979 respectively, which AMD claimed as the world's first single-chip FPU solutions.

Adoption
Whilst the i8231/i8232 (and their AMD-branded cousins) were primarily intended to partner the i8080 (or AMD's clone, the Am9080), their design offered multiple interface options, from simple wait-state insertion and status polling routines to interrupt- and DMA-controller-driven methods suitable for a peripheral processor or add-in board. With a small amount of glue logic, they were therefore usable in almost any microprocessor system that had a DMA subsystem or a spare interrupt input and interrupt vector available, and AMD's original documentation provided several different examples. This was a valuable feature for one of the first commercially available single-chip FPUs, greatly broadening its potential market, and was in stark contrast to Intel's succeeding, in-house designed 8087 (and other x87 family) FPUs, which were tightly bound to the x86 CPU line. The i8231A, for example, was used in the Applied Analytics MicroSPEED II and II+ accelerator cards for the 6502-based Apple II line, and examples were also given for the Z80, MC6800, i8085, and even the 16-bit Z8000. Additionally, prior to the introduction of the 8087, Intel's own preliminary datasheets suggested the chips as suitable companions for the then-new 8086.

Capacity
The Intel 8231 (and revised 8231A) was the Arithmetic Processing Unit (APU). It offered 32-bit "double" precision floating-point (a term later and more commonly used to describe 64-bit floating-point numbers, whilst 32-bit is considered "single" precision), and 16-bit or 32-bit ("single" or "double" precision) fixed-point calculation of 14 different arithmetic and trigonometric functions to a proprietary standard. Its algorithms were based on Chebyshev polynomials. The APU was available in a 4 MHz version for USD $235.00 and a 2 MHz version for USD $149.00, in quantities of 100 or more.

The later Intel 8232 was the Floating Point Processor Unit (FPU). It performed 32-bit or 64-bit (true single- and double-precision) floating-point calculations compliant with the (draft) IEEE-754 standard (as used by the i80387 and other later FPUs), but only for the four primary arithmetic functions (addition, subtraction, multiplication and division). The FPU was available in a 4 MHz version for USD $235.00 and a 2 MHz version for USD $149.00, in quantities of 100 or more.
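As a rough illustration of the Chebyshev polynomial approach mentioned for the APU, the following Python sketch fits a short Chebyshev series to a function on [-1, 1] and evaluates it with the Clenshaw recurrence. It is not the 8231's actual microcode: the target function sin(πx/2), the degree of 7, and the names `chebyshev_coefficients` and `chebyshev_eval` are assumptions chosen for illustration, and the arithmetic is done in the host's double precision rather than the chip's 32-bit format.

```python
import math


def chebyshev_coefficients(f, degree):
    """Return c_0..c_degree so that f(x) ~ sum(c_j * T_j(x)) on [-1, 1].

    The coefficients are obtained by sampling f at the Chebyshev nodes,
    which is a standard textbook construction (assumed here, not taken
    from the chip's documentation).
    """
    n = degree + 1
    # Chebyshev nodes: x_k = cos(pi * (k + 1/2) / n)
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    values = [f(x) for x in nodes]
    coeffs = []
    for j in range(n):
        s = sum(values[k] * math.cos(math.pi * j * (k + 0.5) / n)
                for k in range(n))
        coeffs.append(2.0 * s / n)
    coeffs[0] /= 2.0  # the constant term carries half weight
    return coeffs


def chebyshev_eval(coeffs, x):
    """Evaluate sum(c_j * T_j(x)) using the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + c, b1
    return x * b1 - b2 + coeffs[0]


def target(x):
    """Illustrative target function: sin(pi * x / 2) on [-1, 1]."""
    return math.sin(math.pi * x / 2.0)


coeffs = chebyshev_coefficients(target, 7)
# Largest absolute error over a grid of 201 sample points.
max_error = max(abs(chebyshev_eval(coeffs, i / 100.0) - target(i / 100.0))
                for i in range(-100, 101))
print("coefficients:", coeffs)
print("max abs error on grid:", max_error)
```

A series of this kind needs only a handful of multiplications and additions per evaluation, which is one reason polynomial approximations suit hardware with a fixed, small set of arithmetic primitives.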

All three chips used an 8-bit data bus design, in line with the i8080 and most other contemporary microprocessors. The 8231 could run at up to 3 MHz, and the 8231A and 8232 at up to 4 MHz (a slight improvement on the Am9512, which was limited to 3 MHz), either in sync with the CPU or (in the 8231A and 8232) asynchronously, depending on the degree of bus separation in the host system. Asynchronous operation was a useful addition to the feature set. It allowed, for example, a roughly 1 MHz Apple II system to be expanded with a 4 MHz 8231A and enjoy much faster numeric processing than it would otherwise have been limited to, or a 5 MHz i8085-based system to host an 8231A or 8232 without itself having to be slowed to 4 MHz or less to maintain compatibility. Along with the interrupt-driven peripheral design, it also allowed a degree of parallel processing between the CPU and FPU: the CPU could resume its own normal processing after passing commands and data to the essentially "offboard" coprocessor, only switching back to the floating-point subtask (to receive results and optionally issue further commands) when signalled by the coprocessor that processing was complete. This parallelisation was vital to improving overall system throughput, as some of the more complex functions could still take the FPU several milliseconds to complete, an eternity in computing terms.

Instruction execution times were highly variable and, as this was an early-generation design, typically much longer than those seen in later, more evolved FPUs. For example, ignoring data and stack handling instructions, execution times on the 8232 ranged from 56 clock periods for a single-precision (32-bit) subtraction to a full 4560 periods for a double-precision (64-bit) divide, for an effective processing speed (if clocked at 4 MHz) of 71429 and 877 FLOPS respectively. The 8231(A)'s instructions ranged from 17 periods for a 16-bit fixed-point addition, through 98 to 378 periods for common 32-bit float operations (heavily dependent not only on the function itself, but also on the actual magnitude of the operands and result, and even the number of "1" bits in each number), to as many as 12032 periods for a maximally complex "power" calculation. At 4 MHz this gives about 235.3k, 40.8k to 10.6k, and 332 operations per second respectively, depending on the instruction and data mix.

Whilst these numbers may seem low from a modern perspective, they compare reasonably well with the successor i8087, whose bigger advantages were a wider data bus and address range (allowing faster transfers in and out of a larger memory space), greater numeric precision, an expanded instruction and function set, and near-IEEE-754 compliance. They were also radically faster than performing the same calculations using software emulation on a regular CPU: even a relatively sophisticated, 16-bit 8086 running at a full 8 MHz could only achieve somewhere between a few dozen and around 1000 FLOPS without a coprocessor, and its slower-clocked, 8-bit predecessors and rivals would have fared even worse.
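The throughput figures quoted above follow from dividing the clock rate by the number of clock periods per instruction. The short Python sketch below reproduces that arithmetic, and also converts each cycle count to a latency in milliseconds. The cycle counts and the 4 MHz clock are taken from the text; the dictionary labels and the function names `ops_per_second` and `latency_ms` are illustrative assumptions.

```python
CLOCK_HZ = 4_000_000  # 4 MHz, the clock rate assumed in the text

# Clock periods per instruction, as quoted in the text.
CYCLES = {
    "8232 single-precision subtract": 56,
    "8232 double-precision divide": 4560,
    "8231 16-bit fixed-point add": 17,
    "8231 32-bit float op (low end)": 98,
    "8231 32-bit float op (high end)": 378,
    "8231 power (maximum)": 12032,
}


def ops_per_second(cycles, clock_hz=CLOCK_HZ):
    """Operations per second if instructions run back to back."""
    return clock_hz / cycles


def latency_ms(cycles, clock_hz=CLOCK_HZ):
    """Time for one instruction, in milliseconds."""
    return 1000.0 * cycles / clock_hz


for name, cycles in CYCLES.items():
    print(f"{name:34s} {cycles:6d} cycles "
          f"{ops_per_second(cycles):10.0f} ops/s "
          f"{latency_ms(cycles):7.3f} ms")
```

These are best-case rates that ignore the time spent moving operands and results over the 8-bit bus, so sustained throughput in a real system would be lower.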