Massively parallel processor array

A massively parallel processor array, also known as a multi purpose processor array (MPPA) is a type of integrated circuit which has a massively parallel array of hundreds or thousands of CPUs and RAM memories. These processors pass work to one another through a reconfigurable interconnect of channels. By harnessing a large number of processors working in parallel, an MPPA chip can accomplish more demanding tasks than conventional chips. MPPAs are based on a software parallel programming model for developing high-performance embedded system applications.

Architecture
MPPA is a MIMD (Multiple Instruction streams, Multiple Data) architecture, with distributed memory accessed locally, not shared globally. Each processor is strictly encapsulated, accessing only its own code and memory. Point-to-point communication between processors is directly realized in the configurable interconnect.

The MPPA's massive parallelism and its distributed memory MIMD architecture distinguishes it from multicore and manycore architectures, which have fewer processors and an SMP or other shared memory architecture, mainly intended for general-purpose computing. It's also distinguished from GPGPUs with SIMD architectures, used for HPC applications.

Programming
An MPPA application is developed by expressing it as a hierarchical block diagram or workflow, whose basic objects run in parallel, each on their own processor. Likewise, large data objects may be broken up and distributed into local memories with parallel access. Objects communicate over a parallel structure of dedicated channels. The objective is to maximize aggregate throughput while minimizing local latency, optimizing performance and efficiency. An MPPA's model of computation is similar to a Kahn process network or communicating sequential processes (CSP).

Applications
MPPAs are used in high-performance embedded systems and hardware acceleration of desktop computer and server applications, such as video compression, image processing, medical imaging, network processing, software-defined radio and other compute-intensive streaming media applications, which otherwise would use FPGA, DSP and/or ASIC chips.

Examples
MPPAs developed in companies include ones designed at: Ambric, PicoChip, Intel, IntellaSys, GreenArrays, ASOCS, Tilera, Kalray, Coherent Logix, Tabula, and Adapteva. Aspex (Ericsson) Linedancer differs in that it was a Massive wide SIMD Array rather than an MPPA. Strictly speaking it could qualify as SIMT due to all 4096 of the 3,000 gate cores having its own Content-Addressable Memory.

Fabricated MPPAs developed in universities include: 36-core and 167-core Asynchronous Array of Simple Processors (AsAP) arrays from the University of California, Davis, 16-core RAW from MIT, and 16-core and 24-core arrays from Fudan University.

The Chinese Sunway project developed their own 260-core SW26010 manycore chip for the TaihuLight supercomputer, which is as of 2016 the world's fastest supercomputer.

Anton 3 processors, designed by D. E. Shaw Research for molecular dynamics simulations, contain arrays of 576 processors arranged in a 12×24 tiled grid of pairs of cores; a routed network links these tiles together and extends off-chip to other nodes in a full system.