FeiTeng

FeiTeng (飞腾, fēiténg) is a family of central processing units (CPUs) designed and produced in China for supercomputing applications. The microprocessors are developed by Tianjin Phytium Technology, and the family has also been described as YinHeFeiTeng (银河飞腾, YHFT). Development has been carried out by a team directed by Professor Xing Zuocheng of NUDT.

Initial designs
The first generation was binary compatible with the Intel Itanium 2. The second generation, the FT64, was a system on a chip combining a CPU with a 64-bit stream processor. FT64 chips were used as accelerators in YinHe (银河) supercomputers.

FeiTeng-1000
The FeiTeng-1000 is the third-generation CPU in the family. It is manufactured with 65 nm technology and contains 350 million transistors. Its clock frequency is 0.8–1 GHz. It is compatible with the SPARCv9 instruction set architecture.

Each chip contains 8 cores and is capable of executing 64 threads. There are 3 HyperTransport channels for coherent links, 4 DDR3 memory controllers and an 8x PCIe 2.0 link.
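As a quick check on these figures, the number of hardware threads per core follows from dividing the thread count by the core count. The sketch below assumes the threads are spread evenly across the cores, which the text does not state explicitly; the constant and function names are illustrative only.

```python
# Figures quoted in the text for the FeiTeng-1000.
FT1000_CORES = 8
FT1000_THREADS = 64


def threads_per_core(threads: int, cores: int) -> int:
    """Hardware threads per core, assuming an even split (an assumption)."""
    if cores <= 0 or threads % cores != 0:
        raise ValueError("threads must divide evenly across a positive core count")
    return threads // cores


if __name__ == "__main__":
    print(threads_per_core(FT1000_THREADS, FT1000_CORES))
```

Under that assumption, the quoted figures correspond to 8 threads per core.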

The Tianhe-1A supercomputer uses 2,048 FeiTeng-1000 processors, which operate its service nodes. In addition to the FT-1000 processors, Tianhe-1A employs 7,168 Nvidia Tesla M2050 GPUs and 14,336 Intel Xeon X5670 CPUs, and it has a theoretical peak performance of 4.701 petaflops.

A 2012 report for the European High Performance Computing service stated that FeiTeng used the work of the OpenSPARC project.

Galaxy FT-1500
The Tianhe-2 supercomputer uses 4,096 Galaxy FT-1500 processors. The FT-1500 is a 16-core processor based on the OpenSPARC architecture, with a TDP of 65 W. It is manufactured with 40 nm technology, and its cores run at 1.8 GHz. Peak performance is 115–144 GFLOPS; each core can execute up to 8 interleaved threads and supports 256-bit-wide SIMD vector operations, including fused multiply-add (FMA). The caches of this SoC run at 2 GHz: each core has a 16 KB L1 instruction cache, a 16 KB L1 data cache and a 512 KB L2 cache, and all cores share a 4 MB L3 cache. The L3 cache is divided into 4 segments of 1 MB each (1 segment per block of 4 CPU cores) and is 32-way set associative. Cache coherency is maintained by a directory-based protocol. The FT-1500 also has:

 * Links to connect several processors into a NUMA machine
 * 4 integrated DDR3 memory controllers
 * 2 PCI Express controllers
 * 10 Gbit Ethernet ports
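
The cache and SIMD figures quoted above for the FT-1500 can be cross-checked with some simple arithmetic. The following is a minimal sketch rather than a vendor-supplied model: the constant and function names are hypothetical, and the only inputs are the numbers given in the text. It does not attempt to reproduce the quoted peak GFLOPS range, since the text does not say how that figure is derived.

```python
# Figures quoted in the text for the FT-1500 (names are illustrative).
CORES = 16
L1I_KB_PER_CORE = 16
L1D_KB_PER_CORE = 16
L2_KB_PER_CORE = 512
L3_SEGMENTS = 4
L3_SEGMENT_MB = 1
SIMD_WIDTH_BITS = 256


def cache_summary() -> dict:
    """Aggregate cache capacities implied by the per-core figures."""
    return {
        "cores_per_l3_segment": CORES // L3_SEGMENTS,
        "total_l1_kb": CORES * (L1I_KB_PER_CORE + L1D_KB_PER_CORE),
        "total_l2_mb": CORES * L2_KB_PER_CORE // 1024,
        "total_l3_mb": L3_SEGMENTS * L3_SEGMENT_MB,
    }


def simd_lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one SIMD vector."""
    return SIMD_WIDTH_BITS // element_bits


if __name__ == "__main__":
    print(cache_summary())
    print("64-bit lanes:", simd_lanes(64))
    print("32-bit lanes:", simd_lanes(32))
```

The results agree with the description: 4 cores per L3 segment and 4 MB of L3 in total, alongside 8 MB of aggregate L2. A 256-bit vector holds four 64-bit or eight 32-bit elements.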

FT-1500A
The FT-1500A is an ARM64 SoC designed by Phytium. It includes 16 ARMv8 processor cores, a 32-lane PCIe host, two on-chip GMAC Ethernet controllers and a GICv3 interrupt controller with ITS support.

Future processors
In 2020, FeiTeng announced the availability of the S2500 processor and a roadmap for the following years.