User:Jfaghihnassiri/Final Draft Page

The Teraflops Research Chip (also called Polaris) is an 80-core research processor developed by Intel Corporation’s Tera-Scale Computing Research Program. The processor was officially announced on February 11, 2007, and shown working at the 2007 International Solid-State Circuits Conference. Features of the processor include dual floating point engines, sleeping-core technology, self-correction, fixed-function cores, and three-dimensional memory stacking. The purpose of the chip is to explore the possibilities of tera-scale architecture (the design of processors with more than four cores) and to experiment with various forms of networking and communication within the next generation of processors.

=Features=

The processor consists of 80 individual cores on a single chip. These cores differ from those used in today’s mainstream dual-core or quad-core processors in that they are much simpler in design. The chip uses the same parts and ideas that went into today's generation of processors, rearranged in a way that defines the new tera-scale era of processor architecture and allows more than four cores to function on one chip.

Dual floating point engines
Each of the cores on board the Teraflops Research Chip contains two floating point engines. Unlike integer engines, floating point engines can work with fractional values, which makes them suited to the accurate calculations needed for graphics as well as financial and scientific modeling.

Energy efficiency
The tera-scale technology that allows so many cores to be integrated on one chip also allows for better load distribution and a decreased chance of overheating. If a core is overloaded, the heat it produces increases, which lowers efficiency and wastes energy. In the Teraflops Research Chip, the load on overloaded cores can be delegated to other cores, spreading the work so that less heat is created. The processor also introduces the notion of sleeping cores. To improve power efficiency and optimize the ratio between computing usage and power consumption, cores that are not in use or not needed go to sleep. In other words, they are neither powered nor operational other than to perform their communication duties.
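The load distribution and sleeping-core behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `distribute`, the abstract "load units", and the per-core `capacity` of 10 are hypothetical and do not come from Intel's design. The sketch wakes the fewest cores that can carry the load without any core exceeding its capacity, balances the load among them, and leaves the rest asleep.

```python
import math

def distribute(load_units, num_cores=80, capacity=10):
    """Wake only as many cores as the load needs and balance it among them.

    Returns (loads, sleeping): the load assigned to each core and the
    indices of the cores left asleep. Units and capacity are hypothetical.
    """
    if load_units < 0 or load_units > num_cores * capacity:
        raise ValueError("load does not fit on the chip")
    if load_units == 0:
        # Nothing to do: every core sleeps.
        return [0] * num_cores, list(range(num_cores))
    # Fewest cores that keep every core at or below its capacity.
    active = math.ceil(load_units / capacity)
    # Spread the load as evenly as possible over the active cores.
    base, extra = divmod(load_units, active)
    loads = [base + 1 if i < extra else base for i in range(active)]
    loads += [0] * (num_cores - active)
    sleeping = [i for i, load in enumerate(loads) if load == 0]
    return loads, sleeping

if __name__ == "__main__":
    loads, sleeping = distribute(25)
    print("busiest core load:", max(loads))
    print("sleeping cores:", len(sleeping))
```

A real scheduler would also weigh temperature and communication cost; the sketch only shows the trade-off between spreading load and keeping cores asleep.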

Core Communication
Along with 80 cores, the chip also contains 80 routers. Each core has a dedicated router which is responsible for that core's communication with all other cores and components of the processor. The router uses a five-port system, with one port going to each of the four surrounding cores and one going to the DRAM (the processor's local memory). The chip is laid out in an 8 core by 10 core format. Each of the 8 cores in any of the 10 rows, called nodes, can communicate directly with other cores within the same node. Communication between nodes and with other processor components is directed through a routing system. The on-die interconnect fabric which the cores use to communicate with each other is currently being researched. One option being considered is the ring topology, in which ring networks of various sizes are integrated within each other to connect the cores. A more flexible and more likely solution is the mesh topology, in which the cores are connected in a grid layout.
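The grid arrangement can be illustrated with a minimal Python sketch. It assumes an 8 by 10 mesh in which each router links to its immediate neighbours in the grid. The names `neighbors` and `hop_count`, and the use of Manhattan distance as the hop count, are illustrative assumptions rather than a description of Intel's actual routing scheme.

```python
COLS, ROWS = 8, 10  # 8 cores per row, 10 rows, 80 cores in total

def neighbors(col, row):
    """Return the grid coordinates of the cores adjacent to (col, row)."""
    candidates = [(col - 1, row), (col + 1, row), (col, row - 1), (col, row + 1)]
    # Cores on the edge of the die have fewer than four neighbours.
    return [(c, r) for c, r in candidates if 0 <= c < COLS and 0 <= r < ROWS]

def hop_count(src, dst):
    """Minimum number of router-to-router hops in a mesh (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

if __name__ == "__main__":
    print("corner core neighbours:", len(neighbors(0, 0)))
    print("interior core neighbours:", len(neighbors(3, 4)))
    print("hops corner to corner:", hop_count((0, 0), (7, 9)))
```

In this model an interior router uses four of its five ports for neighbouring cores, which matches the port count given above, while edge and corner routers leave some neighbour ports unused.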

Self correction
The processor supports a self-correction system. If a core is damaged, worn out, or otherwise unable to function, it can delegate all of its workload permanently to another core, with no need to edit the software interacting with the processor.
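One way such remapping could stay invisible to software is a table that translates the core numbers software sees into physical cores. The sketch below is a hypothetical illustration, not Intel's mechanism: the class `CoreMap`, its method names, and the pool of four spare cores are all assumptions made for the example.

```python
class CoreMap:
    """Hypothetical logical-to-physical core table with a pool of spares."""

    def __init__(self, num_physical=80, num_spare=4):
        # Software only ever sees logical core numbers 0..num_logical-1.
        self.num_logical = num_physical - num_spare
        self.mapping = {i: i for i in range(self.num_logical)}
        # Assumed spare cores, held back to replace failed ones.
        self.spares = list(range(self.num_logical, num_physical))

    def physical_for(self, logical_id):
        """Return the physical core currently backing a logical core."""
        return self.mapping[logical_id]

    def mark_failed(self, physical_id):
        """Permanently move the work of a failed physical core to a spare.

        Returns the affected logical core number, or None if the failed
        core was not backing any logical core.
        """
        for logical, physical in self.mapping.items():
            if physical == physical_id:
                if not self.spares:
                    raise RuntimeError("no spare cores left")
                self.mapping[logical] = self.spares.pop(0)
                return logical
        # The failed core was an unused spare: just retire it.
        if physical_id in self.spares:
            self.spares.remove(physical_id)
        return None

if __name__ == "__main__":
    cores = CoreMap()
    cores.mark_failed(5)
    print("logical core 5 now runs on physical core", cores.physical_for(5))
```

Software keeps addressing logical core 5 throughout; only the table entry changes, which is the sense in which no software edit is needed.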

Fixed function cores
With tera-scale technology, processors such as the Teraflops Research Chip can dedicate cores to certain functions. The number of dedicated cores and the functions they are dedicated to depend on how the processor is used. Functions include graphics, networking, security, and more.

Memory stacking
In the processor, the DRAM is stacked directly above the cores, with the fifth port of each router connecting to it. This kind of DRAM connection is relatively new compared with the older approaches of placing the memory next to the CPU on the die or embedding it within the die. The speed of the DRAM is one of the roadblocks to maximizing the capabilities of the processor.

=Research properties=

The idea behind making a research chip of this sort is that it allows companies to explore the possibilities of tera-scale computing. Instead of being forced to work in theory, engineers and researchers can work experimentally with the chip itself. The chip is also a wake-up call to other computer-related industries to advance their products to meet the needs of the new computing power. Through exploration of the 80-core processor, concepts such as the Larrabee processor, which has both the CPU and GPU on one chip, can be better understood and brought closer to reality. With the computing power of the Teraflops Research Chip, technologies such as graphics virtualization and visual recognition become much more feasible.

=Software issues=

As is often the case, the development of a new processor architecture raises the issue of software development. Software tends to lag behind hardware, especially in the case of multi-core chips. Intel aims to solve this problem with a new programming language called Ct, created especially for the 80-core processor. Intel also created a software development kit to accommodate visual recognition and multithreaded instructions. Intel and Microsoft are also jointly donating $20 million to support a new generation of programmers.

=Statistics=

=Dimensions=

The processor is constructed using a 65 nm CMOS process. The die measures 12.64 mm by 21.72 mm (274.5 mm²) and contains 100 million transistors. The package is connected through a 1248-pin LGA with 343 signal pins.
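The figures above can be cross-checked with a short calculation. The sketch below uses only the numbers given in this section plus the 80-core count. The variable names are arbitrary, and the per-core area is a plain average over the whole die rather than a measured tile size.

```python
# Figures taken from the text above.
DIE_WIDTH_MM = 12.64
DIE_HEIGHT_MM = 21.72
TRANSISTORS = 100_000_000
CORES = 80

# Die area from its two dimensions.
die_area_mm2 = DIE_WIDTH_MM * DIE_HEIGHT_MM

# Average transistor density over the whole die.
transistors_per_mm2 = TRANSISTORS / die_area_mm2

# Average die area per core, counting all shared logic as core area.
area_per_core_mm2 = die_area_mm2 / CORES

if __name__ == "__main__":
    print(f"die area: {die_area_mm2:.1f} mm^2")
    print(f"density: {transistors_per_mm2:,.0f} transistors per mm^2")
    print(f"average area per core: {area_per_core_mm2:.2f} mm^2")
```

The computed area rounds to the 274.5 mm² quoted above, so the stated dimensions and area are consistent with each other.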

=References=