Petascale computing

Petascale computing refers to computing systems capable of calculating at least 10^15 floating point operations per second (1 petaFLOPS). Petascale computing allowed faster processing of traditional supercomputer applications. The first system to reach this milestone was the IBM Roadrunner in 2008. Petascale supercomputers were succeeded by exascale computers.

Definition
Floating point operations per second (FLOPS) are one measure of computer performance. FLOPS can be recorded at different levels of precision; however, the standard measure (used by the TOP500 supercomputer list) counts 64-bit (double-precision floating-point format) operations per second using the High Performance LINPACK (HPLinpack) benchmark.

The metric typically refers to single computing systems, although it can also be used to measure distributed computing systems for comparison. Alternative precision measures based on the LINPACK benchmarks exist, but they are not part of the standard definition. HPLinpack has been recognised as a potentially poor general measure of supercomputer utility in real-world applications; it nevertheless remains the common standard for performance measurement.
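As a rough illustration of how a LINPACK-style figure is derived, the Python sketch below divides an approximate operation count by an elapsed time and compares the result with the petascale threshold. It assumes the textbook leading-order count of (2/3) n^3 operations for solving a dense n-by-n linear system, which is an approximation and not the exact HPLinpack formula. The function names, the problem size and the run time are hypothetical values chosen for illustration, not measurements of any real system.

```python
PETA = 10**15  # 1 petaFLOPS expressed in floating point operations per second


def dense_solve_flop_count(n: int) -> float:
    """Approximate operation count for solving a dense n-by-n system.

    Assumption: only the leading-order term (2/3) * n^3 is counted.
    """
    return (2.0 / 3.0) * n**3


def flops_rate(n: int, seconds: float) -> float:
    """Estimated floating point operations per second for one solve."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return dense_solve_flop_count(n) / seconds


def is_petascale(rate: float) -> bool:
    """True when the rate is at least 10^15 operations per second."""
    return rate >= PETA


if __name__ == "__main__":
    n = 2_000_000      # hypothetical matrix dimension
    seconds = 3600.0   # hypothetical run time of one hour
    rate = flops_rate(n, seconds)
    print(f"{rate / PETA:.3f} petaFLOPS, petascale: {is_petascale(rate)}")
```

Because the operation count grows with the cube of the problem size, doubling n at a fixed run time corresponds to roughly eight times the rate.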

History
The petaFLOPS barrier was first broken on 16 September 2007 by the distributed computing Folding@home project. The first single petascale system, the Roadrunner, entered operation in 2008. The Roadrunner, built by IBM, had a sustained performance of 1.026 petaFLOPS. The Jaguar became the second computer to break the petaFLOPS milestone, later in 2008, and reached a performance of 1.759 petaFLOPS after a 2009 update.

By 2018, Summit had become the world's most powerful supercomputer at 200 petaFLOPS; it was overtaken in June 2020 by Fugaku, which reached 415 petaFLOPS.

By 2024, Frontier was the world's most powerful supercomputer at 1,194 petaFLOPS (about 1.19 exaFLOPS) and the only exascale supercomputer in the world.
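The figures quoted in this section can be compared against the petascale (10^15) and exascale (10^18) thresholds with a few lines of arithmetic. The Python sketch below uses only the performance values given above. The dictionary and the `classify` function are hypothetical names introduced for illustration.

```python
PETA = 10**15  # petascale threshold in FLOPS
EXA = 10**18   # exascale threshold in FLOPS

# Performance figures in petaFLOPS, as quoted in this section.
SYSTEMS_PFLOPS = {
    "Roadrunner (2008)": 1.026,
    "Jaguar (after 2009 update)": 1.759,
    "Summit (2018)": 200.0,
    "Fugaku (June 2020)": 415.0,
    "Frontier (2024)": 1194.0,
}


def classify(pflops: float) -> str:
    """Label a performance figure, given in petaFLOPS, by scale."""
    flops = pflops * PETA
    if flops >= EXA:
        return "exascale"
    if flops >= PETA:
        return "petascale"
    return "below petascale"


if __name__ == "__main__":
    for name, pflops in SYSTEMS_PFLOPS.items():
        print(f"{name}: {pflops} petaFLOPS ({classify(pflops)})")
```

On these figures, only Frontier crosses the exascale threshold, since 1,194 petaFLOPS is 1.194 exaFLOPS.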

Artificial intelligence
Modern artificial intelligence (AI) systems require large amounts of computational power to train model parameters. OpenAI employed 25,000 NVIDIA A100 GPUs to train GPT-4, using 133 trillion floating point operations.