Template:NvidiaDgxAccelerators

Comparison of accelerators used in DGX:

{|
|-
! Model !! Architecture !! Socket !! FP32 CUDA cores !! FP64 cores (excl. tensor) !! Mixed INT32/FP32 cores !! INT32 cores !! Boost clock !! Memory clock !! Memory bus width !! Memory bandwidth !! VRAM !! Single precision (FP32) !! Double precision (FP64) !! INT8 (non-tensor) !! INT8 dense tensor !! INT32 !! FP4 dense tensor !! FP16 !! FP16 dense tensor !! bfloat16 dense tensor !! TensorFloat-32 (TF32) dense tensor !! FP64 dense tensor !! Interconnect (NVLink) !! GPU !! L1 Cache !! L2 Cache !! TDP !! Die size !! Transistor count !! Process
|-
! B200
| Blackwell || N/A || N/A || N/A || N/A || N/A || N/A || 8 Gbit/s HBM3e || 8192-bit || 8 TB/sec || 192 GB HBM3e || N/A || N/A || N/A || 4.5 POPS || N/A || 9 PFLOPS || N/A || 2.25 PFLOPS || 2.25 PFLOPS || 1.2 PFLOPS || 40 TFLOPS || 1.8 TB/sec || GB100 || N/A || N/A || 1000 W || N/A || 208 B || TSMC 4NP
|-
! B100
| Blackwell || N/A || N/A || N/A || N/A || N/A || N/A || 8 Gbit/s HBM3e || 8192-bit || 8 TB/sec || 192 GB HBM3e || N/A || N/A || N/A || 3.5 POPS || N/A || 7 PFLOPS || N/A || 1.98 PFLOPS || 1.98 PFLOPS || 989 TFLOPS || 30 TFLOPS || 1.8 TB/sec || GB100 || N/A || N/A || 700 W || N/A || 208 B || TSMC 4NP
|-
! H200
| Hopper || SXM5 || 16896 || 4608 || 16896 || N/A || 1980 MHz || 6.3 Gbit/s HBM3e || 6144-bit || 4.8 TB/sec || 141 GB HBM3e || 67 TFLOPS || 34 TFLOPS || N/A || 1.98 POPS || N/A || N/A || N/A || 990 TFLOPS || 990 TFLOPS || 495 TFLOPS || 67 TFLOPS || 900 GB/sec || GH100 || 25344 KB (192 KB × 132) || 51200 KB || 1000 W || 814 mm² || 80 B || TSMC 4N
|-
! H100
| Hopper || SXM5 || 16896 || 4608 || 16896 || N/A || 1980 MHz || 5.2 Gbit/s HBM3 || 5120-bit || 3.35 TB/sec || 80 GB HBM3 || 67 TFLOPS || 34 TFLOPS || N/A || 1.98 POPS || N/A || N/A || N/A || 990 TFLOPS || 990 TFLOPS || 495 TFLOPS || 67 TFLOPS || 900 GB/sec || GH100 || 25344 KB (192 KB × 132) || 51200 KB || 700 W || 814 mm² || 80 B || TSMC 4N
|-
! A100 80GB
| Ampere || SXM4 || 6912 || 3456 || 6912 || N/A || 1410 MHz || 3.2 Gbit/s HBM2e || 5120-bit || 1.52 TB/sec || 80 GB HBM2e || 19.5 TFLOPS || 9.7 TFLOPS || N/A || 624 TOPS || 19.5 TOPS || N/A || 78 TFLOPS || 312 TFLOPS || 312 TFLOPS || 156 TFLOPS || 19.5 TFLOPS || 600 GB/sec || GA100 || 20736 KB (192 KB × 108) || 40960 KB || 400 W || 826 mm² || 54.2 B || TSMC N7
|-
! A100 40GB
| Ampere || SXM4 || 6912 || 3456 || 6912 || N/A || 1410 MHz || 2.4 Gbit/s HBM2 || 5120-bit || 1.52 TB/sec || 40 GB HBM2 || 19.5 TFLOPS || 9.7 TFLOPS || N/A || 624 TOPS || 19.5 TOPS || N/A || 78 TFLOPS || 312 TFLOPS || 312 TFLOPS || 156 TFLOPS || 19.5 TFLOPS || 600 GB/sec || GA100 || 20736 KB (192 KB × 108) || 40960 KB || 400 W || 826 mm² || 54.2 B || TSMC N7
|-
! V100 32GB
| Volta || SXM3 || 5120 || 2560 || N/A || 5120 || 1530 MHz || 1.75 Gbit/s HBM2 || 4096-bit || 900 GB/sec || 32 GB HBM2 || 15.7 TFLOPS || 7.8 TFLOPS || 62 TOPS || N/A || 15.7 TOPS || N/A || 31.4 TFLOPS || 125 TFLOPS || N/A || N/A || N/A || 300 GB/sec || GV100 || 10240 KB (128 KB × 80) || 6144 KB || 350 W || 815 mm² || 21.1 B || TSMC 12FFN
|-
! V100 16GB
| Volta || SXM2 || 5120 || 2560 || N/A || 5120 || 1530 MHz || 1.75 Gbit/s HBM2 || 4096-bit || 900 GB/sec || 16 GB HBM2 || 15.7 TFLOPS || 7.8 TFLOPS || 62 TOPS || N/A || 15.7 TOPS || N/A || 31.4 TFLOPS || 125 TFLOPS || N/A || N/A || N/A || 300 GB/sec || GV100 || 10240 KB (128 KB × 80) || 6144 KB || 300 W || 815 mm² || 21.1 B || TSMC 12FFN
|-
! P100
| Pascal || SXM/SXM2 || N/A || 1792 || 3584 || N/A || 1480 MHz || 1.4 Gbit/s HBM2 || 4096-bit || 720 GB/sec || 16 GB HBM2 || 10.6 TFLOPS || 5.3 TFLOPS || N/A || N/A || N/A || N/A || 21.2 TFLOPS || N/A || N/A || N/A || N/A || 160 GB/sec || GP100 || 1344 KB (24 KB × 56) || 4096 KB || 300 W || 610 mm² || 15.3 B || TSMC 16FF+
|}
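
Several columns in the comparison appear to follow from others by simple arithmetic. The sketch below is illustrative only: it assumes peak memory bandwidth is the per-pin data rate times the bus width divided by 8 bits per byte, and that peak FP32 throughput is one fused multiply-add (2 FLOPs) per core per cycle at the boost clock. The function names and the `rows` dictionary are hypothetical helpers, not part of any NVIDIA tool; the input figures are copied from the H100 and V100 32GB rows.

```python
def memory_bandwidth_gb_s(data_rate_gbit_s: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumption: per-pin data rate (Gbit/s) times bus width (bits),
    divided by 8 bits per byte.
    """
    return data_rate_gbit_s * bus_width_bits / 8


def peak_fp32_tflops(fp32_cores: int, boost_clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumption: each core retires one fused multiply-add
    (counted as 2 FLOPs) per cycle at the boost clock.
    """
    return 2 * fp32_cores * boost_clock_mhz * 1e6 / 1e12


# (memory data rate in Gbit/s, bus width in bits, FP32 cores, boost clock in MHz)
rows = {
    "H100": (5.2, 5120, 16896, 1980),
    "V100 32GB": (1.75, 4096, 5120, 1530),
}

for model, (rate, width, cores, clock) in rows.items():
    bandwidth = memory_bandwidth_gb_s(rate, width)
    fp32 = peak_fp32_tflops(cores, clock)
    print(f"{model}: {bandwidth:.0f} GB/s, {fp32:.1f} TFLOPS FP32")
```

For the H100 row this gives about 3.33 TB/s and about 66.9 TFLOPS, close to the listed 3.35 TB/sec and 67 TFLOPS; for the V100 32GB row it gives 896 GB/s and about 15.7 TFLOPS. The small gaps presumably reflect rounding of the listed clocks. These are theoretical peaks, not measured performance.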