Template:NvidiaDgxAccelerators
Revision as of 18:08, 17 March 2023
Comparison of accelerators used in DGX:[1][2][3]
Accelerator | Architecture | Socket | FP32 CUDA Cores | FP64 Cores (excl. Tensor) | Mixed INT32/FP32 Cores | INT32 Cores | Boost Clock | Memory Clock | Memory Bus Width | Memory Bandwidth | VRAM | Single Precision (FP32) | Double Precision (FP64) | INT8 (non-Tensor) | INT8 Dense Tensor | INT32 | FP16 | FP16 Dense Tensor | bfloat16 Dense Tensor | TensorFloat-32 (TF32) Dense Tensor | FP64 Dense Tensor | Interconnect (NVLink) | GPU | L1 Cache Size | L2 Cache Size | TDP | GPU Die Size | Transistor Count | Manufacturing Process |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
H100 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/A | 1780 MHz | 4.8 Gbit/s HBM3 | 5120-bit | 3072 GB/s | 80 GB | 60 TFLOPs | 30 TFLOPs | N/A | 4000 TOPs | N/A | N/A | 2000 TFLOPs | 2000 TFLOPs | 1000 TFLOPs | 60 TFLOPs | 900 GB/s | GH100 | 25344 KB (192 KB × 132) | 51200 KB | 700 W | 814 mm² | 80B | TSMC 4 nm N4 |
A100 80GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 3.2 Gbit/s HBM2 | 5120-bit | 2039 GB/s | 80 GB | 19.5 TFLOPs | 9.7 TFLOPs | N/A | 624 TOPs | 19.5 TOPs | 78 TFLOPs | 312 TFLOPs | 312 TFLOPs | 156 TFLOPs | 19.5 TFLOPs | 600 GB/s | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm² | 54.2B | TSMC 7 nm N7 |
A100 40GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 2.4 Gbit/s HBM2 | 5120-bit | 1555 GB/s | 40 GB | 19.5 TFLOPs | 9.7 TFLOPs | N/A | 624 TOPs | 19.5 TOPs | 78 TFLOPs | 312 TFLOPs | 312 TFLOPs | 156 TFLOPs | 19.5 TFLOPs | 600 GB/s | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm² | 54.2B | TSMC 7 nm N7 |
V100 32GB | Volta | SXM3 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/s | 32 GB | 15.7 TFLOPs | 7.8 TFLOPs | 62 TOPs | N/A | 15.7 TOPs | 31.4 TFLOPs | 125 TFLOPs | N/A | N/A | N/A | 300 GB/s | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 350 W | 815 mm² | 21.1B | TSMC 12 nm FFN |
V100 16GB | Volta | SXM2 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/s | 16 GB | 15.7 TFLOPs | 7.8 TFLOPs | 62 TOPs | N/A | 15.7 TOPs | 31.4 TFLOPs | 125 TFLOPs | N/A | N/A | N/A | 300 GB/s | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 300 W | 815 mm² | 21.1B | TSMC 12 nm FFN |
P100 | Pascal | SXM/SXM2 | N/A | 1792 | 3584 | N/A | 1480 MHz | 1.4 Gbit/s HBM2 | 4096-bit | 720 GB/s | 16 GB | 10.6 TFLOPs | 5.3 TFLOPs | N/A | N/A | N/A | 21.2 TFLOPs | N/A | N/A | N/A | N/A | 160 GB/s | GP100 | 1344 KB (24 KB × 56) | 4096 KB | 300 W | 610 mm² | 15.3B | TSMC 16 nm FinFET+ |
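
The L1 Cache Size entries pair a total with a per-unit size and a unit count (for example 192 KB times 132 for the H100). The following is a minimal sketch of that arithmetic, plus the usual idealized estimate of peak memory bandwidth as per-pin data rate times bus width divided by 8. The function names are hypothetical, and the bandwidth formula is an assumption about how such figures are typically derived, not something the table states. For some rows it gives a slightly different number than the listed one (1.75 Gbit/s on a 4096-bit bus gives 896 GB/s against the listed 900 GB/s), presumably because the listed clocks are rounded.

```python
# Per-unit L1 size (KB) and unit count, copied from the table's L1 Cache Size column.
L1_PER_UNIT = {
    "H100": (192, 132),
    "A100": (192, 108),
    "V100": (128, 80),
    "P100": (24, 56),
}


def l1_total_kb(per_unit_kb, unit_count):
    """Total L1 cache in KB: per-unit size times number of units."""
    return per_unit_kb * unit_count


def peak_bandwidth_gb_s(pin_rate_gbit_s, bus_width_bits):
    """Idealized peak bandwidth in GB/s (assumed formula): rate * width / 8."""
    return pin_rate_gbit_s * bus_width_bits / 8


if __name__ == "__main__":
    for name, (per_unit_kb, unit_count) in L1_PER_UNIT.items():
        print(name, l1_total_kb(per_unit_kb, unit_count), "KB")
    # H100 row: 4.8 Gbit/s per pin on a 5120-bit bus.
    print("H100 bandwidth estimate:", round(peak_bandwidth_gb_s(4.8, 5120)), "GB/s")
```

The L1 totals computed this way match the four totals listed in the table.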
1. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech.
2. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.
3. "NVIDIA Tesla V100 tested: near unbelievable GPU power". TweakTown. September 17, 2017.