Nvidia A100 GPU is Official on air with 432 active tensor cores

So, the turn has come to fully official data on the new Nvidia Ampere GPU and products related to this GPU.

Nvidia has just published a press release and presentation videos covering the A100 GPU. Let's start with the name, because Nvidia itself has created some confusion here. The company everywhere refers to a product called the Nvidia A100 GPU or the Nvidia A100 Tensor Core GPU, and one might think the top-end Ampere GPU is simply called the A100; however, judging by various data and information from third-party sources, this is not so.



The Nvidia A100 GPU is not a graphics processor in itself. It is a complete compute accelerator, the kind of product that would previously have been called the Tesla A100, although Nvidia does not currently use that name. Either we are dealing with a rebranding, or the Nvidia A100 GPU will later appear on the Nvidia website in the corresponding section under the name Tesla A100.

In any case, when Nvidia talks about the Nvidia A100 GPU, it means a ready-made computing accelerator. The graphics processor at the heart of this accelerator is apparently still called the GA100.


We already knew that the GA100 contains an incredible 54 billion transistors and is produced on TSMC's 7-nanometer process. But there was conflicting information regarding the configuration. The full GA100 GPU contains 8192 CUDA cores, 52% more than the GV100. It also includes 512 third-generation tensor cores and six HBM2 memory stacks on a 6144-bit memory bus. Interestingly, the GV100 GPU had more tensor cores, but the Ampere architecture brings new third-generation cores that support single- and double-precision floating-point operations for the first time.
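The full-die figures are consistent with GA100's streaming multiprocessor layout. As a sanity check, here is a minimal sketch; the SM count (128) and per-SM core counts are taken from Nvidia's Ampere architecture materials, not from this press release, so treat them as assumptions.

```python
# Sanity-check the full GA100 configuration from its SM layout.
# SM count and per-SM figures are assumptions from Nvidia's
# architecture disclosure, not stated in the press release.
SMS = 128
CUDA_CORES_PER_SM = 64
TENSOR_CORES_PER_SM = 4
HBM2_STACKS = 6
BITS_PER_STACK = 1024  # each HBM2 stack has a 1024-bit interface

cuda_cores = SMS * CUDA_CORES_PER_SM        # 8192
tensor_cores = SMS * TENSOR_CORES_PER_SM    # 512
bus_width = HBM2_STACKS * BITS_PER_STACK    # 6144-bit

print(cuda_cores, tensor_cores, bus_width)  # 8192 512 6144
```

The numbers line up exactly with the configuration described above, which suggests the leaked core counts were derived the same way.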



At the same time, the A100 (or Tesla A100) accelerator uses a cut-down GA100 GPU with 6912 active CUDA cores and 432 active tensor cores, as well as five HBM2 stacks (40 GB) on a 5120-bit memory bus.

That is, in the case of the A100, about 15% of the GPU's cores are disabled, which is strange. Perhaps Nvidia is preparing something even more powerful, but the company usually shows its top-end solution right away. More likely, the GPU simply turned out to be so complex that the yield of fully functional dies is currently low, and Nvidia had to resort to cut-down GPUs even for its flagship product.
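The roughly 15% figure is easy to verify from the core counts, and notably the CUDA-core and tensor-core cuts are the same fraction, consistent with whole SMs being disabled:

```python
# Fraction of the full GA100 disabled in the A100 accelerator,
# using the full-die and active core counts quoted in the article.
FULL_CUDA, ACTIVE_CUDA = 8192, 6912
FULL_TENSOR, ACTIVE_TENSOR = 512, 432

disabled_cuda = 1 - ACTIVE_CUDA / FULL_CUDA        # 0.15625
disabled_tensor = 1 - ACTIVE_TENSOR / FULL_TENSOR  # 0.15625

print(f"{disabled_cuda:.1%} {disabled_tensor:.1%}")  # 15.6% 15.6%
```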


It is also worth noting that the performance of the A100 accelerator is 19.5 TFLOPS (FP32) or 9.7 TFLOPS (FP64), well short of double the Tesla V100's figures (15.7 and 7.8 TFLOPS, respectively). However, in workloads related to artificial intelligence, the new accelerator sometimes outperforms the old one by six to seven times, thanks to the new tensor cores, which Nvidia says deliver up to 20 times the throughput of the previous generation.
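The FP32 figure follows directly from the active core count and the clock speed. A quick back-of-the-envelope check, assuming a boost clock of about 1410 MHz (the press release does not state the clock, so this is an assumption) and two FLOPs per core per cycle (one fused multiply-add):

```python
# Peak FP32 throughput = cores x clock x 2 (one FMA = two FLOPs).
ACTIVE_CUDA_CORES = 6912
BOOST_CLOCK_HZ = 1.41e9        # ~1410 MHz, assumed boost clock
FLOPS_PER_CORE_PER_CYCLE = 2   # fused multiply-add

fp32_tflops = ACTIVE_CUDA_CORES * BOOST_CLOCK_HZ * FLOPS_PER_CORE_PER_CYCLE / 1e12
fp64_tflops = fp32_tflops / 2  # FP64 runs at half the FP32 rate on GA100

print(round(fp32_tflops, 1), round(fp64_tflops, 1))  # 19.5 9.7
```

Both results match the official 19.5 and 9.7 TFLOPS figures, which makes the assumed boost clock plausible.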



Nvidia also confirmed the A100 accelerator's unusual Multi-Instance GPU (MIG) feature. It allows the card's GPU to be partitioned into up to seven separate GPU instances, each of which handles its own task. Intermediate configurations are also possible, but the press release gives no details.


A couple of words should also be said about the DGX A100 station, priced at $200,000, which includes eight A100 accelerators. As it turned out, Nvidia chose AMD processors for its new stations rather than Intel, as before. More specifically, the DGX A100 contains two 64-core Epyc 7742 CPUs and 1 TB of RAM. Station performance reaches 5 PFLOPS in AI tasks and 10 POPS in INT8 format.
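The station-level numbers are simply eight accelerators in aggregate. A rough check, assuming per-card tensor throughput with structured sparsity (624 TFLOPS FP16 and 1248 TOPS INT8 per A100; these figures come from Nvidia's spec sheet, not this press release):

```python
# Aggregate DGX A100 throughput from eight A100 cards.
CARDS = 8
FP16_TENSOR_TFLOPS = 624   # per card, with structured sparsity (assumed)
INT8_TENSOR_TOPS = 1248    # per card, with structured sparsity (assumed)

ai_pflops = CARDS * FP16_TENSOR_TFLOPS / 1000   # ~5 PFLOPS
int8_pops = CARDS * INT8_TENSOR_TOPS / 1000     # ~10 POPS

print(ai_pflops, int8_pops)  # 4.992 9.984
```

Rounded up, these match the quoted 5 PFLOPS and 10 POPS, suggesting the marketing figures include the sparsity speedup.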

Source: Nvidia
