We've been hearing rumblings about NVIDIA's next-generation Ampere architecture for the past two years, and the company is finally ready to talk about it. Though you won't hear any info today about the GeForce RTX 30 Series of consumer gaming graphics cards, NVIDIA is sharing details about Ampere for the machine learning, data center, and HPC markets. In short, this version of Ampere is the biggest, most powerful GPU NVIDIA has ever made, and the company claims it's also the world's biggest 7nm chip. There's no doubt about it, this thing is massive.
According to NVIDIA, its Ampere-based A100 GPU is already in full production and shipping to customers. Although the company isn't getting down to the nitty-gritty architectural details yet, it says the A100 represents its single largest generation-to-generation GPU performance uplift ever.
As for the specs and performance figures we do know, the A100 packs a whopping 54 billion transistors, which allows it to take the crown as the world's largest processor built on 7nm tech. Its third-generation Tensor Cores use the new TF32 precision format to deliver up to a 20x uplift in AI performance. And when it comes to FP64, the Tensor Cores can provide a 2.5x performance boost over the previous-generation Volta V100 in HPC applications.
Other Ampere-specific features include Multi-Instance GPU, aka MIG, which allows a single A100 to be partitioned into as many as seven discrete instances. That way, the raw power of the chip can be divvied up for specific tasks based on job requirements. Ampere also integrates a third-generation NVLink design, which doubles GPU-to-GPU interconnect performance for improved scaling.
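NVIDIA hasn't detailed the MIG management interface here, but conceptually the partitioning is a bin-packing problem: each A100 exposes up to seven instance slots, and jobs get assigned to whichever GPU still has room. Here's a hypothetical Python sketch of that idea (the function and constant names are ours, not NVIDIA's):

```python
# Illustrative model of MIG-style partitioning: each A100 can be split
# into at most seven independent GPU instances. All names here are
# hypothetical, not part of any NVIDIA API.

MAX_INSTANCES_PER_GPU = 7  # A100's MIG limit

def assign_jobs(num_gpus: int, job_sizes: list[int]) -> list[list[int]]:
    """Greedily pack jobs (each needing 1..7 instance slices) onto GPUs."""
    free = [MAX_INSTANCES_PER_GPU] * num_gpus   # remaining slices per GPU
    placement = [[] for _ in range(num_gpus)]   # job indices per GPU
    for job, size in enumerate(job_sizes):
        for i in range(num_gpus):
            if free[i] >= size:
                free[i] -= size
                placement[i].append(job)
                break
        else:
            raise RuntimeError(f"no capacity left for job {job}")
    return placement

# Two GPUs, three jobs: a 3-slice and a 4-slice job fill GPU 0,
# and the 7-slice job takes all of GPU 1.
print(assign_jobs(2, [3, 4, 7]))  # [[0, 1], [2]]
```

In practice the slices aren't just counters, of course; MIG's point is that each instance gets its own isolated compute and memory resources, so one tenant's job can't starve another's.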
“NVIDIA A100 GPU is a 20x AI performance leap and an end-to-end machine learning accelerator — from data analytics to training to inference,” said NVIDIA founder and CEO Jensen Huang. “For the first time, scale-up and scale-out workloads can be accelerated on one platform. NVIDIA A100 will simultaneously boost throughput and drive down the cost of data centers.”
Those A100 GPUs will also make their way into NVIDIA's third-generation DGX AI system, which delivers 5 petaflops of AI performance. Jensen gave us an early "taste" of the DGX A100 when he pulled a rig out of his oven on Tuesday. At the time, he declared it to be “the world’s largest graphics card”.
The DGX A100 has a total of eight A100 GPUs, along with 320GB of memory (good for 12.4TB per second of aggregate bandwidth). The system is also equipped with Mellanox HDR 200Gbps interconnects. As we mentioned before, each A100 GPU can support up to seven instances, which means that with eight GPUs onboard, the DGX A100 can host a grand total of 56 instances to diversify the workload.
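The per-system figures above are easy to sanity-check; a quick back-of-the-envelope calculation using the quoted numbers:

```python
# Back-of-the-envelope check of the DGX A100 figures quoted above.
GPUS_PER_DGX = 8
INSTANCES_PER_GPU = 7     # MIG limit per A100
TOTAL_HBM_GB = 320        # aggregate GPU memory per DGX A100

instances = GPUS_PER_DGX * INSTANCES_PER_GPU   # 56 isolated instances
hbm_per_gpu = TOTAL_HBM_GB / GPUS_PER_DGX      # 40 GB per A100
print(instances, hbm_per_gpu)
```

That 320GB total works out to 40GB of memory attached to each of the eight A100s.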
But NVIDIA is also thinking past just the DGX A100, and has announced the development of the DGX SuperPOD, which combines the power of 140 DGX A100 systems linked using the aforementioned Mellanox interconnects. Together, you're looking at 700 petaflops of AI computing power, which can be put to use on anything from medical research to COVID-19 analysis, as we've seen with the Folding@home project.
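The SuperPOD math checks out against the per-system number: 140 systems at 5 petaflops of AI performance apiece.

```python
# Sanity check of the DGX SuperPOD aggregate quoted above.
DGX_PER_SUPERPOD = 140
AI_PFLOPS_PER_DGX = 5      # peak AI performance of one DGX A100

total_pflops = DGX_PER_SUPERPOD * AI_PFLOPS_PER_DGX  # 700 petaflops
print(total_pflops)
```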
NVIDIA looks to be getting off to a strong start with Ampere and the A100, but we're of course looking forward to seeing what the company has in store for the enthusiast market. It's been over 18 months since NVIDIA launched its Turing architecture with the GeForce RTX 20 family, and expectations for the GeForce RTX 30 family are incredibly high.