The Most Powerful Compute Platform for Every Workload
The NVIDIA A100 Tensor Core GPU delivers unprecedented
acceleration at every scale to power the world's highest-performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. As the engine of
the NVIDIA data center platform, A100 provides up to 20X higher
performance over the prior NVIDIA Volta™ generation. A100 can
efficiently scale up or be partitioned into seven isolated GPU
instances with Multi-Instance GPU (MIG), providing a unified
platform that enables elastic data centers to dynamically adjust
to shifting workload demands.
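In practice, MIG partitioning is driven through the nvidia-smi command-line tool. The following is a minimal sketch, assuming root privileges, a MIG-capable driver, and an A100 80GB (whose smallest slice profile is 1g.10gb; on the 40GB model the equivalent is 1g.5gb); wrapping the calls in Python is just for illustration:

```python
import subprocess

def run(cmd: str) -> None:
    """Echo an nvidia-smi command, then execute it."""
    print("$", cmd)
    subprocess.run(cmd.split(), check=True)

# Enable MIG mode on GPU 0 (may require stopping workloads and resetting the GPU).
run("nvidia-smi -i 0 -mig 1")
# List the GPU instance profiles this driver and GPU support.
run("nvidia-smi mig -lgip")
# Carve the GPU into seven 1g.10gb instances and create a compute instance on each (-C).
run("nvidia-smi mig -cgi 1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb -C")
# The seven MIG devices now appear as separately addressable entries.
run("nvidia-smi -L")
```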
NVIDIA A100 Tensor Core technology supports a broad range
of math precisions, providing a single accelerator for every
workload. The latest generation A100 80GB doubles GPU memory
and debuts the world’s fastest memory bandwidth at 2 terabytes
per second (TB/s), speeding time to solution for the largest
models and most massive datasets.
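As a concrete illustration of this precision range, the sketch below uses PyTorch (a framework choice assumed here, not prescribed by the datasheet) to run the same matrix multiply on Tensor Cores in TF32, FP16, and BFLOAT16; on recent PyTorch releases, TF32 matmul is an explicit opt-in:

```python
import torch

# Opt in to TF32 Tensor Core math for matmuls and cuDNN convolutions.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

c_tf32 = a @ b                            # FP32 inputs, TF32 Tensor Core math
c_fp16 = a.half() @ b.half()              # FP16 Tensor Cores
c_bf16 = a.bfloat16() @ b.bfloat16()      # BFLOAT16 Tensor Cores
```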
A100 is part of the complete NVIDIA data center solution that
incorporates building blocks across hardware, networking,
software, libraries, and optimized AI models and applications
from the NVIDIA NGC™ catalog. Representing the most powerful
end-to-end AI and HPC platform for data centers, it allows
researchers to deliver real-world results and deploy solutions
into production at scale.
Applications
HPC & AI
HPC and AI go hand in hand. HPC provides the compute infrastructure, storage, and networking that lay the groundwork for training accurate, reliable AI models. The A100 also supports a wide range of math precisions, making it a single accelerator for both HPC and AI workloads.

Engineering & Sciences
Big data and computational simulation are common needs of engineers and scientists. Highly parallel processing, low latency, and high memory bandwidth create an environment that also lends itself to server virtualization.

Cloud
On-premises HPC continues to grow, yet cloud HPC is growing at a faster rate. By moving to the cloud, companies can quickly provision compute resources on demand and always run on the latest hardware.

NVIDIA A100 - Ingenious Technology
The NVIDIA A100 GPU is designed to deliver as much AI and HPC compute capability as possible through the new NVIDIA Ampere architecture and its optimizations. Built on TSMC's 7nm N7 FinFET process, A100 improves transistor density, performance, and power efficiency over the previous 12nm technology. With the new Multi-Instance GPU (MIG) capability of Ampere GPUs, A100 can create optimally virtualized GPU environments for cloud service providers.
- NVIDIA Ampere architecture: Whether partitioned into smaller instances with MIG or linked to multiple GPUs with NVLink to accelerate large-scale workloads, A100 easily handles acceleration needs of every size, from the smallest job to the largest multi-node workload. Its versatility means IT managers can maximize the utility of every GPU in their data center around the clock.
- Third-generation Tensor Cores: NVIDIA A100 delivers 312 teraFLOPS (TFLOPS) of deep learning performance: 20X the Tensor floating-point operations per second (FLOPS) for deep learning training and 20X the Tensor tera operations per second (TOPS) for deep learning inference of NVIDIA Volta GPUs.
- Next-generation NVLink: NVIDIA NVLink in A100 delivers 2X higher throughput than the previous generation. Combined with NVIDIA NVSwitch™, up to 16 A100 GPUs can be interconnected at up to 600 gigabytes per second (GB/s), unleashing the highest application performance possible on a single server. NVLink is available in A100 SXM GPUs via HGX A100 server boards and in PCIe GPUs via an NVLink Bridge for up to 2 GPUs.
- Multi-Instance GPU (MIG): An A100 GPU can be divided into as many as seven GPU instances, fully isolated at the hardware level with their own high-bandwidth memory, cache, and compute cores. MIG gives developers breakthrough acceleration for all of their applications, and IT administrators can offer right-sized GPU acceleration for every job, optimizing utilization and expanding access to every user and application.
- High-bandwidth memory (HBM2e): With up to 80GB of HBM2e, A100 delivers the world's fastest GPU memory bandwidth at over 2TB/s, along with 95% dynamic random-access memory (DRAM) utilization efficiency. A100 delivers 1.7X higher memory bandwidth than the previous generation.
- Structural sparsity: AI networks have millions to billions of parameters. Not all of these parameters are needed for accurate predictions; some can be converted to zero, making the models "sparse" without compromising accuracy. Tensor Cores in A100 provide up to 2X higher performance for sparse models. While the sparsity feature more readily benefits AI inference, it can also improve the performance of model training. A toy sketch of the 2:4 sparsity pattern follows this list.
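The 2:4 pattern behind structural sparsity keeps at most two nonzero values in every contiguous group of four weights. The toy PyTorch sketch below prunes a dense matrix to that pattern by magnitude; it illustrates the layout only and is not NVIDIA's pruning tooling (production flows use libraries such as cuSPARSELt or framework-level sparsity support):

```python
import torch

def prune_2_4(w: torch.Tensor) -> torch.Tensor:
    """Zero the two smallest-magnitude values in every group of four,
    yielding the 2:4 structured-sparse layout A100 Tensor Cores accelerate."""
    rows, cols = w.shape          # cols must be divisible by 4
    groups = w.reshape(rows, cols // 4, 4)
    keep = groups.abs().topk(2, dim=-1).indices        # 2 largest |values| per group
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, keep, True)
    return (groups * mask).reshape(rows, cols)

w = torch.randn(8, 16)
ws = prune_2_4(w)
# Every group of 4 now holds at most 2 nonzeros.
assert (ws.reshape(8, -1, 4) != 0).sum(dim=-1).max() <= 2
```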
Example server platforms:
- Server for NVIDIA HGX A100 8-GPU Solution
- Server for NVIDIA A100 PCIe GPU - 10 GPU System
High Performance
A100 achieves top-tier performance across a broad range of math precisions, and structural sparsity can double Tensor Core throughput in TF32, BFLOAT16, FP16, and INT8 (see the table below).
Scalability
Combining NVLink and NVSwitch with high-speed InfiniBand networking makes it possible to build large compute clusters that scale to thousands of A100 GPUs.
Fast Throughput
A100 achieves fast GPU-to-GPU and CPU-to-GPU communication using NVLink, NVSwitch, and InfiniBand, and delivers up to 2,039GB/s of GPU memory bandwidth. A rough way to check the memory-bandwidth figure is sketched below.
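Timing a large on-device copy gives an effective bandwidth number; the PyTorch sketch below is a minimal version (framework choice is an assumption, and a plain copy typically lands below the 2,039GB/s theoretical peak):

```python
import torch

n = 1 << 30                      # 1 GiB source buffer
iters = 10
src = torch.empty(n, dtype=torch.uint8, device="cuda")
dst = torch.empty_like(src)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
torch.cuda.synchronize()
start.record()
for _ in range(iters):
    dst.copy_(src)               # each copy reads n bytes and writes n bytes
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end)     # milliseconds
gbps = iters * 2 * n / (ms / 1e3) / 1e9
print(f"effective copy bandwidth: ~{gbps:.0f} GB/s")
```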
High Utilization
Multi-Instance GPU technology allows a single A100 80GB GPU to be partitioned into as many as seven instances, delivering consistent, predictable resource utilization.
| | A100 80GB PCIe | A100 40GB SXM | A100 80GB SXM |
| --- | --- | --- | --- |
| FP64 | 9.7 TFLOPS | 9.7 TFLOPS | 9.7 TFLOPS |
| FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS | 19.5 TFLOPS |
| FP32 | 19.5 TFLOPS | 19.5 TFLOPS | 19.5 TFLOPS |
| Tensor Float 32 (TF32) | 156 TFLOPS (312 TFLOPS*) | 156 TFLOPS (312 TFLOPS*) | 156 TFLOPS (312 TFLOPS*) |
| BFLOAT16 Tensor Core | 312 TFLOPS (624 TFLOPS*) | 312 TFLOPS (624 TFLOPS*) | 312 TFLOPS (624 TFLOPS*) |
| FP16 Tensor Core | 312 TFLOPS (624 TFLOPS*) | 312 TFLOPS (624 TFLOPS*) | 312 TFLOPS (624 TFLOPS*) |
| INT8 Tensor Core | 624 TOPS (1,248 TOPS*) | 624 TOPS (1,248 TOPS*) | 624 TOPS (1,248 TOPS*) |
| GPU Memory | 80GB HBM2e | 40GB HBM2 | 80GB HBM2e |
| GPU Memory Bandwidth | 1,935GB/s | 1,555GB/s | 2,039GB/s |
| Max Thermal Design Power (TDP) | 300W | 400W | 400W |
| Multi-Instance GPU | Up to 7 MIGs @ 10GB | Up to 7 MIGs @ 5GB | Up to 7 MIGs @ 10GB |
| Form Factor | PCIe | SXM | SXM |
| Interconnect | NVIDIA® NVLink® Bridge for 2 GPUs: 600GB/s**; PCIe Gen4: 64GB/s | NVLink: 600GB/s; PCIe Gen4: 64GB/s | NVLink: 600GB/s; PCIe Gen4: 64GB/s |
| Server Options | Partner and NVIDIA-Certified Systems™ with 1-8 GPUs | NVIDIA HGX™ A100 Partner and NVIDIA-Certified Systems with 4, 8, or 16 GPUs; NVIDIA DGX™ A100 with 8 GPUs | NVIDIA HGX™ A100 Partner and NVIDIA-Certified Systems with 4, 8, or 16 GPUs; NVIDIA DGX™ A100 with 8 GPUs |
* With sparsity
** SXM4 GPUs via HGX A100 server boards; PCIe GPUs via NVLink Bridge for up to 2 GPUs
NVIDIA-Certified Systems™
Complex AI workloads, including clusters, are becoming more common, and system integrators and IT staff must adapt quickly to changing technology. To improve system compatibility and confidence, NVIDIA has introduced the NVIDIA-Certified Systems program, which validates servers against both the hardware and the software in the NVIDIA NGC Catalog. Currently, NVIDIA-Certified Systems focus on the NVIDIA Ampere architecture and NVIDIA Mellanox network adapters, but the program will expand. Customers familiar with NVIDIA NGC Support Services can also use those services with NVIDIA-Certified Systems.

Bring Your Ideas Faster to Fruition
