AMD has announced two AI accelerators, the Instinct MI325X and MI355X, aimed at high-performance computing and AI workloads. They pair large HBM3e memory capacities (256 GB and 288 GB) with the CDNA 3 and CDNA 4 architectures, respectively.
Instinct MI325X: 256 GB of Memory and Impressive Performance
The announced MI325X accelerator comes with 256 GB of HBM3e memory and, per the specification table, 6.0 TB/s of memory bandwidth. The CDNA 3 architecture delivers up to 2.6 PFLOPs of FP8 compute and 1.3 PFLOPs of FP16 compute, which targets the most demanding deep learning tasks.
AMD claims the MI325X outperforms competing accelerators, citing gains of up to 40% over the NVIDIA H200 in certain workloads.
Servers equipped with multiple MI325X accelerators can provide even more impressive results, offering terabytes of memory and tens of petaflops of computing power.
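To make the "terabytes of memory and tens of petaflops" claim concrete, here is a minimal sketch that sums the per-accelerator figures quoted above across a server. The count of eight accelerators is an assumption for illustration (the article only says "multiple"), and the function name `server_totals` is hypothetical. The totals are peak figures and do not model interconnect or scaling losses.

```python
# Per-accelerator figures quoted in the article for the MI325X.
MI325X_MEMORY_GB = 256
MI325X_FP8_PFLOPS = 2.6
MI325X_FP16_PFLOPS = 1.3


def server_totals(num_accelerators: int) -> dict:
    """Sum peak per-accelerator figures across a server (no scaling losses modeled)."""
    return {
        "memory_tb": num_accelerators * MI325X_MEMORY_GB / 1024,
        "fp8_pflops": num_accelerators * MI325X_FP8_PFLOPS,
        "fp16_pflops": num_accelerators * MI325X_FP16_PFLOPS,
    }


# Assumption: eight accelerators per server; the article does not give a count.
totals = server_totals(8)
print(f"{totals['memory_tb']:.1f} TB memory, "
      f"{totals['fp8_pflops']:.1f} PFLOPs FP8, "
      f"{totals['fp16_pflops']:.1f} PFLOPs FP16")
# prints: 2.0 TB memory, 20.8 PFLOPs FP8, 10.4 PFLOPs FP16
```

Under that assumption, a single server reaches 2 TB of HBM3e and about 20 PFLOPs of peak FP8 compute, which matches the order of magnitude the article describes.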
MI355X: The Future is Here
Next year, AMD plans to release a more powerful accelerator, the MI355X. Based on the CDNA 4 architecture and built on a 3nm process, it will feature 288 GB of HBM3e memory. The specification table lists 4.6 PFLOPs of FP8 and 2.3 PFLOPs of FP16 compute for the CDNA 4 generation, roughly 1.8 times the MI325X figures.
Key Features of the MI355X:
- Massive memory capacity: 288 GB HBM3e for handling the largest AI models.
- New CDNA 4 architecture: Provides a substantial increase in performance and energy efficiency.
- Support for new data formats: FP4 and FP6, lower-precision formats that trade numerical precision for higher throughput and a smaller memory footprint.
- High memory bandwidth: Up to 8 TB/s per accelerator (64 TB/s in aggregate across eight accelerators) to speed up data transfer.
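The link between memory capacity and data format can be estimated with simple arithmetic. The sketch below computes how many parameters fit in 288 GB when only the raw weights are stored. It assumes decimal gigabytes (10^9 bytes) and ignores activations, KV cache, and any per-block scaling metadata that low-precision formats may need, so the numbers are upper bounds. The helper names and the 70-billion-parameter example model are hypothetical.

```python
# Bits per parameter for each format named in the article.
BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP6": 6, "FP4": 4}

MI355X_MEMORY_GB = 288  # capacity quoted in the article


def weights_gb(num_params: float, fmt: str) -> float:
    """Size of the raw weights in GB (10^9 bytes); nothing else is counted."""
    return num_params * BITS_PER_PARAM[fmt] / 8 / 1e9


def max_params_billions(fmt: str, memory_gb: float = MI355X_MEMORY_GB) -> float:
    """Upper bound on parameters (in billions) whose weights alone fit in memory."""
    return memory_gb * 8 / BITS_PER_PARAM[fmt]


for fmt in BITS_PER_PARAM:
    print(f"{fmt}: up to {max_params_billions(fmt):.0f}B parameters (weights only)")

# Hypothetical 70-billion-parameter model stored in FP4.
print(f"70B model in FP4: {weights_gb(70e9, 'FP4'):.0f} GB")
```

Halving the bits per parameter doubles the model size that fits in the same memory, which is the practical appeal of FP4 and FP6 for large models.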
ROCm 6.2: Software Advancements
Alongside hardware improvements, AMD continues to enhance its ROCm software. The new version 6.2 promises even higher performance in AI model training and inference tasks.
Conclusion
AMD is demonstrating impressive progress in high-performance computing, offering powerful AI accelerators that meet the needs of the most demanding researchers and developers. The Instinct MI325X and MI355X set new industry standards and open new opportunities for the advancement of artificial intelligence.
AMD Instinct AI Accelerators

Accelerator Name | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI325X | AMD Instinct MI300X | AMD Instinct MI250X |
---|---|---|---|---|---|
GPU Architecture | CDNA Next | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) |
GPU Process Node | TBD | 3nm | 5nm+6nm | 5nm+6nm | 6nm |
GPU Chiplets | TBD | 8 (MCM) | 8 (MCM) | 8 (MCM) | 2 (MCM) 1 (Per Die) |
GPU Cores | TBD | TBD | 19,456 | 19,456 | 14,080 |
GPU Clock Speed | TBD | TBD | 2100 MHz | 2100 MHz | 1700 MHz |
INT8 Compute | TBD | TBD | 2614 TOPS | 2614 TOPS | 383 TOPS |
FP6/FP4 Compute | TBD | 9.2 PFLOPs | N/A | N/A | N/A |
FP8 Compute | TBD | 4.6 PFLOPs | 2.6 PFLOPs | 2.6 PFLOPs | N/A |
FP16 Compute | TBD | 2.3 PFLOPs | 1.3 PFLOPs | 1.3 PFLOPs | 383 TFLOPs |
FP32 Compute | TBD | TBD | 163.4 TFLOPs | 163.4 TFLOPs | 95.7 TFLOPs |
FP64 Compute | TBD | TBD | 81.7 TFLOPs | 81.7 TFLOPs | 47.9 TFLOPs |
VRAM | TBD | 288 GB HBM3e | 256 GB HBM3e | 192 GB HBM3 | 128 GB HBM2e |
Infinity Cache | TBD | TBD | 256 MB | 256 MB | N/A |
Memory Clock | TBD | 8.0 Gbps? | 5.9 Gbps | 5.2 Gbps | 3.2 Gbps |
Memory Bus | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit |
Memory Bandwidth | TBD | 8 TB/s | 6.0 TB/s | 5.3 TB/s | 3.2 TB/s |
Form Factor | TBD | OAM | OAM | OAM | OAM |
Cooling | TBD | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
TDP (Max) | TBD | TBD | 1000W | 750W | 560W |
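The Memory Bandwidth row can be cross-checked against the Memory Clock and Memory Bus rows: peak bandwidth is the bus width in bits times the per-pin data rate, divided by 8 bits per byte. The following is a minimal sketch of that check using the table's values. The function name is hypothetical, and the MI350X clock is the figure the table itself marks as uncertain.

```python
def memory_bandwidth_tbps(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in TB/s: bus width (bits) x per-pin rate (Gbit/s) / 8 / 1000."""
    return bus_width_bits * pin_rate_gbps / 8 / 1000


# (bus width in bits, memory clock in Gbps) taken from the table rows above.
specs = {
    "MI350X": (8192, 8.0),  # the table marks this clock with a question mark
    "MI325X": (8192, 5.9),
    "MI300X": (8192, 5.2),
    "MI250X": (8192, 3.2),
}

for name, (bus, rate) in specs.items():
    print(f"{name}: {memory_bandwidth_tbps(bus, rate):.2f} TB/s")
```

The results come out at roughly 8.2, 6.0, 5.3, and 3.3 TB/s, which agrees with the table's bandwidth figures to within rounding (the table lists the MI250X at 3.2 TB/s).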