AMD (AMD) confirmed a major supercomputer win on Wednesday. And along the way, it also confirmed some things about its server CPU and GPU plans.
On Wednesday afternoon, AMD announced that its CPUs and GPUs will power a Department of Energy supercomputer that's being built by HP Enterprise's (HPE) Cray unit and is set to become the world's fastest computing system when it's delivered in early 2023.
The supercomputer, known as El Capitan, will cost $600 million, be used for nuclear simulations and provide more than 2 exaflops (more than 2 million teraflops) of processing power. That makes it 10 times more powerful than today's fastest supercomputer (the DOE's 200-petaflop Summit system, which is powered by IBM (IBM) Power9 CPUs and Nvidia (NVDA) Tesla V100 GPUs).
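For readers who want to check the unit conversions behind those figures, here is a minimal Python sketch. The performance numbers are the ones quoted above (El Capitan's is a stated lower bound), while the variable names are purely illustrative.

```python
# FLOPS unit prefixes expressed as powers of ten.
TERA = 10**12
PETA = 10**15
EXA = 10**18

# Figures quoted in the article; El Capitan's is a lower bound ("more than 2 exaflops").
el_capitan_flops = 2 * EXA
summit_flops = 200 * PETA

# Convert El Capitan's figure to teraflops and compare it with Summit's.
el_capitan_teraflops = el_capitan_flops // TERA
speedup_over_summit = el_capitan_flops / summit_flops

print(f"El Capitan: at least {el_capitan_teraflops:,} teraflops")
print(f"Roughly {speedup_over_summit:.0f}x Summit's 200 petaflops")
```

Since 2 exaflops is a floor rather than a final specification, the 10x ratio should be read as a minimum.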
The announcement comes 10 months after AMD and Cray announced that the DOE's Frontier supercomputer, which is due at some point in 2021, will be powered by AMD CPUs and GPUs. Frontier will deliver 1.5-plus exaflops of performance, and will be used for a variety of scientific research.
But while El Capitan and Frontier have a few things in common, there's one key hardware difference: Frontier will rely heavily on custom AMD chips and technology, whereas El Capitan appears to generally rely on solutions that will be broadly made available to AMD customers.
AMD has said that Frontier will rely on a custom Epyc server CPU and a "purpose-built" Radeon Instinct server GPU. The company has also noted that a custom version of its Infinity Fabric interconnect technology, which today is used to connect multiple CPU sockets as well as the chips within individual CPU packages, would be used to (among other things) connect Frontier's Epyc CPUs and Radeon Instinct GPUs.
By contrast, AMD makes no mention of custom solutions in its El Capitan press release.
The CPUs going inside of El Capitan will be part of AMD's fourth-gen Epyc line, which is codenamed Genoa and will rely on AMD's Zen 4 CPU core microarchitecture. Genoa is expected to use a 5-nanometer (5nm) Taiwan Semiconductor (TSM) manufacturing process and is set to succeed AMD's third-gen Epyc line, codenamed Milan, which is expected to launch later this year and will rely on the Zen 3 microarchitecture and a 7nm Taiwan Semiconductor process.
El Capitan's GPUs will be "next generation" Radeon Instinct parts that (notably) will be "based on a new compute-optimized architecture for workloads including [high-performance computing] and AI." That suggests the architecture will be the successor to the Graphics Core Next (GCN) architecture currently used by Radeon Instinct GPUs.
As for Infinity Fabric, AMD says the third-gen version of its interconnect technology will be used to connect El Capitan's CPUs and GPUs, adding that the solution will support unified memory access between CPUs and GPUs. The second-gen version of Infinity Fabric rolled out last year, with the launch of AMD's well-received, second-gen Epyc CPU line, codenamed Rome.
El Capitan's apparent reliance on off-the-shelf solutions fits with comments made by AMD SVP Forrest Norrod in a November interview. At the time, Norrod indicated that "mass-market" solutions that deliver the kind of CPU/GPU pairing provided by Frontier will be made available shortly after Frontier is delivered.
Differences aside, El Capitan and Frontier are both good examples of Epyc's recent momentum in the high-performance computing (HPC) market. Along with cloud data centers, HPC deployments have been a particular strong point for Rome, thanks in part to its ability to scale up to 64 cores and its very strong performance when handling floating-point operations.
El Capitan and Frontier are also nice reference wins for AMD as it tries to chip away at Nvidia's dominant position in the market for GPU accelerators for supercomputers. The most recent Top500 supercomputer list includes dozens of systems featuring Nvidia GPUs, but none featuring AMD GPUs.
The El Capitan announcement comes ahead of AMD's analyst day event, which starts at 4 P.M. ET on Thursday. The event might yield some additional details about AMD's plans for Genoa and future Radeon Instinct GPUs.