AMD (AMD) just unveiled a slew of new supercomputer, server and cloud computing deals for its latest server CPU line.
And in a talk with TheStreet about the announcements, the head of AMD's server CPU and GPU businesses signaled that the use of a new microarchitecture will help its next-gen server CPUs deliver sizable performance gains.
On Monday morning, as the SC19 supercomputing conference kicked off in Denver, AMD made a string of data center-related announcements. Among other things, the company:
- Announced several new supercomputer design wins for its second-gen Epyc server CPUs (codenamed Rome), which, thanks to both major performance gains and aggressive pricing, have been well-received since launching in August. These include deals with European government and research agencies, as well as with the San Diego Supercomputer Center.
- Announced that Amazon (AMZN) Web Services (AWS), which began supporting first-gen Epyc CPUs (codenamed Naples) last year, plans to launch four cloud computing instances that rely on Rome CPUs, and which are focused on computing-intensive workloads.
- Disclosed that a pair of previously-announced Microsoft Azure computing instances that are powered by Rome CPUs and optimized for high-performance computing (HPC) workloads are now available in preview mode.
- Unveiled an updated version of the open-source ROCm software platform for writing applications that can be accelerated by AMD's Radeon GPUs (ROCm 3.0). Among ROCm's new features are optimizations for using Epyc CPUs and Radeon Instinct server GPUs in tandem, and improvements to its ability to help developers port apps relying on the widely-adopted CUDA programming model for Nvidia (NVDA) GPUs to the HIP programming language supported by ROCm.
- Revealed several new Rome-powered servers from major OEMs. These include a pair of new HP Enterprise (HPE) servers that (unlike the Rome servers HPE unveiled in August) support the high-speed PCIe 4.0 interface for connecting to things such as GPUs, FPGAs, network adapters and SSDs.
The new AWS Rome instances are among "the highest-performing cloud computing instances available on the market," said Forrest Norrod, who as GM of AMD's Datacenter and Embedded Solutions Business Group oversees its server CPU and GPU operations. Separately, when asked about how the AWS instances would be priced relative to comparable instances relying on Intel (INTC) server CPUs, AMD spokesman Gary Silcott noted that to date, AMD-powered AWS instances have been consistently priced at a 10% discount to Intel-powered instances delivering similar performance and features.
Pairing Server CPUs and GPUs
Norrod said the new ROCm optimizations for using AMD's server CPUs and GPUs in tandem are a sign of where AMD is heading in terms of CPU/GPU pairing. The move comes five months after the Department of Energy announced it will be using AMD server CPUs and GPUs connected via the company's high-speed Infinity Fabric interconnect technology (currently used to connect chips within a CPU package) to power the world's most powerful supercomputer.
The supercomputer, known as Frontier and set to go live in 2021, will feature nodes that contain a custom Epyc CPU and four "purpose-built" Radeon Instinct GPUs. However, in line with comments made by CEO Lisa Su during an August interview with TheStreet, Norrod said AMD plans to offer solutions that provide a similar kind of CPU/GPU pairing via Infinity Fabric after Frontier launches.
"You will see systems and GPUs and CPUs from us shortly after that that are available on the mass-market...that do deliver that sort of capability," he said.
At the same time, Norrod promised AMD will also support open standards-based solutions for linking CPUs with GPUs and other accelerators -- for example, the CXL interconnect, which was originally developed by Intel. "We may want to add additional value, but we're never going to close our ecosystem," he said.
As AMD works on more tightly integrating its server CPUs and GPUs, Intel is prepping a server GPU relying on its next-gen, 7nm, manufacturing process node. On Sunday, the chip giant shared details about the GPU, which is codenamed Ponte Vecchio and expected to arrive in 2021.
AMD's Third-Gen Epyc CPUs (Milan)
Norrod also had some interesting comments about the performance gains that will be delivered by AMD's third-gen Epyc CPUs, which are codenamed Milan and expected to enter production around Q3 2020.
When asked how much of a gain in instructions processed per CPU clock cycle (IPC) Milan's CPU core microarchitecture, known as Zen 3, will deliver relative to the Zen 2 microarchitecture that Rome relies on, Norrod observed that -- unlike Zen 2, which was more of an evolution of the Zen microarchitecture that powers first-gen Epyc CPUs -- Zen 3 will be based on a completely new architecture.
Norrod did qualify his remarks by pointing out that Zen 2 delivered a bigger IPC gain than what's normal for an evolutionary upgrade -- AMD has said it's about 15% on average -- since it implemented some ideas that AMD originally had for Zen but had to leave on the cutting room floor. However, he also asserted that Zen 3 will deliver performance gains "right in line with what you would expect from an entirely new architecture."
Milan's performance should also benefit somewhat from moderately higher CPU clock speeds, thanks to its expected use of a more advanced version of Taiwan Semiconductor's (TSM) 7-nanometer (7nm) manufacturing process than the one used by Rome.
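To see how an IPC gain and a clock speed gain combine, here is a rough back-of-the-envelope sketch. It assumes single-thread throughput scales as IPC multiplied by clock frequency, which ignores memory, I/O and core-count effects. The 15% figure is the average Zen 2 IPC gain AMD has cited; the 5% clock uplift is a made-up number used only for illustration, not an AMD projection, and the function name is hypothetical.

```python
def relative_performance(ipc_gain: float, clock_gain: float) -> float:
    """Return the single-thread speedup from fractional IPC and clock gains.

    Assumes throughput scales as IPC x clock frequency, a simplification
    that ignores memory bandwidth, I/O and core-count effects.
    """
    return (1.0 + ipc_gain) * (1.0 + clock_gain)


# Average Zen 2 IPC gain that AMD has cited (per the article).
ZEN2_IPC_GAIN = 0.15
# Hypothetical clock uplift, chosen purely for illustration.
ASSUMED_CLOCK_GAIN = 0.05

speedup = relative_performance(ZEN2_IPC_GAIN, ASSUMED_CLOCK_GAIN)
print(f"Combined speedup: {speedup:.4f}x")
```

Because the two gains multiply rather than add, even a modest clock bump compounds on top of whatever IPC improvement a new microarchitecture delivers.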
Speaking in general about its performance expectations, Norrod said -- at a time when Intel is promising double-digit IPC gains for future microarchitectures -- AMD is "confident [in] being able to drive significant IPC gains each generation." He also indicated that AMD's server CPU launches are set to rely on the "tick-tock" cadence that was once the hallmark of Intel CPU launches, with the launch of a CPU platform that relies on a new manufacturing process node but the same microarchitecture as the last platform (the "tick") followed by a platform that relies on a new microarchitecture but the same manufacturing process node (the "tock").
In this context, Rome represents a tick, thanks to its use of a 7nm process that's much more advanced than the 14nm process used by Naples, while Milan represents a tock, since it will feature a new microarchitecture but rely on a 7nm process. And presumably, AMD's fourth-gen Epyc platform -- it's codenamed Genoa, due in 2021 and expected by many to rely on TSMC's next-gen, 5nm, process node -- will represent another tick.
Driving Core Counts Higher
Whereas the most powerful Naples CPUs have 32 cores, the most powerful Rome CPUs have twice as many. And Norrod indicated that over time, AMD wants to keep driving core counts higher as manufacturing processes improve.
"There's a number of application areas that just continue to benefit from increasing core counts and increasing compute density," Norrod said. However, he emphasized that AMD also wants to make sure that it takes "a balanced approach" to increasing things such as compute density, memory bandwidth and I/O connectivity, so that additional horsepower isn't left "stranded" due to a bottleneck elsewhere.
2.5D and 3D Packaging Technologies
Intel has been pushing the envelope lately when it comes to developing technologies for placing multiple chip dies within a single package. Among other things, the company has commercialized EMIB, a "2.5D" packaging solution in which chip dies placed side-by-side communicate through a high-speed "silicon bridge" placed underneath them, and (more recently) Foveros, a 3D packaging solution in which multiple logic chips can be stacked on top of each other. And in July, the company unveiled co-EMIB, a solution for connecting multiple sets of chips stacked using Foveros.
By comparison, the "chiplet" packaging approach used by AMD's latest desktop and server CPUs -- it leans on Infinity Fabric to connect a package's chip dies -- is a 2D solution. However, Norrod made it clear that AMD is exploring new 2.5D and 3D packaging approaches, albeit while qualifying that his comments shouldn't be seen as being about any particular future product on AMD's roadmap.
"You should expect we will continue to drive super hard on packaging technology," he said. Norrod also pointed out that AMD has long used 2.5D packaging to pair memory chips with its GPUs.
Backing Open Standards
On a few different occasions, Norrod stressed AMD's interest in backing open standards for data center solutions. In addition to noting AMD's support for ROCm and interconnect technologies such as CXL, he also said that when it comes to storage class memory (SCM) technologies that aim to strike a middle ground between DRAM and NAND flash memory in terms of speed, density and cost, AMD "has a lot of focus" on standardized solutions that can be offered by multiple memory suppliers.
"People want standards. They don't want to be locked into a proprietary solution. They don't want to be locked into proprietary sourcing," he said. The comments come as Intel and (more recently) Micron (MU) each commercialize products relying on the 3D XPoint SCM technology that they co-developed.
Likewise, when asked about potentially supporting an alternative to Intel's One API initiative -- it aims to give developers a unified programming model for leveraging the resources of CPUs, GPUs, FPGAs and ASICs, both from Intel and others -- Norrod said AMD is "very open" to a standards-based approach to offering something like that.
To some extent, the adoption of programming models that support a number of different processor types would represent a challenge to Nvidia, which has built a massive ecosystem for the CUDA programming model for its GPUs. In August, Nvidia CEO Jensen Huang expressed skepticism about such efforts, arguing that making them work is more challenging than it might initially look.
For his part, Norrod said that the arrival of interconnect technologies (such as CXL and Infinity Fabric) that enable memory coherence (the ability to guarantee that different processing elements in a system see a consistent view of shared data, even when copies of it are cached) between CPUs and accelerators makes the use of a common programming model for different processor types more compelling, and that AMD is willing to work with partners to offer it.
Strong Demand -- And Adequate Supplies -- for 48 and 64-Core CPUs
As Milan gets prepped, AMD is from all indications seeing strong demand for Rome. This particularly seems to be the case for CPUs with high core counts, with Su indicating on AMD's Q3 earnings call that Rome's sales mix was skewing towards 48 and 64-core CPUs.
Norrod said demand for Rome CPUs with high core counts has been "tremendous" and "broader than [AMD] may have originally anticipated." He added that demand for high-core-count parts generally falls within three areas -- supercomputer/HPC deployments, major cloud deployments and scale-out enterprise workloads such as virtualization farms and analytics workloads -- and said the "vast majority" of AMD's HPC-related Rome sales have involved 64-core CPUs.
And though TSMC has reportedly been seeing stretched lead times for its 7nm node amid strong orders from a number of clients, Norrod insisted that supply constraints aren't an issue for AMD.
"I think we are absolutely able to supply whatever the market may demand," he said, while adding that AMD also doesn't expect any "meaningful constraints" on supply long-term.