It says something about how much Nvidia's data center efforts have broadened that the company has hosted not one, but two events this year that were jam-packed with data center-related chip, hardware and software announcements.
Back in May, Nvidia unveiled the A100, a flagship server GPU based on its Ampere architecture. The company also showed off an A100-powered enterprise server (the DGX A100), new edge computing and embedded computing platforms (the EGX A100 and EGX Jetson Xavier NX), a software framework for creating AI-powered recommendation systems (Merlin) and a software platform that lets engineers and designers collaborate in simulated 3D environments (Omniverse).
This time around, Nvidia revealed an important expansion (and rebrand) of its Mellanox unit's data processing unit (DPU) lineup. It also showed off new GPUs for visual effects and data science pros (the A6000 and A40), software platforms for GPU-accelerated videoconferencing and AR/VR content-streaming (Maxine and CloudXR) and an off-the-shelf solution for creating supercomputers out of DGX A100 clusters.
Here are a few takeaways from Nvidia's latest announcements, which were made as the company kicked off the latest edition of its GPU Technology Conference (GTC).
1. Nvidia Is Thinking Big When It Comes to DPUs -- Literally and Figuratively
A DPU -- more commonly referred to in the industry as a smartNIC -- is a server networking card that has the processing power to offload network, security and/or storage processing functions from a server's CPUs, thus letting the CPUs focus on app workloads. Nvidia seems pretty interested in having DPUs take over an even larger number of functions.
A week ago, Nvidia and VMware (VMW) announced that Nvidia's DPUs will be able to handle many of the tasks performed by VMware's server, storage and networking virtualization platforms. VMware also went over plans to have DPUs run firewalls that are "automatically tuned to protect specific application services," and to use them to help companies jointly manage bare-metal servers and servers running VMware's ESXi server virtualization software.
This week, Nvidia further expanded on its security vision for DPUs, revealing a partnership with Check Point Software (CHKP) and outlining a security architecture in which DPUs (by offloading the running of security agents from server CPUs) can help prevent a security attack from spreading throughout a data center, should a server be compromised. It also revealed plans to have DPUs handle AI-powered functions such as security, network traffic and video analytics.
Nvidia's BlueField-2X DPU. Source: Nvidia.
Those AI-powered functions, notably, will be made possible by next-gen DPUs that feature (in addition to Arm CPU cores) integrated Nvidia GPUs. In 2021, Nvidia plans to begin shipping a more conventional DPU card known as the BlueField-2 -- it packs 8 Arm cores, supports 200Gb/s links and was unveiled by Mellanox in Aug. 2019 -- and a bigger product known as the BlueField-2X, which also includes an Ampere GPU.
In addition, Nvidia unveiled DOCA, a software development kit (SDK) for building DPU-accelerated apps, and it outlined an ambitious DPU roadmap. In 2022, Nvidia plans to launch the BlueField-3 and 3X, which will support 400Gb/s links and have 5 times as much CPU processing power as the BlueField-2 and 2X. And in 2023, it plans to launch the BlueField-4, which will both deliver major CPU and AI processing power gains and place a CPU and GPU on the same chip.
Nvidia does have some credible DPU/smartNIC competition. Intel (INTC), Xilinx (XLNX) and Broadcom (AVGO) are players in this space, and there are also some well-funded startups out there. And some public cloud providers, such as AWS and Microsoft Azure, have developed proprietary smartNICs.
Nvidia's BlueField roadmap. Source: Nvidia.
But Nvidia does seem to be creating a differentiated platform with the help of its GPU IP and massive developer ecosystem, and given how strong the value proposition is becoming for placing DPUs within a variety of servers, there should be plenty of room for multiple winners.
2. Nvidia Isn't Wasting Time When It Comes to Working More Closely with Arm
Though Nvidia inked its deal to buy Arm less than a month ago and predicted the deal would take about 18 months to close, Nvidia is already pushing ahead with plans to extend several of its GPU software platforms to systems powered by Arm CPUs.
These include Nvidia's extensive array of tools, runtimes and libraries for AI and high-performance computing (HPC) software developers, its RAPIDS software libraries for creating GPU-accelerated data science and analytics apps, and its gaming GPU drivers.
Nvidia's efforts to support Arm-based CPUs. Source: Nvidia.
Also, Nvidia says its engineers are working to support recently announced features for Arm's Neoverse server CPU platform, as well as to better support cloud gaming services relying on Arm servers. And the company promises its EGX edge computing platform will work with both Arm and Intel/AMD CPUs.
It's worth remembering here that Nvidia is looking to build end-to-end chip and software platforms with Arm that cover server CPUs, GPUs and DPUs. But even if the Arm deal is shot down by regulators, a lot of the Arm-related efforts announced this week could prove useful, as Arm's data center footprint gradually grows.
3. Nvidia's Pro Visualization GPU Refresh Is Strategically Telling in a Couple of Ways
As expected, Nvidia has unveiled Ampere-based GPUs for professionals who need a lot of horsepower for workloads such as 3D modeling/simulation, video rendering, game development and VR content creation. Also as expected, Nvidia promises the GPUs will deliver major performance gains relative to the pro visualization GPUs based on its Turing architecture that were launched in 2018 -- particularly for workloads that use deep learning algorithms and/or involve rendering scenes that feature ray-tracing.
What wasn't as telegraphed this time around is that Nvidia would not only launch a pro visualization GPU for desktop workstations (the A6000), but also one that's similar in many ways but was designed to go inside of servers (the A40). By contrast, the 2018 lineup features three GPUs that were all designed for workstations, though Nvidia did also create a server reference design for the GPUs.
Also a surprise: Nvidia ditched the Quadro branding that it has long used for its pro visualization GPUs, instead going with the A-series branding that it used for its A100 server GPU.
Between them, these two moves suggest Nvidia feels the center of gravity for GPU-accelerated visual effects and data science work is shifting at least in part from workstations to data centers, as public cloud platforms drive greater adoption of cloud render farms and virtual workstations.
One other thought: Given that Nvidia's most powerful Turing workstation GPU was known as the Quadro RTX 8000, it wouldn't be surprising to eventually see a more powerful Ampere workstation GPU named the A8000 (along with perhaps a similar server GPU) arrive.
4. Nvidia Is Taking a Growing Interest in Cloud Software
One doesn't normally associate the Nvidia brand with cloud software, but it's now putting a lot of work into writing code that's meant to run in cloud data centers -- whether its own or those of third parties.
Maxine, a new platform for adding AI-powered videoconferencing services that rely on Nvidia GPUs, is a good case in point. Nvidia claims that Maxine-powered video streams can use up to 90% less bandwidth than streams relying on the popular H.264 compression standard, and that Maxine can also upscale video resolutions, filter out background noises, power services such as real-time translations and closed captions, and adjust a face's rotation and gaze so that it looks as if they're aligned with a user's camera.
CloudXR, meanwhile, helps developers relying on AWS cloud computing instances that feature Nvidia GPUs stream content to AR and VR headsets. And a new offering called Fleet Command uses cloud software to help companies deploy/update software on remote edge servers based on Nvidia's EGX platform, as well as monitor the health of edge devices.
Of course, in each of these cases, Nvidia's main goal isn't to profit from software sales, but to drive greater adoption of its GPUs.
5. Demand for Nvidia's Latest Gaming GPUs Appears to Be Very Strong
Though GTC is largely focused on Nvidia's non-gaming efforts, CEO Jensen Huang did briefly talk during a press briefing about the initial reception seen for Nvidia's Ampere-based RTX 3080 and 3090 gaming GPUs, which launched last month.
Specifically -- with the $699 RTX 3080 and $1,499 RTX 3090 out of stock at major retailers and selling for large aftermarket premiums -- Huang declared that demand for the 3080 and 3090 "is much, much greater" than expected, and insisted the products "have a demand issue, not a supply issue." He also reported hearing from retailers that "they haven't seen a phenomenon like this in over a decade of computing," and forecast that 3080/3090 shortages will last until 2021.
Chalk all this up to both the large price/performance gains the GPUs (the 3080 especially) deliver relative to Turing gaming GPUs and a very strong demand backdrop for just about anything gaming-related.
All of this of course bodes well for holiday season sales for Nvidia's Ampere gaming lineup -- not just the 3080 and 3090, but also the $499 RTX 3070, whose launch has been delayed to Oct. 29 in an attempt to prevent shortages from getting out of hand. And indirectly, it also bodes well for sales of Microsoft's (MSFT) and Sony's (SNE) next-gen consoles, each of which is powered by AMD (AMD) processors and has seen strong pre-orders ahead of November launches.