Nvidia Corp. (NVDA) is set to face a much tougher competitive environment in the white-hot market for server co-processors used to power artificial intelligence projects, as the likes of Intel Corp. (INTC), Advanced Micro Devices Inc. (AMD), Fujitsu and Alphabet Inc./Google (GOOGL) join the fray. But the ecosystem that the GPU giant has built in recent years, together with its big ongoing R&D investments, should allow it to remain a major player in this space.
It's a basic rule of economics that when a market sees a surge in demand that allows a small number of suppliers to amass huge profits, more suppliers will enter in hopes of getting a chunk of those profits. That's increasingly the case for the server accelerator cards used in AI projects, as a surge in AI-related investments by enterprises and cloud giants contributes to soaring sales of Nvidia's Tesla server GPUs.
Thanks partly to soaring AI-related demand, Nvidia's Datacenter product segment saw revenue rise 186% annually in the company's April quarter to $409 million, after rising 205% in the January quarter. Growth like that doesn't go unnoticed. Over the last 12 months, several other chipmakers and one cloud giant have either launched competing chips or announced plans to do so.
To understand why some of these rival products could be competitive with Tesla GPUs on a raw price/performance basis, it's important to understand what made Nvidia's chips so popular for AI workloads in the first place. Whereas server CPUs, like their PC and mobile counterparts, feature a small number of relatively powerful cores -- the most powerful chip in Intel's new Xeon Scalable server CPU line has 28 cores -- GPUs can feature thousands of smaller cores that work in parallel and have access to blazing-fast memory.
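To make that contrast concrete, here's a minimal sketch (not from Nvidia or this article) of how a GPU workload is typically expressed: one small function is launched across roughly a million lightweight threads, each handling a single array element, where a CPU would grind through the same array with a handful of cores. It assumes the Python numba package and an Nvidia GPU with a working CUDA driver.

```python
# Illustrative sketch: splitting element-wise work across many GPU threads.
# Assumes the numba package and an Nvidia GPU with a CUDA driver installed.
import numpy as np
from numba import cuda

@cuda.jit
def scale_and_add(x, y, out):
    i = cuda.grid(1)          # this thread's global index
    if i < out.size:          # guard against threads past the end of the array
        out[i] = 2.0 * x[i] + y[i]

n = 1000000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros(n, dtype=np.float32)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
scale_and_add[blocks, threads_per_block](x, y, out)   # ~1 million threads in flight
```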
That gives GPUs a big edge for projects involving a subset of AI known as deep learning. Deep learning involves training models that attempt to function much like the neurons in the human brain do in order to detect patterns in content such as voice, text and images. Like the human brain, the algorithms these models rely on get better at recognizing such patterns as they take in more content, and at applying what they've learned to future tasks. Once an algorithm has gotten good enough, it can be put to work on real-world content in an activity known as inference.
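The split between the two phases shows up in even the simplest model. The sketch below (a made-up NumPy example, not anything from Nvidia or the article) trains a tiny classifier with an iterative, compute-heavy loop, then performs inference with a single cheap pass over new data.

```python
# Toy example of training vs. inference (plain NumPy, purely illustrative).
import numpy as np

np.random.seed(0)
X = np.random.normal(size=(1000, 3))                    # training examples
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)  # labels to learn

# Training: hundreds of passes over the data to fit the weights.
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))     # current predictions
    w -= 0.1 * X.T @ (p - y) / len(y)    # gradient step

# Inference: one cheap calculation using the learned weights.
x_new = np.random.normal(size=3)
print(1.0 / (1.0 + np.exp(-x_new @ w)))  # probability the new example is positive
```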
Inference algorithms don't always require a ton of processing power. GPUs can do a good job of handling them, and Nvidia has certainly been trying to grow its exposure to this field, but a lot of server-side inference work is still done using Intel's Xeon CPUs. And Apple Inc. (AAPL), citing privacy concerns, prefers to run AI algorithms against user data directly on iOS devices.
However, training a deep learning model to create an algorithm that's good at making sense of the data it's shown -- for example, translating text or detecting stop signs and traffic lights for an autonomous driving system -- can be very computationally demanding. During training, thousands or even millions of artificial neurons, split into "layers" responsible for different tasks, communicate with neurons in other layers to gauge the likelihood that a particular judgment made about the data being analyzed (e.g., whether an image shows a stop sign) is accurate.
By using clusters of Tesla GPUs, each with thousands of cores, to split up the work of all these artificial neurons, AI researchers can train a deep learning model much faster than they could using server CPUs with fewer than 30 cores. It also helps that Nvidia's high-end Tesla GPUs are good at the kind of complex math that deep learning algorithms perform, and can provide a model with tons of memory bandwidth along with a high-speed chip-to-chip interconnect (known as NVLink) for communication.
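The sketch below shows the idea in miniature (plain NumPy standing in for a GPU cluster, with made-up data): a tiny two-layer network is trained by splitting each batch into shards whose gradients could in principle be computed concurrently, then averaging those gradients before updating the weights -- roughly how work gets spread across thousands of GPU cores or several interconnected GPUs.

```python
# Conceptual sketch only: data-parallel training of a tiny two-layer network.
# NumPy stands in for the GPUs; each "worker" shard could run on its own device.
import numpy as np

np.random.seed(0)
X = np.random.normal(size=(512, 20))               # input batch
Y = np.random.normal(size=(512, 1))                # regression targets

W1 = np.random.normal(scale=0.1, size=(20, 64))    # layer 1 weights
W2 = np.random.normal(scale=0.1, size=(64, 1))     # layer 2 weights

def grads(xb, yb, W1, W2):
    h = np.maximum(xb @ W1, 0.0)                   # layer 1: ReLU activations
    err = h @ W2 - yb                              # layer 2 output vs. target
    gW2 = h.T @ err / len(xb)
    gh = (err @ W2.T) * (h > 0)
    gW1 = xb.T @ gh / len(xb)
    return gW1, gW2

n_workers = 4
for step in range(100):
    # Split the batch across workers; each shard's gradients could be
    # computed at the same time on a separate GPU.
    shards = zip(np.array_split(X, n_workers), np.array_split(Y, n_workers))
    g1, g2 = zip(*(grads(xb, yb, W1, W2) for xb, yb in shards))
    W1 -= 0.05 * np.mean(g1, axis=0)               # average gradients, then update
    W2 -= 0.05 * np.mean(g2, axis=0)
```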
But this doesn't mean that GPUs are the only kind of processor well-suited for training a deep learning model. In theory, a chipmaker could develop an ASIC with thousands of cores optimized for handling deep learning algorithms, and one that can communicate with its memory and with other ASICs at high speeds.
Intel seems to have just that idea. With the help of the technology and talent it gained from last year's acquisition of startup Nervana Systems, the chip giant is prepping Lake Crest, a deep learning ASIC that it promises will have "unprecedented levels" of parallel processing abilities and "more raw computing power than today's state-of-the-art GPUs" (a clear reference to Nvidia). Lake Crest, due later this year, is also said to support 1 terabyte per second of memory bandwidth and to rely on an interconnect that's up to 20 times faster than standard PCI Express links.
Down the line, Intel also wants to launch Knights Crest, a product for its Xeon Phi co-processor line that integrates Nervana's technology. Xeon Phi chips are already often used for less complex AI projects, as well as for analytics and high-performance computing (HPC) workloads.
Google, meanwhile, is two months removed from launching a second-generation version of its Tensor Processing Unit (TPU). Whereas Google's first-gen TPU (launched in 2016) was only meant for inference, the second-gen version, which exists as a 4-chip module that can be clustered by the dozens, is also capable of training deep learning models.
Google claims that a TPU module can deliver a whopping 180 teraflops (TFLOPs) of performance, surpassing the 120 TFLOPs Nvidia's recently-launched Tesla V100 flagship server GPU can deliver for certain deep learning operations. However, it's only meant to be used with Google's TensorFlow software framework, which is just one of several popular deep learning frameworks. In addition, Google has no plans to directly sell the TPU: It only plans to use the module (along with Nvidia GPUs) for internal AI projects, and provide cloud infrastructure clients with access to them.
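To give a sense of what that tie-in means in practice, here's a minimal sketch written against the TensorFlow 1.x-era graph API (an illustration of the framework, not Google's code); actually running such a graph on a Cloud TPU rather than a CPU or GPU requires Google Cloud-specific setup that isn't shown here.

```python
# Minimal TensorFlow 1.x-style sketch (illustrative only). Models expressed
# through TensorFlow's graph API are what Google's TPUs are built to run;
# the TPU-specific cluster setup is not shown and would be needed in practice.
import numpy as np
import tensorflow as tf   # assumes a TensorFlow 1.x installation

x = tf.placeholder(tf.float32, shape=[None, 784])   # input images
w = tf.Variable(tf.zeros([784, 10]))                 # model weights
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, w) + b                         # one dense layer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(logits, feed_dict={x: np.zeros((1, 784), dtype=np.float32)})
    print(out.shape)   # (1, 10)
```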
AMD, for its part, recently launched its Radeon Instinct server GPU line. The most powerful GPU in the family, the MI25, isn't a bad first effort, but it trails the V100 in terms of raw performance -- especially when it comes to 64-bit (double-precision) calculations -- and also has less memory bandwidth. It looks like AMD is at least a year away from seriously challenging Nvidia's Tesla line.
And last week, Fujitsu unveiled plans to launch the Deep Learning Unit (DLU), an ASIC due in the company's fiscal 2018 (ends in March 2019). Details about the chip are limited for now, but Fujitsu promises superior power efficiency relative to rival chips and says the DLU will support a proprietary high-speed interconnect.
As Nvidia takes on all these new rivals, its considerable GPU R&D spending -- both for creating new GPU architectures that span many types of products, and for creating deep learning GPUs in particular -- helps its cause. In the case of the Tesla V100, based on Nvidia's new Volta GPU architecture, the company optimized the chip for AI projects by pairing 5,120 of its traditional CUDA GPU cores with 640 "tensor cores" dedicated to deep learning work. The PR announcing the chip features quotes from Google, Microsoft, Facebook and Amazon execs signaling their eagerness to use it.
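A back-of-the-envelope check shows how those 640 tensor cores get Nvidia to its headline figure; note that the per-core throughput and clock speed used below come from Nvidia's published V100 specs, not from this article.

```python
# Rough check on the ~120 TFLOPs claim (per-core throughput and clock speed
# are published V100 specs, not numbers taken from this article).
tensor_cores = 640
flops_per_core_per_clock = 128   # one 4x4x4 FP16 multiply-accumulate = 64 FMAs = 128 FLOPs
boost_clock_hz = 1.455e9         # approximate V100 boost clock
print(tensor_cores * flops_per_core_per_clock * boost_clock_hz / 1e12)  # ~119 TFLOPs
```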
But over the long run, Nvidia's biggest competitive advantage will likely be the AI ecosystem that has formed around Tesla GPUs thanks to the company's head start. Developers have grown accustomed to using Nvidia's CUDA GPU programming interface (API), as well as the various tools in the company's Deep Learning software development kit (SDK), which include its cuDNN software library for accelerating the performance of deep learning frameworks running on Tesla GPUs. It's common for deep learning job listings to mention experience with CUDA as a requirement.
Nvidia's ecosystem also extends to alliances with tech giants. In April, Nvidia and Facebook Inc. (FB) announced that they've worked to optimize Facebook's Caffe2 AI software framework for Tesla GPUs. And not long afterwards, Microsoft Corp. (MSFT) disclosed that the next version of its SQL Server database will be able to work with Tesla GPUs installed within a database server to handle deep learning jobs.
Throw in the fact that Nvidia and its rivals are battling for a rapidly expanding pie, and the company still seems well-positioned to profit as more and more companies get AI religion, even if the likes of Intel and Google wind up grabbing a piece of that pie.