Nvidia stakes its claim in deep learning by making its GPUs easier to program

GPU maker Nvidia has been riding a wave of renewed relevance lately as the popularity of deep learning continues to grow. Over the weekend, the company tried to capitalize even more on the craze by releasing a library called cuDNN that can be integrated directly into popular deep learning frameworks. Nvidia promises cuDNN will help users focus more on building deep neural networks and less on squeezing performance out of the underlying hardware.

Deep learning has become very popular among large web companies, researchers and even numerous startups as a way to improve current artificial intelligence capabilities, specifically in fields such as computer vision, text analysis and speech recognition. Many of the popular approaches — especially in computer vision — run on graphics processing units (GPUs), each of which can contain thousands of cores, in order to speed up the compute-intensive algorithms without requiring racks full of standard CPUs.
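
To see why that parallelism matters, here is a minimal CUDA sketch (the kernel name scale_add and the sizes are purely illustrative, not anything Nvidia ships). A single kernel launch spreads a simple element-wise operation across roughly a million lightweight threads, which the GPU schedules over its thousands of cores; a CPU with a handful of cores has to chew through the same work in far longer strides.

// Illustrative CUDA sketch: one kernel launch, about a million threads.
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void scale_add(const float *x, float *y, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void) {
    const int n = 1 << 20;  // roughly a million elements (illustrative size)
    float *x, *y;
    cudaMalloc((void **)&x, n * sizeof(float));
    cudaMalloc((void **)&y, n * sizeof(float));
    cudaMemset(x, 0, n * sizeof(float));
    cudaMemset(y, 0, n * sizeof(float));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // thousands of blocks in flight
    scale_add<<<blocks, threads>>>(x, y, 2.0f, n);
    cudaDeviceSynchronize();

    printf("launched %d blocks of %d threads\n", blocks, threads);
    cudaFree(x);
    cudaFree(y);
    return 0;
}

Deep learning workloads such as convolutions and large matrix multiplications have exactly this shape, which is why they map so well onto GPUs.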

Nvidia said cuDNN, which is built on the company’s CUDA parallel-computing platform, can be integrated into several deep learning frameworks in a way that’s invisible to the people building the models. An Nvidia spokesperson responded to my request for more information with the following explanation:

“We worked closely with the major machine learning frameworks, like Caffe, Theano and Torch7, to ensure they could quickly and seamlessly take advantage of the power of the GPU while leaving room for further innovation. For instance, the cuDNN integration in Caffe is invisible to the end user, requiring a simple configuration change to enable this performance. These are the key elements of the ‘drop-in’ design.

“On a more technical level, cuDNN is a low level library that can be called from host-code without writing any CUDA code, much like our existing CUDA cuBLAS and cuFFT libraries. With cuDNN, we’re doing the work of optimizing the low-level routines used in these deep learning systems (e.g., convolutions) so that the people developing those systems need not spend their time doing so. Instead, they can focus their attention on the higher-level machine learning questions and advance the state of the art, while relying on us to make their code run faster with GPU accelerators.”
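
To make the ‘drop-in’ point concrete, here is a minimal sketch of calling cuDNN from host code, written against roughly the cuDNN v7-era C API (descriptor and function signatures have shifted between releases, so treat the exact calls as illustrative rather than canonical, and the tensor shapes are made up). It sets up tensor, filter and convolution descriptors and runs a single convolution forward pass, the kind of low-level routine Nvidia says it will keep optimizing so framework authors don’t have to; in Caffe, per the spokesperson, enabling this is just a configuration change rather than code like this.

/* Illustrative cuDNN host-code sketch: a single 2-D convolution forward pass. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cudnn.h>

#define CHECK_CUDNN(call)                                                   \
    do {                                                                    \
        cudnnStatus_t s = (call);                                           \
        if (s != CUDNN_STATUS_SUCCESS) {                                    \
            fprintf(stderr, "cuDNN error: %s\n", cudnnGetErrorString(s));   \
            exit(EXIT_FAILURE);                                             \
        }                                                                   \
    } while (0)

int main(void) {
    cudnnHandle_t handle;
    CHECK_CUDNN(cudnnCreate(&handle));

    /* Input: 1 image, 3 channels, 32x32 pixels (NCHW layout, float). */
    cudnnTensorDescriptor_t xDesc;
    CHECK_CUDNN(cudnnCreateTensorDescriptor(&xDesc));
    CHECK_CUDNN(cudnnSetTensor4dDescriptor(xDesc, CUDNN_TENSOR_NCHW,
                                           CUDNN_DATA_FLOAT, 1, 3, 32, 32));

    /* Filters: 8 output channels, 3 input channels, 3x3 kernels. */
    cudnnFilterDescriptor_t wDesc;
    CHECK_CUDNN(cudnnCreateFilterDescriptor(&wDesc));
    CHECK_CUDNN(cudnnSetFilter4dDescriptor(wDesc, CUDNN_DATA_FLOAT,
                                           CUDNN_TENSOR_NCHW, 8, 3, 3, 3));

    /* Convolution: padding 1, stride 1, dilation 1. */
    cudnnConvolutionDescriptor_t convDesc;
    CHECK_CUDNN(cudnnCreateConvolutionDescriptor(&convDesc));
    CHECK_CUDNN(cudnnSetConvolution2dDescriptor(convDesc, 1, 1, 1, 1, 1, 1,
                                                CUDNN_CROSS_CORRELATION,
                                                CUDNN_DATA_FLOAT));

    /* Ask cuDNN for the output shape, then describe the output tensor. */
    int n, c, h, w;
    CHECK_CUDNN(cudnnGetConvolution2dForwardOutputDim(convDesc, xDesc, wDesc,
                                                      &n, &c, &h, &w));
    cudnnTensorDescriptor_t yDesc;
    CHECK_CUDNN(cudnnCreateTensorDescriptor(&yDesc));
    CHECK_CUDNN(cudnnSetTensor4dDescriptor(yDesc, CUDNN_TENSOR_NCHW,
                                           CUDNN_DATA_FLOAT, n, c, h, w));

    /* Device buffers, zero-initialized just to keep the sketch short. */
    float *x, *wts, *y;
    cudaMalloc((void **)&x, 1 * 3 * 32 * 32 * sizeof(float));
    cudaMalloc((void **)&wts, 8 * 3 * 3 * 3 * sizeof(float));
    cudaMalloc((void **)&y, (size_t)n * c * h * w * sizeof(float));
    cudaMemset(x, 0, 1 * 3 * 32 * 32 * sizeof(float));
    cudaMemset(wts, 0, 8 * 3 * 3 * 3 * sizeof(float));

    /* Use a simple algorithm that needs no extra workspace memory. */
    const float alpha = 1.0f, beta = 0.0f;
    CHECK_CUDNN(cudnnConvolutionForward(handle, &alpha, xDesc, x, wDesc, wts,
                                        convDesc,
                                        CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM,
                                        NULL, 0, &beta, yDesc, y));

    printf("convolution output: %dx%dx%dx%d\n", n, c, h, w);

    cudaFree(x); cudaFree(wts); cudaFree(y);
    cudnnDestroyTensorDescriptor(xDesc);
    cudnnDestroyTensorDescriptor(yDesc);
    cudnnDestroyFilterDescriptor(wDesc);
    cudnnDestroyConvolutionDescriptor(convDesc);
    cudnnDestroy(handle);
    return 0;
}

Note that nothing here is device code: the program compiles as ordinary C, links against the cuDNN and CUDA runtime libraries, and leaves the kernel-level optimization entirely to the library, which is exactly the division of labor the spokesperson describes.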

Credit: Nvidia

Nvidia is smart to embrace deep learning and machine learning generally as an avenue for future growth and a way to accomplish its longstanding goal of seeing GPUs used more widely for purposes other than rendering computer graphics. GPUs have already been widely adopted by supercomputer architects, who often load up systems with them in order to offload particular tasks that run faster on GPUs than on CPUs.

However, a few factors might dampen the excitement over GPUs in the long run. One is the emergence of alternative architectures, such as those from IBM and a startup called Nervana Systems, built specifically to handle neural networks and deep learning workloads. Another is the possibility that existing processor architectures, including CPUs and FPGAs, will prove perfectly fine — if not better, in some cases — for running deep learning models.

And, finally, there’s the possibility that deep learning, at least as something developers build and train themselves, will never reach mainstream proportions. Even if the algorithms do become ubiquitous in consumers’ lives, it’s conceivable that many developers incorporating them into apps will do so via an API or some other abstraction rather than building deep learning systems themselves. It’s called cloud computing.

Any spike in orders from cloud providers would certainly be good for Nvidia’s bottom line, but we’ve moved well past the glory days when every application under the sun demanded its own dedicated Intel processor.