Facebook open sources tools for bigger, faster deep learning models

Facebook on Friday open sourced a handful of software libraries that it claims will help users build bigger, faster deep learning models than existing tools allow.

The libraries, which Facebook is calling modules, are alternatives to the default ones in Torch, a popular machine learning development environment, and are optimized to run on Nvidia graphics processing units. Among them are modules that dramatically speed up the training of large computer vision systems (by nearly 24 times, in some cases); modules for training systems with potentially millions of output classes (e.g., predicting whether a word will appear across a large number of documents, or whether a picture was taken in any city anywhere); and an optimized method for building language models and word embeddings (e.g., learning how different words are related to one another).
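Speedups of that magnitude for convolutional nets generally come from computing convolutions with fast Fourier transforms instead of directly sliding a kernel over the image. The sketch below is a minimal NumPy illustration of that general trick, not Facebook's actual Torch/CUDA modules: it checks that the FFT route (zero-pad, transform, multiply pointwise, invert) produces the same result as the direct loop.

import numpy as np

def conv2d_direct(image, kernel):
    # Naive "valid" cross-correlation (the convolution used in deep learning):
    # on the order of N^2 * K^2 multiply-adds.
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def conv2d_fft(image, kernel):
    # Same result via the convolution theorem: zero-pad both arrays to the
    # full output size, multiply their spectra pointwise, transform back.
    ih, iw = image.shape
    kh, kw = kernel.shape
    fh, fw = ih + kh - 1, iw + kw - 1  # size of the full linear convolution
    # Flipping the kernel turns true convolution into cross-correlation.
    spec = np.fft.rfft2(image, s=(fh, fw)) * np.fft.rfft2(kernel[::-1, ::-1], s=(fh, fw))
    full = np.fft.irfft2(spec, s=(fh, fw))
    return full[kh - 1:ih, kw - 1:iw]  # crop to the "valid" region

rng = np.random.default_rng(0)
img = rng.standard_normal((128, 128))
k = rng.standard_normal((11, 11))
assert np.allclose(conv2d_direct(img, k), conv2d_fft(img, k))

The direct loop costs on the order of N²K² operations, while the FFT route is roughly N² log N regardless of kernel size, which is why the gains grow with the large kernels and batch sizes common in vision models.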

“[T]here is no way you can use anything existing” to achieve some of these results, said Soumith Chintala, an engineer with Facebook Artificial Intelligence Research.

That team was formed in December 2013 when Facebook hired prominent New York University researcher Yann LeCun to run it. Rob Fergus, one of LeCun’s NYU colleagues who also joined Facebook at the same time, will be speaking on March 19 at our Structure Data conference in New York.

A heatmap showing the performance of Facebook’s modules relative to standard ones on datasets of various sizes. The darker the green, the faster Facebook’s module was.

Despite the sometimes significant improvements in speed and scale, however, the new Facebook modules probably are “not going to be super impactful in terms of today’s use cases,” Chintala said. While they might produce noticeable improvements within most companies’ or research teams’ deep learning environments, he explained, they’ll really make a difference (and justify making the switch) when more folks are working on stuff at a scale like Facebook is now — “using models that people [previously] thought were not possible.”

Perhaps the bigger and more important picture now, then, is that Friday’s open source releases represent the start of a broader Facebook effort to open up its deep learning research the way it has opened up its work on webscale software and data centers. “We are actually going to start building things in the open,” Chintala said, releasing a steady stream of code instead of just the occasional big breakthrough.

Facebook is also working fairly closely with Nvidia to rework some of the chipmaker’s deep learning libraries so they hold up at web scale, he added. Although Facebook operates at a scale beyond many mainstream deep learning efforts, and its researchers change direction faster than would be feasible for a commercial vendor, its advances could find their way into future releases of Nvidia’s libraries.

Given the excitement around deep learning right now, for everything from photo albums to self-driving cars, it’s a big deal that more and better open source code is becoming available. Facebook’s releases join projects such as Torch itself (which Facebook uses), Caffe and the Deeplearning4j framework being pushed by startup Skymind. Google has also been active in releasing tooling and datasets well suited to training models.

It was open source software, projects such as Hadoop and Kafka, that helped make general-purpose big data platforms a reality outside of cutting-edge web companies. Open source could do the same for deep learning, carrying it beyond the leading labs at Facebook, Google, Baidu and Microsoft.

5 Comments

jobs.ai

It’s interesting to see how Nvidia now keeps talking about deep learning as if they planned for a long time to make deep learning their core business :)

A B

Elsewhere it has been noted that the IP terms wrt patents in Facebook’s released code are pretty onerous.

Xose

Just to point out: the tooling you are referring to (word2vec) is NOT released by Google and has nothing to do with them, as Mikolov et al. state in the Google Code repository. They only use Google Code to release the results of their research instead of the more commonly used GitHub.

John H.

Mikolov was working for Google at the time he created word2vec. He was paid by Google to create it, and it was Google’s decision to open source it. Google does love open source: it has contributed a lot to projects like Linux, OpenOffice and MySQL, and of course there are Android, Chrome, the V8 JavaScript engine and so on.
