The Gigaom guide to deep learning: Who’s doing it, and why it matters


The field of deep learning is picking up steam to the point that it’s now inspiring a growing list of startups in areas such as natural language processing and image recognition. It’s also commanding a growing percentage of research and acquisition budgets at companies such as Google, Microsoft, Facebook and Yahoo. This post highlights some of the companies involved in this space and the type of products or projects they’re working on.

What deep learning is

First, though, a little primer: Despite its cognitive moniker, deep learning isn’t really about teaching machines to mimic the human brain a la the BRAIN Initiative President Obama announced in 2013. Rather, it’s about teaching machines to think more hierarchically or more contextually: to see a picture of a mole, for example, and work down from recognizing the features that comprise an animal to recognizing the specific features that make it a mole. With text, the process might be teaching machines to recognize how words are related to one another and how they fit together to form phrases or express ideas.
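
For readers who think better in code, here is a minimal sketch of that layered idea in plain Python. It is not any company's actual model: the weights are random and untrained, the layer sizes are made up, and the function names (`dense_layer`, `random_layer`, `forward`) are hypothetical. All it shows is the structure, in which each layer works only from the output of the layer before it.

```python
import math
import random


def dense_layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per unit, squashed by a sigmoid."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def random_layer(n_inputs, n_units, rng):
    """Hypothetical helper: untrained random weights and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


def forward(pixels, layers):
    """Pass the input through each layer in turn, keeping every intermediate output."""
    activations = [pixels]
    for weights, biases in layers:
        activations.append(dense_layer(activations[-1], weights, biases))
    return activations


rng = random.Random(0)
# Made-up sizes: 16 input "pixels" feed 8 units, then 4, then 2 output scores.
layer_sizes = [16, 8, 4, 2]
layers = [
    random_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
pixels = [rng.random() for _ in range(layer_sizes[0])]

activations = forward(pixels, layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

In a trained network, the weights would be learned from data, so that early layers respond to simple patterns and later layers to more specific ones. Here they are random, so the output scores carry no meaning.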

One layer of a Google deep learning network for image recognition. Source: Google

And how a visualization of the output of the network might look, with a cat (left) or human body (right). Source: Google

And while deep learning might hold huge promise in fields such as medicine and astronomy, the best we can probably hope for in the near term are more-accurate text messages, search engines, language translation and targeted content. Why? Because that’s where the money is today and also where the data is.

In that sense, deep learning as a field is a lot like its bigger brother, big data: The best techniques might start off in (or be adopted from universities by) Silicon Valley offices solving arguably menial problems, but the big payoff will come when they make their way into new fields such as science and medicine. What’s more, deep learning models are quite computationally intensive, but the increasingly inexpensive, increasingly powerful resources available via cloud computing mean a lot more people without data centers at their disposal can get in on the action.

With that in mind, here’s a brief look at some of the companies, people and places I’ve come across in researching the field. I know this isn’t exhaustive, so please note any other notable people, companies or projects in the comments. We will try to update these lists periodically, as well.

The startups

AlchemyAPI: Denver, Colo.-based AlchemyAPI has been providing deep-learning-based text analysis via API for a while now, and recently showed off its work on image recognition, as well. A whole lot of companies you’ve heard of already use the service for tasks such as sentiment analysis, categorization and tagging. AlchemyAPI is working on new capabilities that should make it much more useful, even for folks who have no interest in consuming services via API.

Keyword extraction and sentiment analysis from a demo of AlchemyAPI analyzing a Gigaom post on the FCC.

Cortica: Cortica is an image-recognition specialist whose technology is modeled after the way the human brain processes images in the cortical neural network. In fact, Co-founder and CEO Igal Raichelgauz told me, its technology actually comes from laboratory research he did at the Israel Institute of Technology using real brain tissue. The company is already selling its technology as a product to publishers and advertisers that want to display ads or content related to on-screen images.

An example of how Cortica recognizes images and ties them to ad inventory.

Ersatz: Ersatz is a deep learning platform developed by a San Francisco-based consulting firm called Blackcloud BSG. Rather than providing pre-determined capabilities such as image recognition or sentiment analysis, Ersatz provides the tools, kind of like an Amazon Web Services for deep learning. It gives users a web interface, API, GPU-based cloud resources and neural network implementations, and lets users go to town training and running their own models as they please. Alternatively, customers can just pay parent company Blackcloud to do everything for them.

A confusion matrix displayed on the Ersatz user interface.

Semantria: Amherst, Mass.-based Semantria is a spinoff of text-analysis veteran Lexalytics, only it’s delivered via API (or Excel plugin) rather than installed software. According to Founder and CEO Oleg Rogynskyy, Semantria is also improving its accuracy by incorporating more deep learning methods and expanding its data sources beyond Wikipedia (which is where the Lexalytics engine derives its semantic knowledge).

A sample of Semantria’s entity extraction and sentiment analysis capabilities.

I suspect these companies are just a small percentage of what we’ll see pop up in the next couple of years, though. Some well-known deep learning scholars in fields such as natural language processing and image recognition already have stealth-mode startups in the works, and it wouldn’t be surprising to see other vision-based startups such as Dropcam and Sight Machine adopt these techniques as they mature their products. There are other startups, too, that could push the field into new areas, such as Grok, the new startup from Palm creator Jeff Hawkins, which straddles the line between deep learning concepts and actual brain-like computing architectures.

The big companies

Facebook: Facebook is a relative newcomer to the field, but certainly has plenty of text and images to analyze. The company is reportedly hoping deep learning will help it better optimize its NewsFeed application and allow for a more compelling photo-sharing experience. However, Facebook’s biggest contribution to the deep learning field might be its penchant for building and open sourcing specialized hardware — something that could come in handy for others trying to build infrastructure that can handle these workloads.

Google: Google is probably the best-known company in the deep learning field, thanks in part to its highly publicized image-recognition research (its models learned to identify cats and human faces from unlabeled images, without being told what to look for) and its recent decision to open source some new tools for text analysis. It’s able to do so much in part because it has lots of computing resources, lots of smart people and lots of data. Deep learning is already powering speech recognition on Android phones and you can search your Google+ photos by content without ever having tagged them.

IBM: You’ve heard of Watson, right? The IBM system that defeated Jeopardy! champions and is now being applied to fields such as health care is an amalgamation of numerous data-analysis techniques, including deep learning. Beyond that, IBM is also pushing a number of efforts around what it calls “cognitive computing,” including a partnership with four leading universities that includes deep learning as a core research area.

Microsoft: Microsoft Research has been at the big data game for a long time, and last November then-director Rick Rashid made headlines when he showed off its deep learning prowess by demonstrating a live English-to-Mandarin translation tool during an event in China. Like Google, Microsoft is collecting all sorts of data from its various web and mobile applications, and hopes deep learning can help it provide more compelling experiences on its web, mobile, gaming and even business-software platforms, although its ace in the hole might be Kinect.

Yahoo: Yahoo doesn’t get the attention that Google and Microsoft do, but it has snapped up two deep-learning-based image-recognition startups — IQ Engines and LookFlow — in the past few months. Yahoo has all sorts of ways it could utilize these sorts of technologies and people, but the obvious tactic is making Flickr more appealing by making it smarter and perhaps setting it loose on smartphone picture galleries.

The researchers

The Toronto crew: In many ways, the University of Toronto — specifically a research group led by Geoffrey Hinton — is the reason there’s so much talk about deep learning today. The team made some big breakthroughs several years ago and ultimately launched a startup that Google acquired in 2013. Hinton now splits his time between the University of Toronto and Google. Marc’Aurelio Ranzato was a member of Hinton’s lab (and earned his Ph.D. from NYU) and recently left Google to be part of Facebook’s deep learning team.

The Stanford crew: Stanford is one of the hotbeds of deep learning research, with professor and Coursera co-founder Andrew Ng among the biggest names in the space and Christopher Manning heading up the school’s NLP program. We recently covered research by a member of his team, Ph.D. candidate Richard Socher, that focuses on understanding whole sentences rather than single words and is 85 percent accurate in analyzing the sentiment of movie reviews from a popular dataset.

A visual representation of how Socher’s model breaks down sentences.

The NYU crew: Yann LeCun and Rob Fergus of NYU are both respected deep learning researchers and experts in image recognition, with LeCun having done some seminal work on handwriting analysis in the ’90s and Fergus currently doing work on making neural networks less like black boxes. NYU is also the university heading up deep learning research as part of the aforementioned IBM cognitive computing partnership.

I tested a demo of Fergus’s work. Not bad considering it’s based on a predefined set of categories that likely doesn’t include Larry Bird.

The Montreal crew: The LISA Lab at the University of Montreal, led by Yoshua Bengio, is another major center for deep learning research, and created a popular open source library called Theano (you can download it here) that makes it easier to write complex mathematical algorithms in Python and run them on GPUs (which are ideal for neural networks because of their parallelism). And, bonus, one of Theano’s creators, Joseph Turian, is also a Gigaom Research analyst.

The Switzerland crew: Juergen Schmidhuber and his team at the Dalle Molle Institute for Artificial Intelligence have been doing deep learning since 1991, and now that processing power is catching up with the models they’ve developed, their work has been taken to the next level. The team has won numerous competitions, including one on recognizing features of human brains.

Feature image courtesy of Shutterstock user Willy Deganello.


Derrick Harris

Companies I intend to add to this list soon include Vicarious, DeepMind and perhaps Dropbox (which I understand is doing some work in computer vision). Institutions include Harvard and MIT. Any more suggestions welcome.


Good that parts of my previous comment became obsolete.
The article also claims that Hinton’s team “made some big breakthroughs several years ago.”
Which breakthroughs exactly?
Their Science paper (2006) reports only mediocre results on MNIST, the standard benchmark.
Nobody is using their method in today’s applications and competitions.
It did not even influence the current state-of-the-art.
Hinton’s team is not using it either.
Instead they are now using networks (very similar to those) of Schmidhuber’s team.
That’s how they got best results on TIMIT speech data with recurrent networks (ICASSP 2013) and on ImageNet with feedforward networks (NIPS 2012).
In the feedforward case, Schmidhuber is also giving credit to Fukushima and LeCun’s group and others including the people who invented backpropagation around 1970.

Bobby Briggs

Are most of these companies actually working on deep learning? Or are we just using that as a buzzword for the latest AI/ML/etc.?


It must be a joke that the team of Juergen Schmidhuber is not mentioned.
They basically started Deep Learning research in 1991.
They were the first to win international pattern recognition competitions.
They have won more competitions than anybody else (nine as of 2013).
And the others mentioned in the article are actually using their Deep Learning methods!
But Schmidhuber is also giving credit to many other people since 1962:


Great post — but I’m starting to think “deep learning” is going the way of “big data”, with companies using it for marketing while having no idea what it actually means.

Hadn’t heard of Cortica, but Semantria? seriously? They’re a *reseller* of 1990s-era NLP tech (maxent, CRF, and regex). Maybe companies should be required to show proof (for big data, too! I saw a freaking MS Access toolkit using “big data” marketing speak the other day)

BTW there are other companies not mentioned here using deep learning. But we’ll release ours when it’s ready and not make things up to get press :)

Dave Sullivan

Wow Derrick, you’ve definitely done your homework–it’s crazy to see all the people/companies on that list, the industry is growing fast… And thanks for the plug!


Interesting list and great article. Spot on in saying that it’s in big textual datasets that the action is today.

Maybe would be relevant? Ontology free auto-categorisation engine. API indeed.

Ian Goodfellow

Why not mention Yoshua Bengio, the leader of the Montreal lab, by name? You mention the professors at the other universities by name and give the name of one of Yoshua’s former postdocs, why not Yoshua himself?


One additional “crew”: Schmidhuber’s group at the Swiss AI Lab.


There are other groups of researchers here and there working on deep learning as well.

Here is a nice-sounding piece of deep learning – related research by some people at MIT:

And, although you mentioned Microsoft, I thought I would mention this nice bit of research:

They say in the paper that the model IMPLICITLY learns continuous vector space embeddings for the meanings of words, and conjecture that that accounts for the high F1 score. The method does tagging and vector-embeddings all in one unified system.
