Why we’re all so obsessed with deep learning

You might have noticed the flurry of activity lately around deep learning. It’s an approach to data analysis centered around stacks of artificial neural networks that, for lack of a more succinct definition, can teach themselves to understand complex patterns and the many little features that make up the data on which they’re trained. It’s the talk of the town among media types, entrepreneurs and computer scientists not just because it sounds so cool, but mostly because it works.

We’ve covered many of its early applications already — recognizing what’s in pictures, who’s in pictures, how words are related in text and what people are saying. A lot of the research being done in universities, which then gets trained on massive amounts of web data inside places like Google, Microsoft and Facebook, is already making its way into consumer services and even commercial software near you. Google is using neural networks to understand and improve data center efficiency. Some believe deep learning could also be used to analyze time-series data for algorithmic trading models or better understand medical records.

Microsoft’s Skype Translate, a demo of which is below, is an example of applied deep learning.

[Embedded video: Skype Translate demo]

Perhaps deep learning methods could even help answer the U.S. Secret Service’s request for a software package that can recognize sarcasm in social media posts. That capability, which was part of a broader request for social-media analysis software, was the subject of a fair amount of skepticism and downright derision this week for a variety of reasons. Some questioned the agency’s motives, others its sense of humor and others yet the feasibility of automatically detecting sarcasm.

However, Richard Socher, a Stanford Ph.D. candidate who specializes in applying deep learning models to sentiment analysis (he was lead author of this paper and helped launch a web service called etcML), thinks that given the right model and the right training data, sarcasm detection might be more possible than some think. He explained how via email:

“For example: If the algorithm knew that certain things are negative (from a training set) it could find sarcasm in “I love getting up early” or “Sure, I enjoy spending all day on my homework” because they have a stark contrast and a similar pattern (saying something positive about something negative). Other indicators and structures could be picked up by an algorithm as potential indicators like “sure, yea, totally” or patterns like “something positive followed by FML”. Models like recursive neural networks that have an understanding of word order could learn such patterns if they are trained on some such examples.”
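To make the first pattern Socher describes a little more concrete, here is a minimal Python sketch of the "something positive about something negative" check. It is not the recursive neural network he mentions: a hand-written word list stands in for what a model would learn from a training set, and every name in it (POSITIVE_WORDS, NEGATIVE_PHRASES, SARCASM_MARKERS, sarcasm_signals) is hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicons. A real system would learn these from labeled data.
POSITIVE_WORDS = {"love", "enjoy", "great", "awesome"}
NEGATIVE_PHRASES = {"getting up early", "homework", "traffic", "fml"}
SARCASM_MARKERS = {"sure", "yea", "totally"}


def sarcasm_signals(text):
    """Return the set of simple sarcasm cues found in the text."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    signals = set()

    # Cue 1: a positive word used alongside a phrase known to be negative.
    has_positive = any(word in POSITIVE_WORDS for word in words)
    has_negative = any(phrase in lowered for phrase in NEGATIVE_PHRASES)
    if has_positive and has_negative:
        signals.add("contrast")

    # Cue 2: the statement opens with a marker such as "sure" or "totally".
    if words and words[0] in SARCASM_MARKERS:
        signals.add("marker")

    return signals


print(sarcasm_signals("I love getting up early"))
print(sarcasm_signals("Sure, I enjoy spending all day on my homework"))
```

Run on Socher's two examples, the function flags the contrast in both and the leading "sure" in the second. A sentence that contains no phrase from the negative list produces no signals at all, which is exactly where a fixed lexicon like this runs out of road.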

The tweet below is an example of a stark contrast between positive and negative phrases.

However, he added, other types of sarcasm are nearly impossible for computers, or even many humans, to detect absent deeper knowledge about the speaker. A statement such as “I love coding on the weekends” doesn’t include any inherently negative language and might well be true for many people.

When I spoke recently with Australian researcher David Milne about an effort, called We Feel, to track the use of emotional words on Twitter, he noted the same concern with a separate project he’s working on to determine whether tweets about depression or suicide are legitimate or sarcastic. Because standard natural-language processing techniques won’t always pick up on sarcasm in language, Milne explained how his team tries to add context to questionable tweets by analyzing those users’ previous and subsequent tweets, as well as any replies and the users’ connections.

Socher suggested that another problem — especially for the Secret Service — might be in detecting outliers, because so few people actually follow through on dumb statements or even threats. It’s the same problem the FBI had when trying to discern patterns that signal someone might be an insider threat. When the majority of people don’t do something, it can be difficult even for machine learning algorithms to detect meaningful patterns among the few who do. So the FBI focuses on the behaviors of individual employees and looks for deviations from their individual baselines.
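As a rough illustration of what "deviation from an individual baseline" can mean in practice, the Python sketch below scores a new observation against one person's own history using a plain z-score. This is an assumed stand-in, not the FBI's actual method, and the function name and the made-up daily counts are hypothetical.

```python
from statistics import mean, stdev


def deviation_from_baseline(history, new_value):
    """Return how many standard deviations new_value sits from this person's own mean."""
    if len(history) < 2:
        raise ValueError("need at least two past observations")
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is maximally unusual.
        return 0.0 if new_value == history[0] else float("inf")
    return (new_value - mean(history)) / spread


# Hypothetical counts of some activity per day for a single employee.
past_days = [10, 12, 11, 9, 13]
print(deviation_from_baseline(past_days, 11))
print(deviation_from_baseline(past_days, 30))
```

With that made-up history, a new value of 11 scores 0.0 and a value of 30 lands about 12 standard deviations out. The point of comparing each person only with themselves is that it sidesteps the lack of examples across the whole population.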

With more research and some more advanced models, though, Socher thinks it might also be possible to automate this type of assessment:

“I think eventually algorithms that incorporate a user “vector” or other kinds of user models may be able to distinguish sarcastic statements for very prolific users. One such indicator that could help would be how “out of the ordinary” a certain statement would be for a given user. But for this to work well, we would need a LOT of training data.”
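Socher doesn't spell out what a user "vector" would look like. One simple stand-in, assumed here purely for illustration, is a bag-of-words count over a user's past posts, with "out of the ordinary" measured as one minus the cosine similarity between that vector and a new statement. The learned models he has in mind would be far richer, and the function names below are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word occurrences in a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def user_vector(past_posts):
    """Sum the word counts of every past post into one vector for the user."""
    total = Counter()
    for post in past_posts:
        total.update(word_counts(post))
    return total


def unusualness(user_vec, statement):
    """Return 1 minus cosine similarity: near 0 is typical, 1 means no shared words."""
    stmt = word_counts(statement)
    dot = sum(user_vec[word] * count for word, count in stmt.items())
    norm_user = math.sqrt(sum(c * c for c in user_vec.values()))
    norm_stmt = math.sqrt(sum(c * c for c in stmt.values()))
    if norm_user == 0 or norm_stmt == 0:
        return 1.0
    return 1.0 - dot / (norm_user * norm_stmt)


# Hypothetical posting history for one user.
history = ["I love coding on the weekends", "Coding all weekend again"]
vec = user_vector(history)
print(unusualness(vec, "I love coding"))
print(unusualness(vec, "I hate coding"))
```

For this toy history, "I hate coding" scores as more unusual than "I love coding" because one of its words never appears in the user's past posts. Two posts is nowhere near enough to trust the number, which is Socher's point about needing a lot of training data.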

But that’s also the beauty of deep learning at this moment in time, when we have access to exponentially growing amounts of digital data and cheap, powerful processors. It’s not perfect, it’s not always easy and it’s certainly not the right tool for every job. Where it works, though, deep learning has proven to work remarkably well compared with previous approaches at solving some very challenging problems. And it’s pointing researchers in the right direction to solve others.

We could call it anything; it could be modeled after the interlocking joints in my laminate flooring rather than neurons in the brain. Yet as long as it keeps producing results, we’re going to see a lot more deep learning research, a lot more startups trying to capitalize on it, and a lot more press writing about the field that’s taking us beyond nebulous discussions about “big data” and “uncovering insights” and into discussions about actually putting intelligent systems to work for us.

Feature image courtesy of Shutterstock user Sebastian Kau.

3 Responses to “Why we’re all so obsessed with deep learning”

  1. totencough

    While the technology under the hood of deep learning machines is way over my head, I can say that what fascinates me is the ability to make a machine do more than what we generally think is possible.

    Our human minds can always be blown by something new that we’ve never known before or by something that’s in stark contrast to what we’ve always thought.

    But then, there’s the implications of what this might mean. Imagine Skype using deep learning for this translation in order for customer service departments to communicate with customers and users worldwide. This could negate the need for 3rd party translators or poor communication with customer service reps from other countries than your own.

    The opportunities in CRM and customer support alone are huge, without even acknowledging the rest of the tech world.

    Brad Hodson

    • Derrick Harris

      He makes some good points, such as:

      “In 2012, when deep neural network algorithms proved to be the best algorithm for understanding complex sensory data [...] 10-20%, instead of the 1-2% typical year by year improvement.
      This is a big difference in algorithm performance in complex data. I have not seen such large improvement in all my career.
      Imagine a 100-meter dash athlete run in 7.5 seconds, beating everyone else by 2 full seconds when a typical record was usually just 0.1 seconds better before. Wouldn’t you be surprised? ‘Almost unreal!’”

      As the algorithms get commercialized, I think the accuracy improvements achieved by researchers — which are what got a lot of companies excited — tend to get glossed over. In fact, I should have included a stat or two to prove my point.