Summary:

If you just pay attention to the largest Hadoop users, you might think the platform is just a way of powering search engines or analyzing customer behavior for ad-serving. Of course that’s not the case, but finding those broader use cases can still be kind of difficult.

If you just pay attention to the world’s largest Hadoop users, you might think the platform is just a better technology for powering search engines or analyzing customer behavior for ad-serving. Of course that’s not the case, but finding those broader use cases can still be kind of difficult. That’s too bad, because the more we highlight what’s possible, the easier it will be to discover entirely new uses.

Today, at a launch event for the latest version of its namesake Hadoop distribution, Cloudera COO Kirk Dunn as well as a few panelists noted some of those use cases. I’ve uncovered a few throughout my years of covering the Hadoop space, too. With that in mind, here are 10 use cases, but I know there are a lot more floating about, so feel free to share them in the comments.

  1. Online travel. Dunn noted that Cloudera’s Hadoop distribution currently powers about 80 percent of all online travel booked worldwide. He didn’t mention users by name, but last year I covered how one of those customers, Orbitz Worldwide, uses Hadoop.
  2. Mobile data. This is another of Dunn’s anonymous statistics: that Cloudera powers “70 percent of all smartphones in the U.S.” I assume he’s talking about the storage and processing of mobile data by wireless providers, and a little market-share math probably could help one pinpoint the customers.
  3. E-commerce. More anonymity, but Dunn says Cloudera powers more than 10 million online merchants in the United States. Dunn said one large retailer (I assume eBay, which is a major Hadoop user and manages a large marketplace of individual sellers that would help account for those 10-plus million merchants) added 3 percent to its net profits after using Hadoop for just 90 days.
  4. Energy discovery. During a panel at Cloudera’s event, a Chevron representative explained just one of many ways his company uses Hadoop: to sort and process data from ships that troll the ocean collecting seismic data that might signify the presence of oil reserves.
  5. Energy savings. At the other end of the spectrum from Chevron is Opower, which uses Hadoop to power its service that suggests ways for consumers to save money on energy bills. A representative on the panel noted that certain capabilities, such as accurate and long-term bill forecasting, were hardly feasible without Hadoop.
  6. Infrastructure management. This is a rather common use case, actually, as more companies (including Etsy, which I profiled recently) are gathering and analyzing data from their servers, switches and other IT gear. At the Cloudera event, a NetApp rep noted how his company collects device logs (it has more than a petabyte’s worth at present) from its entire install base and stores them in Hadoop.
  7. Image processing. A startup called Skybox Imaging is using Hadoop to store and process the high-definition images its satellites will regularly capture as they attempt to detect patterns of geographic change. Skybox recently raised $70 million for its efforts.
  8. Fraud detection. This is another oldie but goodie, used by both financial services organizations and intelligence agencies. One of those users, Zions Bancorporation, explained to me recently how a move to Hadoop lets it store all the data it can on customer transactions and spot anomalies that might suggest fraudulent behavior.
  9. IT security. As with infrastructure management, companies also use Hadoop to process machine-generated data that can identify malware and cyber attack patterns. Last year, we told the story of ipTrust, which uses Hadoop to assign reputation scores to IP addresses, which lets other security products decide whether to accept traffic from those sources.
  10. Health care. I suspect there are many ways Hadoop can benefit health care practitioners, but one of them goes back to its search roots. Last year, I profiled Apixio, which uses Hadoop to power its service that leverages semantic analysis to provide doctors, nurses and others more-relevant answers to their questions about patients’ health.

Image courtesy of Shutterstock user Johan Swanepoel.

  1. The repeated use of the word “power” — “Cloudera’s Hadoop distribution currently powers about 80 percent of all online travel booked worldwide,” “Cloudera powers ’70 percent of all smartphones in the U.S.’,” “Cloudera powers… online merchants” — seems a distortion, a likely-false implication that a Hadoop instance is playing a front-line, operational role in running services and transactions. If so, the reliance on deceptive language to puff up Cloudera’s role is a shame. But if that’s not the case, if Cloudera really is running consumer services and executing transactions, I’d love to learn more.

    Seth, http://twitter.com/sethgrimes

    1. I suspect it’s the former, but w/o verification, I figured I’d keep Cloudera’s language. Still, impressive penetration.

    2. Having consulted for online travel agencies over the last 5 years, I really doubt the 80% figure; at least in Europe, it is far from that number.

  2. Great article Derrick! FYI – the Hadoop deployment at NetApp is all around better customer support / service and processes over 1 trillion (not a typo) event records (storage array telemetry) in production! One highlight is reducing a month-long SQL query down to 18 minutes.

    More details from the pilot here: http://www.slideshare.net/cloudera/hadoop-world-2011-architecting-a-businesscritical-application-in-hadoop-stephen-daniel-netapp

    1. Derrick Harris valb00 Sunday, June 10, 2012

      Very impressive.