Summary:

There are a few widespread misconceptions about Cloudera, the promising, well-funded Burlingame, Calif.-based startup that offers services, training and support for the open-source software framework Hadoop. At least that’s what I found out during a talk earlier today with the company’s CEO, Mike Olson.

New Horizons for Hadoop

Cloudera, now just over a year old after launching in October of 2008, has remained a buzz-worthy startup for a number of reasons. One is that the company employs heavy-hitting folks who helped build Hadoop, such as Doug Cutting. Another is that it offers much-in-demand support for Hadoop. The framework has moved well beyond its roots as an Apache-driven open-source platform powering hugely scalable search technology at companies such as Yahoo, and now handles many new kinds of complex data query tasks of interest to businesses and organizations of all stripes. Hadoop is emerging as a powerful tool for sifting extensive data sets in newly useful ways. In my talk with Olson, he confirmed that Cloudera still sees a lot of “pent-up demand” from companies that want to leverage Hadoop’s power but need help understanding and using it.

In addition, Cloudera is notable because it’s leveraging the proven business model that Red Hat has deployed around Linux, building a fee-based support and services infrastructure around free, open-source software. Red Hat emerged as one of the big software winners during the recession with such an approach.

But Olson delivered a surprise when he said that it’s wrong to assume that his company is solely focused on open-source software. On the contrary, Cloudera plans to diversify beyond a purely open-source strategy. “Either this quarter or next we will offer an enterprise software bundle consisting of proprietary enhancements for Hadoop users,” Olson said. “Our proprietary apps will complement the open source core, and, like Facebook and Yahoo, we continue to have core committers to Hadoop.”

Cloudera already offers its own distribution of Hadoop, which is downloadable for free, as well as its own proprietary Cloudera Desktop software consisting of dashboard and management tools for Hadoop users. Cloudera Desktop is also currently free, but Olson made clear that, going forward, his company will focus on both free, open-source software and fee-based proprietary software. The enterprise bundle will be the company’s first foray into fee-based software.

Big Data? Try Medium

Also on the surprise front, Olson doesn’t entirely embrace the idea of “Big Data,” which I suggested is currently the driver of Hadoop’s success. “When I hear that term I think that must be a Google thing,” he said. “What about Medium Data? We like to say that Facebook doesn’t run Hadoop because it has a lot of data, but that Facebook has a lot of data because it runs Hadoop. Businesses that use Hadoop find that keeping data is worthwhile because Hadoop helps them process it in new ways.” Olson confirmed that Cloudera is working with plenty of large firms in possession of huge data sets, but is also working with smaller ones.

So who is doing what with Cloudera’s Hadoop distribution? According to Olson, Hadoop usage is extending way beyond just searching data. “We see people interested in it for crunching genomics data, retailers and financial institutions interested in it for processing large sets of transactions, and interest from the health-care and energy industries,” he said. You can find discussion of many use cases for Hadoop in these videos.

Cloudera, like many small companies built around open-source strategies or the cloud, is often cited as an acquisition target. Olson told me, however, that his company has a shot at remaining a long-term standalone outfit. “We have a reasonable chance of doing it,” he said, while confirming that the company may eventually pursue an IPO. “We aren’t actively talking to anyone about any type of merger.”

Patent Shmatent

I also asked Olson about Google’s recent move to patent MapReduce, the algorithm for working with large data sets that underlies Google searches. Hadoop is based on a variant of MapReduce, and some have suggested that everyone using Hadoop or MapReduce is in danger as a result of Google’s patent. As we noted here, Hadoop really isn’t threatened, though. “Google has no track record of using patents offensively,” Olson noted.

It will be interesting to see what happens to Cloudera as cloud computing and Hadoop-driven data crunching march forward. Despite the focus on staying independent that the company’s founders cite (and I’m convinced they are focused on that), I wouldn’t be surprised to see it get picked up by a larger company.

  1. [...] GigaOM What You Didn’t Know About Cloudera [...]

  2. Sounds like a great company. Do they have any customers?

  3. Very interesting article, Sebastian.

    Anyone interested in MapReduce, big data management and big data analytics (or medium data :) you should check out the Big Data Summit next week in Burlingame, CA. You can register at http://bit.ly/5KUX01. This is the premier conference on data warehousing and big data analytics. Learn how leading companies are leveraging technologies like Hadoop and MapReduce to turn data into dollars. Hear from Aster Data customers like Intuit and Mobclix and leading analysts on new technologies and trends in big data management and advanced analytics.

  4. @bob — Cloudera has lots of customers, ranging from big, recognizable companies like Netflix to smaller organizations.

    Sebastian

  5. [...] powerful functionality based on Free software (Hadoop). Based on this new interview, Cloudera is aware of the supposed violation in MapReduce, but its response is that “Google has no track record of using patents offensively.” [...]

  6. [...] about the concept of big data is kind of like trying to write about water. Water is essential, touches so many aspects of life [...]

  7. [...] about the concept of big data is kind of like trying to write about water. Water is essential, touches so many aspects of life [...]

  8. Hi,
    Google, with the sophistication of its engine, has drawn a lot of protests. Many companies have felt their privacy was violated. But Google has also contributed significantly to the development of the internet, especially the development of open source.
    Thanks, see ya tomorrow!

  9. [...] will play out. Cloudera offers its own commercial Hadoop distribution and support services, and plans to release proprietary products in the near future. Karmasphere offers a desktop-based product for building, deploying and managing [...]
