

Cloudera, a startup based in Burlingame, Calif., today announced the release of its first commercial product, Cloudera Desktop. It’s a graphical interface for managing Hadoop, the open-source framework that is catalyzing the data mining renaissance. Cloudera’s Hadoop now works on almost all major cloud platforms: Amazon Web Services, Rackspace and soon, VMware’s vCloud.

In light of today’s news, as well as my recent conversations with industry insiders such as Netezza CEO Jim Baum, I realized how synonymous Cloudera has become with Hadoop, even though its three co-founders didn’t really have anything to do with its early development. I wonder if you guys see that as well?

One of the upsides of getting old is that you see history repeat itself. Or as Yogi Berra would say, it’s like déjà vu all over again. I see some interesting parallels between Cloudera and Red Hat, which rose to prominence on the back of Red Hat Linux, a version of Linux optimized for corporate users. (Related Research from GigaOM Pro, sub required: Open-Source Startups Follow Red Hat’s Path to Profit.)

| | Red Hat | Cloudera |
|---|---|---|
| Offerings | Red Hat Linux OS, Services | Cloudera Hadoop, Data Warehouse software, Services |
| Open Source Linkage | Linux | Hadoop |
| Key Open Source Champion | Marc Ewing | Doug Cutting |
| Venture Investors | August Capital, Greylock, Benchmark Capital | Greylock, Accel Partners |
| Key Executive | Bob Young, CEO | Mike Olson, CEO |

For example, Red Hat CEO Bob Young used to run a catalog business that sold Linux and Unix software accessories and books before he bought Marc Ewing’s Red Hat Linux in 1995, merged it with his ACC Corp. and named the new company Red Hat Software.

Cloudera, by comparison, was started in 2008 by Christophe Bisciglia, who created and led Google’s academic cloud computing initiative; Dr. Amr Awadallah, Yahoo’s former VP of engineering; and Jeff Hammerbacher, formerly of Facebook. The founders’ pedigree gave the startup instant credibility, which in turn allowed them to snag Mike Olson, a well-respected open source executive, as their CEO and the fourth co-founder.

In the case of Red Hat, the Young-Ewing combo enabled the company to raise $6.25 million in funding from the likes of August Capital, which allowed it to quickly scale. I see the same happening at Cloudera.

Cloudera’s talent pool has paved the way for it to raise $11 million in two rounds of funding from Accel Partners and Greylock. Cloudera’s other backers include Diane Greene (former CEO of VMware), Marten Mickos (former CEO of MySQL) and Jeff Weiner (president of LinkedIn).

Back in the late 1990s, the focus was on lowering the cost of infrastructure by moving away from proprietary software platforms to open-source operating systems. The focus today is on data and using it smartly. Unlocking the data, mining it for intelligence and analyzing it is the next big opportunity. “The web changed the way we radiate and consume information and in doing so, created a new opportunity to measure and monetize it,” writes Gary Orenstein. “The preferred architectural model for this web-derived data warehouse, a combination of low-cost server hardware, distributed systems and open-source software, set off an innovation path that outpaced the commercial market.” Hadoop is well on its way down that path.

Hadoop was developed to support the distribution of the open-source search engine project known as Nutch, and was inspired by Google’s MapReduce and Google File System work. A top-level Apache project, it was created by Doug Cutting (and named after his child’s stuffed elephant) and championed by Yahoo, which quickly became the largest contributor to the project as it implemented the technology in its web and advertising businesses.

The big change came this past August, when Doug Cutting left Yahoo and joined Cloudera. Cutting’s involvement is like the icing on the cake, giving the company the ability to corner all the Hadoop talent out there. It also helps that Cloudera has started to make inroads into newer markets, including biotech and retail. “Hadoop is going to find potential markets in any industry where there are large data sets that need complex analysis,” CEO Olson told me.

I remember talking to Red Hat executives back in the day and listening to their pitch about Linux everywhere, how they were going to go beyond the web community and help drive Linux into other corporate environments and eventually, build a services business around it.

Cloudera is following that same path. It’s developed its own version of Hadoop, one that’s optimized for the needs of large corporations, especially those that prefer a little hand-holding from their suppliers. By giving them this version of Hadoop, Cloudera hopes to make revenue from services. And the timing — the company unveiled Cloudera Desktop at Hadoop World (we are media partners) in New York, an event it organized — is perfect.

Game, set, match for Cloudera.


  1. Om, I think their first commercial offering was the enterprise version they released a few weeks ago.

    1. Krish

      I am going to counter check that. Thanks for letting me know.

      1. I apologize and withdraw the statement. You are right.

      2. I confused their announcement about vCloud API integration with an enterprise product itself. I cross checked and this is their first commercial product.

  2. These guys will be filthy rich soon enough. I think we will see a lot of clones in the near future.

  3. Interesting, more action in data mining. However, this round could see more traction for the
    new players rather than the traditional data mining biggies. Could be interesting times to come.
    Reminds me of how the internet changed the dynamics of computing and left many client-server
    companies confused.

  4. Mohamad Afshar Sunday, October 4, 2009

    Oh for G*ds sake!

    MapReduce or Hadoop or whatever is only applicable to very large datasets where the shared-nothing approach tends to do well [I know, I did a PhD on this topic]. Hence it’s applicable to 1% of the data management problems out there, if we are very generous. The usual suspects would probably be social networks, defense, etc., to name but a few.

    To compare Cloudera to Red Hat – an operating system vendor which has wide applicability to the data center – for all – not just for those with massive data volumes, is preposterous!

  5. [...] this month, much to the chagrin of some of our readers I equated Hadoop-focused start-up, Cloudera to Red Hat. My argument was that in the late 1990s, the open source operating systems and web software proved [...]

  6. [...] As the Open Source software movement continues to strengthen, questions abound about where the opportunities to create commercially viable solutions. Red Hat did it with Linux. Can Cloudera do it with Hadoop? Read this GigaOm article. [...]

  7. [...] Hadoop project. Hadoop is a critical piece of Yahoo’s web infrastructure, is the basis of Cloudera’s business model, and is the foundation of products like Amazon’s Elastic MapReduce and IBM’s M2 data-processing [...]

  8. [...] addition, Cloudera is notable because it’s leveraging the proven business model that Red Hat has deployed around Linux, building a fee-based support and services infrastructure around free, open-source software. Red [...]

  9. [...] which is behind the Puppet configuration management software, announced funding, and startups like Cloudera and Opscode are also scoring venture [...]

