Cloudera CEO declares victory over big data competition


Credit: Jakub Mosur

Cloudera CEO Tom Reilly doesn’t often mince words when it comes to describing his competition in the Hadoop space, or Cloudera’s position among those other companies. In October 2013, Reilly told me he didn’t consider Hortonworks or MapR to be Cloudera’s real competition, but rather larger data-management companies such as IBM and EMC-VMware spinoff Pivotal. And now, Reilly says, “We declare victory over at least one of our competitors.”

He was referring to Pivotal, and the Open Data Platform, or ODP, alliance it helped launch a couple of weeks ago along with Hortonworks, IBM, Teradata and several other big data vendors. In an interview last week, Reilly called that alliance “a ruse and, frankly, a graceful exit for Pivotal,” which laid off a number of employees working on its Hadoop distribution and is now outsourcing most of its core Hadoop development and support to Hortonworks.

You can read more from Reilly below, including his takes on Hortonworks, Hadoop revenues and Spark, as well as some expanded thoughts on the ODP. For more information about the Open Data Platform from the perspectives of the members, you can read our coverage of its launch in mid-February as well as my subsequent interview with Hortonworks CEO Rob Bearden, who explains in some detail how that alliance will work.

If you want to hear about the fast-changing, highly competitive and multi-billion-dollar business of big data straight from horses’ mouths, make sure to attend our Structure Data conference March 18 and 19 in New York. Speakers include Cloudera’s Reilly and Hortonworks’ Bearden, as well as MapR CEO John Schroeder, Databricks CEO (and Spark co-creator) Ion Stoica, and other big data executives and users, including those from large firms such as Lockheed Martin and Goldman Sachs.

GIGAOM STRUCTURE DATA 2014

You down with ODP? No, not me

While Hortonworks explains the Open Data Platform essentially as a way for member companies to build on top of Hadoop without, I guess, formally paying Hortonworks for support or embracing its entire Hadoop distribution, Reilly describes it as little more than a marketing ploy. Aside from calling it a graceful exit for Pivotal (and, arguably, IBM), he takes issue with even calling it “open.” If the ODP were truly open, he said, companies wouldn’t have to pay for membership, Cloudera would have been invited and, when it asked about the alliance, it wouldn’t have been required to sign a non-disclosure agreement.

What’s more, Reilly isn’t certain why the ODP is really necessary technologically. It’s presently composed of four of the most mature Hadoop components, he explained, and a lot of companies are actually trying to move off of MapReduce (to Spark or other processing engines) and, in some cases, even the Hadoop Distributed File System. Hortonworks, which supplied the ODP core and presumably will handle much of the future engineering work, will be stuck doing the other members’ bidding as they decide which of several viable SQL engines and other components to include, he added.

“I don’t think we could have scripted [the Open Data Platform news] any better,” Reilly said. He added, “[T]he formation of the ODP … is a big shift in the landscape. We think it’s a shift to our advantage.”

(If you want a possibly more nuanced take on the ODP, check out this blog post by Altiscale CEO Raymie Stata. Altiscale is an ODP member, but Stata has been involved with the Apache Software Foundation and Hadoop since his days as Yahoo CTO and is a generally trustworthy source on the space.)

Hortonworks CEO Rob Bearden at Structure Data 2014.


Really, Hortonworks isn’t a competitor?

Asked about the competitive landscape among Hadoop vendors, Reilly doubled down on his assessment from last October, calling Cloudera’s business model “a much more aggressive play [and] a much bolder vision” than what Hortonworks and MapR are doing. Those companies, he said, are often “submissive” to partners and treat Hadoop like an “add-on” rather than a focal point. If anything, Hortonworks has burdened itself by going public and by signing on to help prop up the legacy technologies that IBM and Pivotal are trying to sell, Reilly said.

Still, he added, Cloudera’s “enterprise data hub” strategy is more akin to the IBM and Pivotal business models of trying to become the centerpiece of customers’ data architectures by selling databases, analytics software and other components besides just Hadoop.

If you don’t buy that logic, Reilly has another argument that boils down to money. Cloudera earned more than $100 million last year (that’s GAAP revenue, he confirmed), while Hortonworks earned $46 million and, he suggested, MapR likely earned a similar number. Combine that with Cloudera’s huge investment from Intel in 2014 — it’s now “the largest privately funded enterprise software company in history,” Reilly said — and Cloudera owns the Hadoop space.

“We intend to take advantage” of this war chest to acquire companies and invest in new products, Reilly said. And although he wouldn’t get into specifics, he noted, “There’s no shortage of areas to look in.”

Diane Bryant, senior vice president and general manager of Intel's Data Center Group, at Structure 2014.


The future is in applications

Reilly said that more than 60 percent of Cloudera sales are now “enterprise data hub” deployments, which is his way of saying its customers are becoming more cognizant of Hadoop as an application platform rather than just a tool. Yes, it can still store lots of data and transform it into something SQL databases can read, but customers are now building new applications for things like customer churn and network optimization with Hadoop as the core. Between 15 and 20 financial services companies are using Cloudera to detect money laundering, he said, and Cloudera has trained its salesforce on a handful of the most popular use cases.

One of the technologies helping make Hadoop look a lot better for new application types is Spark, which simplifies the programming of data-processing jobs and runs them a lot faster than MapReduce does. Thanks to the YARN cluster-management framework, users can store data in Hadoop and process it using Spark, MapReduce and other processing engines. Reilly reiterated Cloudera’s big investment and big bet on Spark, saying that he expects a lot of workloads will eventually run on it.

Databricks CEO (and AMPLab co-director) Ion Stoica.


A year into the Intel deal and …

“It is a tremendous partnership,” Reilly said.

Cloudera and Intel’s joint engineering

Reilly added that Cloudera and Intel are also working together on new chips designed specifically for analytic workloads, which will take advantage of non-RAM memory types.

Asked whether Cloudera’s push to deploy more workloads in cloud environments is at odds with Intel’s goal to sell more chips, Reilly pointed to Intel’s recent strategy of designing chips especially for cloud computing environments. The company is operating under the assumption that data has gravity and that certain data that originates in the cloud, such as internet-of-things or sensor data, will stay there, while large enterprises will continue to store a large portion of their data locally.

Wherever they run, Reilly said, “[Intel] just wants more workloads.”

4 Comments

Alexey Grishchenko

Cloudera is for sure the leading Hadoop distribution on the market, but:
1. “Big Data” space is not limited to Hadoop, so the topic is bold. Don’t forget MPP solutions, streaming solutions that are also “Big Data”
2. Cloudera claimed a revenue of $100M last year, and given their business size I could say that:
a. They are far from being profitable
b. Their message “The future is in applications” once again confirms that Cloudera plays as a services company, which means a big bunch of this $100M is a low-margin income
3. Open Data Platform is not only about building a reference solution, it’s about consolidation of efforts for a big amount of companies to build a reference Hadoop stack packaging, that would compete with Cloudera on the market. The total revenue in Hadoop space of all the participants of ODP is greater than Cloudera’s one
4. Spark is striking the market of “Big Data”. Cloudera packages Spark into CDH, but to do so it partners with Databricks and shares the revenue with them. What if Databricks would become a part of the Open Data Platform? And what if together with “Databricks Cloud” they would release “Databricks Enterprise Platform”?
In short, I think it’s yet too early to give this win to Cloudera, especially given the fact that big data market is still growing and most of the enterprises are just starting Hadoop adoption

Hari Sekhon

Alexey your comments on points 2a, 2b, 3 and 4 were surprisingly accurate… I see now that you work for Pivotal which explains your industry insider knowledge.

This FUD is the worst thing about the Big Data field, Cloudera haven’t beaten anybody, the time to lock up the market was in the early days when their competitors were only bootstrapping, that window has gone. Hortonworks have grown surprisingly fast beating the odds I would have given them say 2 years back, Cloudera are only slightly ahead of Hortonworks due to an extra 2 years in the field and that lead can change in a short period of time. If anything I expect Cloudera to struggle more as the Hortonworks platform continues to mature and undermine the price of their offering as well as the whole open source angle.

More importantly – I think a big point is being missed here. I told IBM and Pivotal to stop doing Hadoop and just partner and sell their unique wares on top – it’s the right thing to do – but what this ODP announcement really highlights is the gravity of the open source Hortonworks platform as becoming the standard and center of everything. It’s not because anybody cares about a particular brand or giving that brand profits – it’s because open source benefits people more – they do what’s in their own interest.

This is Hortonworks’ real advantage and why they are a big threat regardless of what anybody says… it’s open source gravity. If Hortonworks stay the course, I wouldn’t want to fight them (and thankfully I no longer have to since I don’t work for Cloudera any more).

Andrew V

As a lifelong technologist, one of the painful things I learned along the way is that marketing matters… a lot. I’d also note that this “billion dollar market” seems to be doing a few measly hundred million in sales and EVERY ONE of the companies in it is losing major money. So, while I wish Cloudera well (I have friends there), this positioning is a matter of wishful thinking.
