It’s everywhere! The day Hadoop took over the cloud

With Rackspace granting early access to its Hadoop service on Sunday night, Cloudera announcing a handful of new cloud partners (Amazon Web Services, SoftLayer and Verizon, in addition to Savvis) on Monday and Microsoft making HDInsight on Windows Azure a reality, pretty much every infrastructure-as-a-service cloud around now offers a managed Hadoop service. Here’s a quick breakdown of who’s offering what.

Cloud provider | Hadoop services/partners
Amazon Web Services | Elastic MapReduce (Cloudera (forthcoming), MapR)
GoGrid | GoGrid Big Data Solution (Cloudera)
Google Compute Engine | MapR
Joyent | Joyent Solution for Hadoop (Hortonworks)
Microsoft Windows Azure | HDInsight (Hortonworks)
Rackspace | Rackspace Cloud Big Data Platform (Hortonworks)
Savvis | Savvis Big Data Solutions (Cloudera, MapR)
SoftLayer/IBM | Cloudera
Verizon Enterprise Solutions | Cloudera
Virtustream | HANA-Hadoop Managed Service (Intel)

There are also a handful of independent Hadoop cloud services out there, either running atop AWS or hosted on their own infrastructure somewhere.

Independent Hadoop services
IBM (BigInsights)
Mortar Data

One interesting thing about all the offerings in both camps is that they’re still very command line- and MapReduce-focused, with the highest level of abstraction generally being a simple programming language like Python. I’m still waiting for the day that GUI-based Hadoop services start popping up, trying to take some of the complexity out of creating Hadoop jobs, but maybe that day will never come. Or maybe that’s already happening at an even higher level with all the data warehouse and other analytic services already out there running atop Hadoop and not included in these lists.
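For a sense of what "MapReduce-focused" means in practice, here is a minimal sketch of the classic word count written as a map step and a reduce step in Python. It is an illustration only: the function names (`mapper`, `reducer`, `word_count`) are hypothetical, it runs locally rather than on a cluster, and it is not tied to any of the services listed above.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts for each distinct word.

    Hadoop sorts mapper output by key before the reduce phase;
    sorted() stands in for that shuffle-and-sort here.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    for word, group in groupby(ordered, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run the map and reduce steps in-process and collect the result."""
    return dict(reducer(mapper(lines)))


if __name__ == "__main__":
    sample = ["Hadoop took over the cloud", "the cloud runs Hadoop"]
    print(word_count(sample))
```

Even for a job this small, the user has to think in terms of keys, values and sorting, which is the kind of complexity a GUI aimed at business users would need to hide.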

If I missed any hosted Hadoop services, please do note them in the comments.

Feature image courtesy of Shutterstock user FWStudio.

11 Responses to “It’s everywhere! The day Hadoop took over the cloud”

  1. Hi Derrick, Xplenty offers Hadoop as a Service on the cloud (supporting AWS and SoftLayer), allowing data and BI users to utilize Hadoop without the need to learn a new skillset. Xplenty lets you provision Hadoop clusters with a single click and develop data flows without writing complex MapReduce code.

    • Dave Fellows

      It’s an elegant solution and certainly helps make Hadoop more accessible to the masses. Ambari (also open source) is a similarly elegant UI for cluster management/monitoring (exposes Ganglia telemetry in pretty graphs and Nagios alerts).

  2. Dave Fellows

    Speaking of Hadoop UI, GreenButton supports Hadoop as a Service and exposes Ambari as well as Hue for creating/managing MR, Hive, Pig, Oozie etc. It also has a Job Designer for more complex workflows – it’s pretty nice! GreenButton is cloud-agnostic and can manage workloads across private and public clouds.

  3. Savanna is an OpenStack project bringing different Hadoop distributions to the OpenStack cloud. It provides a plugin for Horizon (the OpenStack dashboard) which lets you configure and launch various clusters; currently it supports vanilla Apache Hadoop and Hortonworks Data Platform.
    It also lets you launch Hadoop jobs from the UI. Currently these are basic MapReduce jobs (from jar files) and Pig and Hive scripts.


    • Derrick Harris


      Thanks for the comment. I probably should have been clearer that by GUI I really meant a whole UX designed for the vaunted business user as opposed to the already-skilled Hadoop user. Admittedly, it has been a while since I’ve seen Qubole in action, but I don’t recall that being the experience.

      Re: Treasure Data, I’ve covered it before, but it’s very much a data warehouse play if memory serves. Correct?

      • Joydeep Sen Sarma

        fair enough. i also sort of realized that you probably meant something different after putting the comment in.

        treasure data is positioned as a data warehouse service – but it is definitely a hadoop/hive based data warehouse. so while it may not be a classic ‘hadoop as a service’, practically speaking it will compete with others on this list (and us) for the same set of customers.