With Rackspace granting early access to its Hadoop service on Sunday night, Cloudera announcing a handful of new cloud partners (Amazon Web Services, SoftLayer and Verizon, in addition to Savvis) on Monday and Microsoft making HDInsight on Windows Azure a reality, pretty much every infrastructure-as-a-service cloud around now offers a managed Hadoop service. Here’s a quick breakdown of who’s offering what.
| Cloud provider | Hadoop services/partners |
|---|---|
| Amazon Web Services | Elastic MapReduce (Cloudera (forthcoming), MapR) |
| GoGrid | GoGrid Big Data Solution (Cloudera) |
| Google Compute Engine | MapR |
| Joyent | Joyent Solution for Hadoop (Hortonworks) |
| Microsoft Windows Azure | HDInsight (Hortonworks) |
| Rackspace | Rackspace Cloud Big Data Platform (Hortonworks) |
| Savvis | Savvis Big Data Solutions (Cloudera, MapR) |
| Verizon Enterprise Solutions | Cloudera |
| Virtustream | HANA-Hadoop Managed Service (Intel) |
There are also a handful of independent Hadoop cloud services out there, either running atop AWS or hosted on their own infrastructure somewhere.
|Independent Hadoop services|
One interesting thing about all the offerings in both camps is that they’re still very command line- and MapReduce-focused, with the highest level of abstraction generally being a simple programming language like Python. I’m still waiting for the day that GUI-based Hadoop services start popping up, trying to take some of the complexity out of creating Hadoop jobs, but maybe that day will never come. Or maybe that’s already happening at an even higher level, with all the data warehouse and other analytic services already out there running atop Hadoop and not included in these lists.
If I missed any hosted Hadoop services, please do note them in the comments.