


Amazon Web Services already has a winner with its Elastic MapReduce Hadoop service, and now it’s turning up the heat by adding MapR’s Hadoop distribution as an option. Just over a year after launching, MapR has made a name for itself in the Hadoop space by offering proprietary storage that it says can outperform Apache’s Hadoop Distributed File System by up to 20 times. A lot of cloud computing users running or considering running Hadoop workloads on Amazon’s platform might soon be a lot happier.

To be clear, this isn’t Amazon just supporting MapR on Elastic MapReduce, but actually offering MapR as a managed service. Instances running MapR’s M3 edition will be available at no additional cost (like the standard Amazon instance) while instances running MapR’s “enterprise-grade” M5 edition will come with what MapR VP of Marketing Jack Norris described to me as a “nominal” hourly cost.

For users, the arrangement provides the best of both worlds: they can take advantage of MapR’s performance features and the MapR Control System while also having integration with AWS’s suite of cloud services just like all other Elastic MapReduce users.

For MapR, the arrangement is a coup as it tries to establish mindshare and market share against Cloudera, Hortonworks, IBM and even OEM partner EMC Greenplum. Amazon Elastic MapReduce is popular because it’s an entirely on-demand service that doesn’t require users to own their own Hadoop clusters, and many companies already store data within Amazon’s S3 storage service. Considering MapR’s growing reputation as a premium alternative to some of its all-open-source competitors, the partnership has to result in a significant uptick in users for MapR.

Of course, AWS wins in the arrangement, too. “They reached out to us,” Norris said, “in response to customers requiring these kind of enterprise-grade features.” In its latest release, also announced on Wednesday, MapR added a slew of new features, including support for multi-tenancy that lets users partition a cluster in order to run separate jobs simultaneously.

Hadoop is no longer just an open source Apache project, but a full-fledged enterprise IT market. Everywhere you turn, the players — especially at the distribution level that underpins higher-level frameworks and applications — are forming partnerships and otherwise looking for any opportunity they can find to carve out a slight advantage. Cloudera, Hortonworks and MapR all have some impressive deals in place, but now that everyone is officially pushing products, I suspect we’ve just seen the opening salvo of a long battle.

Feature image courtesy of Shutterstock user Sashkin.

  1. 20x faster is utterly absurd. .9 – 1.1X is more like it.

    1. Jon, MapR’s no-NameNode architecture leads to dramatic gains in random I/O performance. It will be much more than 20x in that dimension with a large cluster. For real-world workloads, comScore CTO presented today at the Hadoop Summit and reported about 3x on his workloads.

