Today in Cloud

Developed largely inside Yahoo! as a MapReduce-inspired tool for churning through Big Data, Hadoop is an open source project that continues to thrive within the Apache community as a key weapon in the data scientist’s toolkit. Well-funded startup Cloudera took the open source code (and key project contributors) and is building a business around helping enterprises deploy and benefit from Big Data analysis.

Today, Cloudera announced General Availability for CDH3, a fully open source distribution that includes the Hadoop Distributed File System (HDFS), Hadoop MapReduce, and a collection of tightly coupled companion tools designed to ensure that, “right out of the box, you can get useful work done on Hadoop.”

Like other Big Data tools, Hadoop has tended to be rather rough around the edges: brilliant at churning through data in a particular way, but less polished when it comes to interfacing with other systems or extending to cover a wider set of enterprise data analysis tasks. The work that Cloudera continues to do in packaging Hadoop’s power in a more approachable form has been important in opening Big Data up to a broader audience. CDH3 takes this easy integration to a new level, whilst also raising the bar for Cloudera: the code is all open source and available to its competitors, forcing the company to continually differentiate itself on the service it offers rather than the code it controls.