A handful of new releases and new partnerships this week — as well as a big award — illustrate just how versatile the open-source data-processing tool Hadoop is and how widespread its use might become. We covered some of the news here — including Platform Computing’s plans for HPC-centric MapReduce middleware and Ravel’s goals of commercializing Apache Mahout — but that’s just the tip of the iceberg. Hadoop is becoming a more viable tool for everyone from business users to journalists.
One big advancement in the Hadoop world was the latest release of Apache Hive, the Facebook-created data warehouse and SQL interface for Hadoop MapReduce. Hive makes Hadoop usable for more-traditional database tasks, as well as by users not too familiar with the MapReduce programming model. As PCWorld reported earlier this week, version 0.7.0 of Hive includes new enterprise-friendly features “such as indexing, concurrency and advances in authentication management.” Indexing makes it easier to look up specific data, while concurrency helps to ensure that data remains consistent even while it’s being accessed or acted upon by multiple users.
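For readers unfamiliar with the MapReduce model, here is a minimal sketch in plain Python of the map, shuffle and reduce steps that a single SQL-style aggregation stands in for. It is an illustration only: it does not use Hadoop or Hive, and the sample records, the function names and the `page_visits` table mentioned in the comment are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records of (page, visits); not from any real dataset.
records = [("home", 3), ("about", 1), ("home", 2), ("blog", 7), ("about", 4)]

def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for page, visits in rows:
        yield page, visits

def shuffle(pairs):
    """Group values by key, roughly what the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}

# In a SQL interface the same aggregation would be one statement, e.g.
#   SELECT page, SUM(visits) FROM page_visits GROUP BY page;
# (page_visits is a made-up table name used only for this illustration.)
totals = reduce_phase(shuffle(map_phase(records)))
print(totals)
```

The point of the comparison is that a SQL layer lets an analyst state the result they want, while the map and reduce steps describe how to compute it.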
Cloudera also expanded its partnership ranks, this time with intelligence-software provider Digital Reasoning. The companies are integrating Cloudera’s Distribution of Apache Hadoop (CDH) with Digital Reasoning’s Synthesys analytic software to create scalable, high-performance environments to store, process and analyze all types of intelligence data. The inclusion of HBase in the latest CDH release was a big driver for this integration, as it gives users an analytic database option instead of just the Hadoop Distributed File System (although HBase does run atop HDFS). Digital Reasoning also has an integration partnership with DataStax (formerly Riptano) around the Cassandra NoSQL database.
But the biggest Hadoop endorsement of all might have come from the Guardian, which named Apache Hadoop as its Innovator of the Year at the MediaGuardian awards. The awards honor technologies and projects that are having or will have a profound effect on the media business. Hadoop beat out both the iPad and WikiLeaks, and the judges described it and its data-processing versatility as “a Swiss army knife of the 21st century.” No doubt, Hadoop and its related technologies could have a big impact on how journalists and other media professionals approach using data to bolster their reporting.
My latest report on Hadoop (sub req’d) discusses its wide use across a variety of industries, and the advancements in the last week only underscore this point. As usage picks up and organizations get more comfortable with the technology, we’ll be seeing a lot more innovative use cases coming out, and I think we’ll be surprised at some of the novel ways Hadoop is being used.
Image courtesy of Flickr user psd.