Earlier this week, GigaOM’s Sebastian Rupley detailed Cloudera’s plans to move beyond open-source software and services into the world of proprietary software. It might seem a bit early for the startup dubbed by some as “Red Hat for Hadoop” to abandon its strictly open-source approach, but a look at the Hadoop market shows now might be the ideal time for Cloudera to establish itself with an enterprise-ready offering.
All things NoSQL are hot, and businesses across industry boundaries are searching for ways to leverage such technologies without investing the resources to master any open-source implementations. Take Hashrocket, for example. The web development firm helped a pharmaceutical client incorporate MongoDB into an existing SQL-based application, resulting in drastic improvements in query performance. Next month, MongoDB sponsor 10gen will host the NoSQL Live event to shed light on these types of real-world use cases.
Cloudera itself is seeing similar demand for Hadoop. As CEO Mike Olson told Rupley, “We see people interested in it for crunching genomics data, retailers and financial institutions interested in it for processing large sets of transactions, and interest from the health-care and energy industries.” And Cloudera isn’t the only one working to move Hadoop out of the web-search world. A group of Ivy League researchers is working on a project called HadoopDB that aims to meld Hadoop with traditional database management capabilities. This follows in the footsteps of other projects aiming to bring together Hadoop and SQL, including Yahoo’s Pig programming language. IBM is pushing Hadoop via its M2 analytics appliance.
Of course, there always will be a place for Hadoop in the web, too (even if some far-forward-thinking companies are looking beyond it in their attempts to take web personalization to the next level). For the time being, Hadoop and similar parallel-processing tools are about the best things going for managing and analyzing the mountains of data users are generating. From e-commerce applications to plain old site-data analysis, there are plenty of reasons for businesses to consider a robust Hadoop solution.
Oh, and anyone assessing the wisdom of Cloudera’s plans to go proprietary would be remiss to ignore Red Hat itself. The company that all but invented the open-source-plus-services model is working on a slew of cloud-computing-focused products, including something called Cloud Filesystem. Considering that Hadoop is, at its core, built on a distributed file system rather than a database, it might be in Cloudera’s best interests not to compete directly against Red Hat in the business model the latter has mastered.
Like the companies from which its management team emerged, Cloudera understands the concept of supporting open source without giving away the farm. If it thinks proprietary products are a good idea at this point, I’m going to defer judgment. Besides, there’s no use trying to be the Red Hat of Hadoop if Red Hat is trying to keep that title for itself.