Commercial Hadoop champion Cloudera is building a connector to enable movement of data between Netezza’s data warehousing appliance and Hadoop clusters built atop Cloudera’s Distribution for Hadoop (CDH). It’s the latest instance of an analytics vendor integrating Hadoop support (via a Cloudera partnership or otherwise) and further evidence that Hadoop has legs as a commercial technology for big data analysis.
Netezza already supported Hadoop within its TwinFin appliance, but this partnership goes beyond basic support and aims to make the data movement and transformation process easier.
For data warehousing vendors, the decision to add Hadoop support is all about customer choice. Hadoop clusters are ideal for storing large volumes of unstructured data, processing it and making it ready for analysis, whereas appliances like TwinFin are limited in scale and focus on analyzing standard data types. When the two are combined – especially via purpose-built connectors, as in this case – analyses can be carried out across all the data in the combined environment.
What’s a bit interesting about this partnership is that it’s with Netezza. I understand that Cloudera and analytics database vendor Greenplum were working together, but there’s no telling how EMC’s acquisition of Greenplum affected that work. Given the depth of this partnership – technology, sales and support – it’s possible Cloudera has all but settled on Netezza as its data warehousing sidekick for the time being.
Of course, it’s neither the first nor the last time we’ll see Cloudera – much less Hadoop in general – involved in some type of integration effort. As organizations of all types are bombarded by big data, vendors of business intelligence, database and data warehousing products all realize that Hadoop support is becoming a must-have, and it seems safe to say that Hadoop has finally made the journey from search engines to mainstream businesses.
Photo courtesy of Flickr user Elizabeth Ann Collette.