It was a big week for big data, with two key trends — free software and Oracle support — adding fuel to claims that data management and analysis will never be the same. With more and better free tools available, and with support for Oracle databases that aren’t likely going anywhere anytime soon, there can be no doubt that even laggards will be tempted to give big data tools a try to see what all the hype is about.
On the free-software front, most of the action was around Hadoop. I detailed the biggest news earlier in the week, which was Yahoo deciding to cease development of its own Hadoop distribution and once again focus its efforts on Apache Hadoop. The intended result is to create the most full-featured and stable core Hadoop releases possible, making it even easier for Cloudera to harden and tweak Apache’s version for the enterprise-friendly Cloudera Distribution for Apache Hadoop. Karmasphere was also in a giving mood, announcing free quick-start bundles for Amazon’s Elastic MapReduce Hadoop service. The bundles consist of 30-day evaluation licenses of Karmasphere software as well as $25 in Amazon Web Services credits, and Karmasphere hopes they will show data professionals that developing Hadoop applications doesn’t have to be so difficult.
Outside of Hadoop, in the database world, EMC released a free, open source version of its Greenplum Database, and DataStax (formerly Riptano) released a management console and dashboard for the popular NoSQL database Cassandra. These are two important developments, albeit for different reasons. For EMC, the free Greenplum download proves two things: (1) EMC gets that big data software is a different market from its traditional storage systems market; and (2) EMC is willing to give Greenplum some room to let its pre-acquisition strategy play out. EMC no doubt plans to make a lot of money from Greenplum (it already has launched a big-time data warehousing appliance, after all), but it has to get users on board and comfortable with the product first.
It’s a slightly different situation with DataStax, which needs to give prospective users a reason to pay the company instead of just using the already free and open-source Cassandra software. Its new OpsCenter management console has a free edition that might help attract customers interested in NoSQL but not willing to invest the time to learn the product inside and out. As Cassandra continues to improve under the watchful eye of Apache, and if OpsCenter really does make it that much easier to use and monitor, DataStax could end up converting a lot of users into paying customers.
On the other end of the spectrum, but arguably just as important, is the growing support for Oracle within new technologies. First, Amazon Web Services announced an Oracle Database 11g version of its Relational Database Service. Then Quest Software released new and improved tools for integrating Oracle database and Hadoop environments. As the saying goes, nobody ever got fired for buying Oracle, so even though nothing related to Oracle’s database will ever be free, Oracle support gives prudent database administrators the confidence to experiment with new technologies (cloud computing and Hadoop in this case), knowing it doesn’t mean giving up on a tried and true product.
When you throw informational events like O’Reilly’s Strata conference this week and our forthcoming Structure Big Data conference into the mix, it’s starting to look like the pieces are in place for Hadoop, NoSQL and other next-generation data tools to really catch on. When the problem is obvious, the barriers to entry for solutions are low, and there are smart people sharing strategies for maximizing those solutions, what is there to hold users back?