Data scientists and business analysts have found plenty of ways to generate new hunches about what their employers should do by managing and processing data inside Hadoop, not to mention making a little money for themselves. This week the big-data news machine is running at full bore as Hadoopers come together for the annual Hadoop Summit in San Jose.
New funding for Hortonworks on Tuesday was just a taste. Here are a few of the announcements suggesting that companies are willing to throw money at hardware and software vendors to bolster their use of Hadoop as they ramp up big-data activities:
Startup DataTorrent has taken on $8 million in Series A funding.
August Capital led the round, which brings the total venture funding DataTorrent has taken on so far to $8.75 million. The company is seeking to capitalize on the hunger for more real-time Hadoop capability. Based in Santa Clara, Calif., DataTorrent is headed by Phu Hoang, a former executive vice president at Yahoo (s yhoo) who managed search and search monetization at Yahoo and oversaw the expansion of the Hadoop group, and Amol Kekre, who worked on a streaming platform at Yahoo Finance and contributed to the development of Hadoop 2.0.
In addition to the funding news, DataTorrent is announcing the availability of developer and evaluation versions of its streaming platform built on Hadoop. Rather than offer batch processing that Hadoop already makes possible, DataTorrent aims at real-time analysis and alerts via text, email and other methods, so companies can take action right away.
What good is that? Hoang, the company’s CEO, made it real with an example about data coming in on the performance of an online advertisement: “You are able to see live data on your margin, cost, revenue and click-through rate now in real time,” he said. “You don’t have to wait till eight hours later to find out you spent a bunch of money buying impressions that don’t work, and (you can) not buy those impressions that have low click-through rate or low margin.”
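To make Hoang's example concrete, here is a minimal Python sketch of the kind of running calculation he describes: click-through rate and margin updated as each impression event arrives, with a check that flags an ad placement as soon as it underperforms. This is not DataTorrent's API. The `AdStats` class, the `should_stop_buying` function and every threshold below are hypothetical names and values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Running totals for one ad placement (hypothetical structure)."""
    impressions: int = 0
    clicks: int = 0
    cost: float = 0.0
    revenue: float = 0.0

    def update(self, clicked: bool, cost: float, revenue: float) -> None:
        # Fold a single impression event into the running totals.
        self.impressions += 1
        self.clicks += int(clicked)
        self.cost += cost
        self.revenue += revenue

    @property
    def ctr(self) -> float:
        # Click-through rate: clicks divided by impressions served.
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def margin(self) -> float:
        # Margin as a fraction of revenue; 0.0 when there is no revenue yet.
        return (self.revenue - self.cost) / self.revenue if self.revenue else 0.0


def should_stop_buying(stats: AdStats,
                       min_impressions: int = 1000,
                       min_ctr: float = 0.001,
                       min_margin: float = 0.0) -> bool:
    """Flag a placement once enough data shows low CTR or low margin.

    All thresholds are illustrative assumptions, not values from the article.
    """
    if stats.impressions < min_impressions:
        return False  # too little data to judge yet
    return stats.ctr < min_ctr or stats.margin < min_margin
```

Calling `update` on every incoming event and checking `should_stop_buying` right afterward is what separates this from a batch job: the decision is available as soon as the evidence is, rather than eight hours later.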
DataTorrent makes it easier to pull in data from multiple inputs with a library of more than 250 operators that the company is making available through an open-source Apache Software Foundation license.
A global telco, a global financial company and a major webscale company are using DataTorrent, Hoang said. With the new funding, Hoang is aiming to persuade many more companies of the value of analyzing and taking action on real-time data. To do that, DataTorrent will have to prove more valuable than Storm and event-processing tools like StreamBase.
Splunk is letting customers do more analytics on data located in Hadoop clusters.
Splunk (s splk) first announced Hadoop integration in 2011 and came out with Hadoop Connect in October. Since then it has listened to what customers want to do and has come up with a new standalone product called Hunk, which lets customers get the most out of data already sitting in Hadoop, said Sanjay Mehta, vice president of product marketing at Splunk.
With Hunk, customers can query data already in Hadoop, get previews, adjust queries on the fly, create visualizations, generate reports and share findings with colleagues through Splunk. After a user runs a query, the results hang around so other users can get access more quickly, without having to wait for a job to finish. Hunk is in private beta, and so far companies have been using it to analyze click streams and spot potential security threats.
Teradata is announcing a bunch of new Hadoop products.
Partly thanks to a deeper relationship with Hortonworks, Teradata (s tdc) is offering the Hortonworks Data Platform software on Dell (s dell) commodity servers with integration into Teradata, or as standalone Hortonworks software with Teradata support. The company also is rolling out new big-data consulting services.
The Hadoop Summit is just getting under way, and Hadoop hype still abounds. We’ll see what news comes next.