Starting with the well-known quote — “A good way to predict the future is to invent it” — Ravi Murthy, engineering manager at Facebook, kicked off an interesting panel discussion at GigaOM Structure:Data 2013 Thursday with four industry experts on business intelligence (BI) and Hadoop. Hadoop has a big place in that future, but not by itself. The conclusion? Applications and SQL databases built atop Hadoop are needed for better BI, noted the panel.
“Why are so many systems being built in the BI landscape? If Hadoop can deliver the promise, why have all these other solutions?” asked Murthy.
Ashish Thusoo, co-founder and CEO at Qubole, said that putting SQL on top of Hadoop just makes sense. “As a system, Hadoop is not a low-latency system, opening the need for faster SQL-based systems to query the data. And there’s probably only space for [a] half-dozen of these solutions in the market; not dozens.”
Agreeing with Thusoo was Tomer Shiran, director of product management at MapR Technologies. “With our open source Apache Drill we’re enabling lots of differing BI use cases allowing companies to do different things with Hadoop. One use case is ability to interactively query and explore data.” Apache Drill is an interactive, low-latency SQL way to get at the data reservoir in Hadoop. Ben Werther, founder and CEO of Platfora, completely agreed, saying that customers are looking for much more agile approaches to data exploration without creating more IT work.
But Hadoop is still an important underlying part of the puzzle. Justin Borgman, CEO of Hadapt, noted that “Hadoop scales so cost effectively; it’s a landfill where you can dump everything. That opens up new opportunities to explore that data including indexing to boost performance and interactivity across a broader data set.”
When asked for a use case that shows the benefits, Werther pointed to an unnamed customer. “They had 50 analysts working against SQL stores in a very siloed fashion. We moved them to a Hadoop-based stack and built a data reservoir. Only 5 of the 50 were able to be productive before. Within a week, all 50 became productive.”
Of course, the cloud is also part of BI’s future, although it’s not without risks. Sure, running Hadoop in the cloud is very elastic, so you can use as many resources as you need in near real-time. But the issues of security and data gravity in particular are worth noting: Data generated in the cloud could be tough to move out in the future, and may require more of the apps built on that data to run in the cloud as well.