Being able to process and understand big data has enabled new business models to emerge. But for the next generation of applications to flourish, that data will need to be accessible even to folks who aren’t necessarily data scientists.
“Everyone has grown up with Google, but Groupon and companies like it are building businesses using big data with ease,” Luke Lonergan, founder and CTO of Greenplum, said at GigaOM’s Structure Big Data Conference in New York City today. That shift has occurred as the fundamental infrastructure has changed from a “scale-up” to a “scale-out” model for collecting data, Lonergan said. It has also been driven by the realization that the amount of data that’s actually valuable has increased.
Getting to data more quickly and making it available beyond the data science realm will help spur the next generation of big data applications. Lonergan gave the example of one of his employees, who was able to index all of Wikipedia in less than 30 seconds. “When you create [big data applications] quickly enough, you can create a real change in connectivity,” he said.
But big data shouldn’t just be for eggheads. “There is nothing wrong with eggheads, they’re great… but we need to democratize access to data products,” Lonergan said. By doing so, even folks who aren’t data scientists will be able to create powerful applications.
At the same time, there’s a need to balance the desire to build real-time applications with an understanding of the underlying data, to make sure that the data remains intact when it is analyzed. “What’s happening now is that insights happen earlier from unstructured sources,” Lonergan said. “But if you try to structure it before you get all the data, you lose some insight.”