In the afterglow of GigaOM’s Structure:Data conference this week, a few big-picture trends and surprising quotes stuck with us. (Check out our live coverage here.)
Data needs people, my friend
Despite the much-discussed power of data, there are roles for people to play in big data projects. Data increasingly influences companies’ decision-making processes, but several speakers hit on the notion that people should be involved in big data storage and analysis.
It all starts with a human question. Before machines generate answers, employees from many departments should feel empowered to ask good questions of data, said John Sotham, vice president of finance at BuildDirect.
Beyond questions, humans need to decide which algorithms to employ and which data to use to answer questions, said Scott Brave, founder and chief technology officer of Baynote.
In data science, machines use algorithms to make decisions with clean data for the sake of prediction and optimization, said Sean Gourley, chief technology officer of Quid. But in “data intelligence,” humans “create, change and shape the world we’re in” using small sets of messy data, he explained.
Sometimes algorithms can’t match the results people produce. One website crowdsources the identification of top news, as my colleague Kevin Tofel wrote. And at times, it’s wise to throw lots of people at big data challenges. With TopCoder, there are competitions to discover the best software architecture, algorithms and analytics, said the company’s chief technology officer, Mike Lydon.
There was an exception to the man-and-machine rule. BeyondCore’s software makes machines crunch all available variables to isolate the biggest profit generators. It displays charts and audibly tells you its findings.
It takes leadership
Becoming a data-driven company requires a human push, said Paul Maritz, chief strategist at EMC. “Change requires leadership,” he said. “It requires people to understand what is happening and really get behind it and drive organizations to transform, because none of us really like to change.” Only then can companies discover better ways to make money.
Meanwhile, Amaya Souarez, director of data center services at Microsoft, said that lots of internal data doesn’t automatically effect changes in strategy. “The data will help you in your discussions, but it’s not everything,” she said. “It really does take a lot of personal interaction and commitment to that relationship.”
We want analytics and we want it now
Whether in Hadoop or in specialized databases, our speakers showed why they want to see big data analytics happen in real time.
Muddu Sudhakar, vice president and general manager of the Pivotal Initiative’s Cetas cloud and big data analytics platform, called for “Hadoop high throughput, low latency.” And SQLstream CEO Damian Black said that 2013 “seems to be the year where it’s all happening now. All Hadoop distributions are talking about streaming technology.”
Ashok Srivastava, chief data scientist at Verizon, talked about what machines could do if they processed data in real time: go through millions of new pictures users take on their cell phones and predict the health of a person or a machine based on changes over time. Similarly, Maritz identified an opportunity telecommunications companies have yet to take advantage of: texting customers to apologize for a dropped call. “They can’t even do that today, let alone do more ambitious things on top of that,” Maritz said.
Big data words to the wise
Executives, IT administrators and others will likely discuss these themes in the coming months. A few statements from speakers also stand out:
“…What’s really most intriguing is that you can be 100 percent guaranteed to be identified by simply your gait — how you walk.” — Ira “Gus” Hunt, chief technology officer of the CIA, in a statement on the capabilities of a three-axis accelerometer
“Hadoop is hard — let’s make no bones about it. It’s damn hard to use. It’s low-level infrastructure software, and most people out there are not used to using low-level infrastructure software.” — Todd Papaioannou, founder and CEO of Continuuity, in a statement on his lessons from Yahoo, where he was chief cloud architect
“I get asked all the time to explain, How is Riak better than Hadoop?” — Justin Sheehy, chief technology officer of Basho Technologies, in a statement about how hype surrounding Hadoop and big data gets in the way of real discussion about solving data problems
“What if you could send your sperm over email to somebody else and print the sperm on the other end?” — Naveen Jain, founder and CEO of Inome, in a statement about disruptions in big data from other industries