There has been some talk lately that big data is a bubble — that it has been overhyped, oversold and over-invested. I don’t think that’s the case, but don’t take my word for it. Come to Structure: Europe in London next month (Sept. 18 and 19) and ask some guys who know data stone-cold what they think.
We have two panels that feature, quite frankly, some of the smartest data thinkers and technologists that I’ve met in the past year. More importantly, they’re guys who have built some flat-out impressive systems, done some amazing analysis and built really effective predictive models at major businesses and government agencies. That’s critical: these are people who understand not just what’s possible with big data technologies, but also how to actually get them working to improve the bottom line.
Who’ll be there?
Raymie Stata, the former Yahoo CTO responsible for bringing Hadoop into the company and nurturing its expansion within the business. Stata has a long history in the search-engine world, and is now founder and CEO of a Hadoop startup called Altiscale.
Adam Fuchs, the co-founder and CTO of Sqrrl, which sells a commercial version of the Hadoop-based Accumulo database. Accumulo is now an open source project, but Fuchs helped build it while working for the National Security Agency, where it’s the primary datastore and analysis engine for the god-knows-how-much data the agency is collecting.
Bhaskar Ghosh, LinkedIn’s senior director of data engineering, who is overseeing some of the biggest infrastructural changes in the company’s history. LinkedIn is transitioning from essentially an online résumé repository into a major social network and media entity, and the databases and infrastructural tools Ghosh’s team is building are helping to enable that evolution.
John Foreman, the chief data scientist at MailChimp, who has almost singlehandedly taught the email-management service how to make the best possible use of all the data it’s collecting. Internally, MailChimp has used machine learning to automate processes such as detecting spammers among its clientele. Externally, it’s now providing users deeper insights into their subscribers’ interests and behavior, and even giving them tools to do their own analysis.
Sam Hamilton, vice president of data technology at PayPal, a company that knows a thing or two about using big data. Hamilton is responsible for the systems that run PayPal’s data science efforts, which are centered around finding out who the people paying with PayPal are and what they’re interested in. PayPal and parent company eBay (Hamilton was previously CTO at eBay site Shopping.com) run expansive big data environments.
Ron Bodkin, founder and CEO of big data consultancy Think Big Analytics, whose client base includes companies such as NASDAQ, Johnson & Johnson and NetApp. Think Big’s customers often deploy more down-to-earth and traditional big data environments, if there is such a thing, but Bodkin has also served as vice president of engineering at Quantcast, which is one of the most demanding Hadoop users around.
And those are just the panels. I’ll also be doing a chat with Microsoft’s cloud CTO Dave Campbell (who was a guest on our Structure Show podcast recently), and we’ll have speakers from CERN and the European Space Agency talking about building systems that can handle the incredible volumes of data they’re generating. Kleiner Perkins Caufield & Byers Partner Michael Abbott (former vice president of engineering at Twitter) and North Bridge Venture Partners Partner Jonathan Heiliger (former vice president of engineering at Facebook) will be speaking about the types of new applications we’ll see thanks to the parallel convergence of big data and cloud computing technologies.
My hope is that Structure: Europe attendees won’t just hear that big data is great. That’s bubble talk. My hope is they’ll hear what’s possible and, better yet, how to actually do it.