Weekly Update

Big data is real, but only when you ask the right questions

Alistair Croll’s post on O’Reilly Radar this week may have been titled “There’s no such thing as big data,” but Croll believes the exact opposite. Rather than just collecting data, he argues, you have to ask the right questions of it. Coincidentally, DJ Patil joined Greylock Partners this week as its first data scientist in residence, working with portfolio companies on making better use of data. And Factual released a new API that simplifies the task of working with place data from multiple sources. Together, these developments illustrate a broader shift that is slowly under way: from simply collecting big data toward gathering and using data to make businesses better informed.

Croll’s post discusses the experience of a frequent flier in the premium cabins of United Airlines, and he asks why United did not quickly get in touch when this traveler stopped flying with it. Spotting and reacting to these changes in behavior should be exactly what decent data analysis would permit, but for some reason United didn’t do either of these things. It appears to be missing an opportunity to gain competitive advantage by understanding the behavior of its customers and responding accordingly.
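The kind of check Croll has in mind can be sketched in a few lines. This is only an illustration, not a description of anything United runs: the function name, the thresholds (min_flights, gap_factor) and the sample flight history are all hypothetical. It flags customers who used to fly regularly and whose current silence is much longer than their usual gap between flights.

```python
from datetime import date


def lapsed_customers(flights, today, min_flights=6, gap_factor=3.0):
    """Return IDs of previously frequent customers whose current silence is
    much longer than their typical gap between flights.

    flights: dict mapping customer ID -> list of flight dates (hypothetical shape).
    """
    flagged = []
    for customer_id, dates in flights.items():
        dates = sorted(dates)
        if len(dates) < min_flights:
            continue  # too little history to call this a frequent flier
        # Average number of days between consecutive flights
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        typical_gap = sum(gaps) / len(gaps)
        # Days since the most recent flight
        silence = (today - dates[-1]).days
        if silence > gap_factor * max(typical_gap, 1):
            flagged.append(customer_id)
    return flagged


# Made-up flight history for illustration only
history = {
    "frequent-flier": [date(2011, m, 1) for m in range(1, 7)],
    "still-active": [date(2011, m, 15) for m in range(4, 10)],
    "occasional": [date(2011, 2, 1)],
}

print(lapsed_customers(history, today=date(2011, 10, 1)))  # ['frequent-flier']
```

A real system would presumably weigh fare class, seasonality and route changes as well, but even a rule this simple would surface the traveler in Croll's example for a follow-up call.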

Patil could certainly help United out here, and venture capital firm Greylock Partners seems keen to ensure that its portfolio companies don’t miss such obvious uses of the data they hoard. Discussing Patil’s appointment, Greylock partner Reid Hoffman wrote, “Our companies have strong appetites to learn more ways to leverage data as a competitive tool.” Greylock is not alone in spotting the need to understand and gain value from the data being generated by the companies it funds. IA Ventures took a similar step earlier in the summer, appointing Drew Conway as scientist in residence.

Greylock is acting to ensure that its portfolio companies extract as much value as they can from the data they hold. Hoffman and others recognize that timely and effective analysis of data can offer real competitive advantage, even in mature markets like retail. Within Greylock’s portfolio, Zipcar might analyze data to ensure that its cars are parked in optimal locations, and Cloudera could tailor the products and advice it offers to make it even easier for its customers to work with data at scale. Over at Factual, the company is continuing to put the pieces in place to combine data from different sources. With the new Crosswalk API, developers can easily relate U.S. places identified in third-party services such as Foursquare, OpenTable and Yelp.
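To make the crosswalk idea concrete, here is a minimal in-memory sketch. It does not call or reproduce Factual's actual API; the row layout, the function names and every ID below are assumptions made up for illustration. The point is simply that each place gets one canonical ID, and each service's own identifier maps to it.

```python
# Hypothetical crosswalk rows: (canonical place ID, service name, service-specific ID).
# Every ID here is invented for illustration.
CROSSWALK_ROWS = [
    ("place-001", "foursquare", "fsq-123"),
    ("place-001", "yelp", "coffee-shop-main-st"),
    ("place-001", "opentable", "ot-987"),
    ("place-002", "foursquare", "fsq-456"),
    ("place-002", "yelp", "corner-bistro"),
]


def build_index(rows):
    """Index rows both ways: (service, service ID) -> place, and place -> {service: ID}."""
    by_service_id = {}
    by_place = {}
    for place_id, service, service_id in rows:
        by_service_id[(service, service_id)] = place_id
        by_place.setdefault(place_id, {})[service] = service_id
    return by_service_id, by_place


def translate(rows, from_service, from_id, to_service):
    """Return the same place's ID in another service, or None if no mapping is known."""
    by_service_id, by_place = build_index(rows)
    place_id = by_service_id.get((from_service, from_id))
    if place_id is None:
        return None
    return by_place[place_id].get(to_service)


print(translate(CROSSWALK_ROWS, "foursquare", "fsq-123", "yelp"))       # coffee-shop-main-st
print(translate(CROSSWALK_ROWS, "foursquare", "fsq-456", "opentable"))  # None
```

The second lookup returns None because the made-up table holds no OpenTable entry for that place, which is the incomplete-silo problem a shared crosswalk is meant to shrink.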

Factual and other similar data providers have the potential to greatly diminish one of the final barriers to entry for many innovative startups: their need to build large databases and populate them with relevant data that is not the focus of their business. Foursquare, for example, is about users sharing their check-ins at locations like Starbucks, not determining where every Starbucks is located. Building and maintaining that database, though, is a necessary (and high) cost of doing business; Factual removes the need for every new Foursquare to build that database from scratch. There are similar opportunities in areas from product data (how many companies store undifferentiating data about the features of the latest televisions or microwaves?) to company details and historical sales figures.

Companies like United Airlines (and, as I discussed recently, those in the manufacturing sector) are sitting on a gold mine. The timely analysis of data they already own aids customer retention, enables fine-tuning of routes and pricing, and provides insights that might be the difference between profitable growth and going to the wall. Factual is an early example of a company seeking to make data created by other people more usable. Consistent APIs and crosswalks that join silos of incomplete data are critical in preventing the endless re-collection of the same basic facts, freeing both new and old companies to concentrate on creating value rather than rebuilding foundations. DJ Patil, and those like him, bridge the gap between these two groups, transforming the way traditional companies like United use data and creating opportunities for newcomers like Factual to capitalize upon. Web 3.0, perhaps?

Question of the week

Will data scientists change the ways their employers work with and use big data?