
Summary:

Google has added another new capability to its BigQuery analytics service. This one lets users derive correlation values between pairs of variables, something Google highlighted using sensor data from its recent I/O conference.

[Feature image: tempmic graph]

BigQuery, Google’s cloud service for fast queries over large volumes of structured data, now includes a function for determining the correlation between two variables. Upload some data, enter some SQL code and get back a Pearson correlation score (you know, the scale from -1.0 to 1.0 where -1.0 means a perfect negative correlation, 1.0 means a perfect positive correlation and zero means no linear correlation).
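For readers who want to see what sits behind that score, here is a minimal Python sketch of the textbook Pearson formula. It is not BigQuery's implementation, and the function name `pearson` and the sensor-style readings are made up purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, plus each series' sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical readings, invented for this example (not data from Google I/O).
temperature = [20.0, 21.0, 22.0, 23.0, 24.0]
humidity = [50.0, 48.0, 46.0, 44.0, 42.0]  # falls steadily as temperature rises
noise = [30.0, 32.0, 34.0, 36.0, 38.0]     # rises steadily with temperature

print(pearson(temperature, humidity))
print(pearson(temperature, noise))
```

Because the made-up humidity series falls in lockstep with temperature and the noise series rises in lockstep, the two printed scores should land at, or within rounding error of, the -1.0 and 1.0 ends of the scale. Real sensor data would fall somewhere in between.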

Google’s Felipe Hoffa showed off the new capability in a blog post on Thursday, using data collected from the sensors the company had placed throughout San Francisco’s Moscone Center during the Google I/O conference in May. Here’s an example of a query that looks for correlations without targeting specific values:

[Image: screenshot of the SQL query]

And here is the result in a table:

[Image: table of correlation results]

Of course, as the feature image highlights, these tables can easily be turned into graphs so the naked eye can compare selected values.

Google BigQuery isn’t for everyone or every class of data, obviously, but it has advanced at a pretty rapid pace since its initial release in May 2012. At the very least, it’s likely a lot faster, cheaper and easier than buying an analytic database system, and save for 1010data (which is a bit more business-focused), I’m not aware of many similar services.

Barak Regev, head of Google’s enterprise-focused cloud business for EMEA, will be speaking about the company’s litany of cloud services at our Structure: Europe event Sept. 18 and 19 in London.


  1. This is getting to be very, very interesting. I can foresee companies out there using this to create “correlated scorecards” of key metrics to help them determine the most important factors in improving performance. It would be kind of nice to see something like this versus the tomes of tangentially useful metrics being tracked by many organizations today.

    Thanks for the article.

  2. Thanks for the write up! We also did this video on how to predict the future: With a 70 million flights dataset, we look for the best predictors of tomorrow’s flight delays. http://www.youtube.com/watch?v=tqS4vZ2Rxlo
