Google BigQuery now correlates your data

BigQuery, Google’s cloud service for fast queries over large volumes of structured data, now includes a function for determining the correlation between two variables. Upload some data, enter some SQL code, get back a Pearson correlation score (you know, the scale from -1.0 to 1.0 where -1.0 means a perfect negative correlation, 1.0 means a perfect positive correlation and zero means no linear correlation).
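
For anyone who wants to see what that score actually measures, here is a minimal Python sketch of the standard Pearson formula: the covariance of the two variables divided by the product of their standard deviations. It is not BigQuery's implementation, and the function name `pearson` and the sample readings are made up purely for illustration.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (covariance numerator) and
    # sums of squared deviations (variance numerators).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    # The 1/n factors cancel, so the raw sums can be used directly.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sensor readings, invented for this example.
temperature = [20.0, 21.0, 22.0, 23.0, 24.0]
humidity = [50.0, 48.0, 46.0, 44.0, 42.0]

score = pearson(temperature, humidity)
print(round(score, 3))
```

In this made-up data set humidity falls by a fixed amount every time temperature rises by one degree, so the score lands at the negative end of the scale. Real sensor data will almost never be that tidy.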

Google’s Felipe Hoffa showed off the new capability in a blog post on Thursday, using data collected from the sensors the company had placed throughout San Francisco’s Moscone Center during the Google I/O conference in May. Here’s an example of a query trying to find correlations without much specific targeting of values:


And here is the result in a table:


Of course, as the feature image highlights, these tables can easily be turned into graphs so the naked eye can compare selected values.

Google BigQuery isn’t for everyone or every class of data, obviously, but it has advanced at a pretty rapid pace since its initial release in May 2012. At the very least, it’s likely a lot faster, cheaper and easier than buying an analytic database system, and save for 1010data (which is a bit more business-focused), I’m not aware of many similar services.

Barak Regev, head of Google’s enterprise-focused cloud business for EMEA, will be speaking about the company’s litany of cloud services at our Structure: Europe event Sept. 18 and 19 in London.

2 Responses to “Google BigQuery now correlates your data”

  1. This is getting to be very, very interesting. I can foresee companies out there using this to create “correlated scorecards” of key metrics to help them determine the most important factors in improving performance. It would be kind of nice to see something like this versus the tomes of tangentially useful metrics being tracked by many organizations today.

    Thanks for the article.