Google is funding “an artificial intelligence for data science”


Google is funding a project called Automatic Statistician that bills itself as “an artificial intelligence for data science,” the company announced Tuesday. The project, which comes out of the University of Cambridge and is still in its early stages, aims to automate the selection, building and explanation of machine learning models.

In a nutshell, Automatic Statistician examines a dataset and determines which type of model is best suited to analyzing it, as well as which features, or variables, are the strongest. After the model runs, Automatic Statistician returns a text report explaining its findings in plain English, or as close as you can get when dealing with statistics.
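To make that loop concrete, here is a minimal sketch of the general idea: fit a few candidate models, score them, and describe the winner in a sentence. It is not Automatic Statistician's actual method, which the project has not detailed here. The two candidate models, the use of BIC as the score, and every function name are assumptions chosen for illustration, and the sketch handles a single predictor, so it skips feature selection entirely.

```python
import math

def fit_constant(xs, ys):
    """Fit y = c. Returns (predict function, parameter count, description)."""
    mean = sum(ys) / len(ys)
    return (lambda x: mean), 1, f"a constant level of about {mean:.2f}"

def fit_linear(xs, ys):
    """Fit y = a + b*x by ordinary least squares (assumes xs are not all equal)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    direction = "rises" if slope > 0 else "falls"
    desc = f"a linear trend that {direction} by about {abs(slope):.2f} per unit of x"
    return (lambda x: intercept + slope * x), 2, desc

def bic(predict, k, xs, ys):
    """Bayesian information criterion under Gaussian errors; lower is better."""
    n = len(xs)
    rss = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    rss = max(rss, 1e-12)  # guard against log(0) on a perfect fit
    return n * math.log(rss / n) + k * math.log(n)

def analyze(xs, ys):
    """Score each hypothetical candidate model and return (best name, text report)."""
    candidates = {"constant": fit_constant, "linear": fit_linear}
    scores, descriptions = {}, {}
    for name, fit in candidates.items():
        predict, k, desc = fit(xs, ys)
        scores[name] = bic(predict, k, xs, ys)
        descriptions[name] = desc
    best = min(scores, key=scores.get)
    report = (
        f"Of {len(candidates)} candidate models, the {best} model fit best "
        f"(lowest BIC). The data are best described by {descriptions[best]}."
    )
    return best, report

if __name__ == "__main__":
    xs = list(range(10))
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.1, 17.0, 18.9]
    print(analyze(xs, ys)[1])
```

BIC penalizes each extra parameter, so the linear model only wins when its better fit outweighs the added complexity. A real system would search a far larger space of models and report much more than one sentence.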

A snippet of an Automatic Statistician report on unemployment data.


The project’s homepage quotes Google research scientist Kevin Murphy, who also wrote the blog post announcing Google’s funding for it, on the promise of Automatic Statistician.

However, Automatic Statistician isn’t the first attempt to deliver this type of service; several commercial products have tried to do similar things. The closest comparison might be to a now-defunct tool from machine learning startup Skytree called Skytree Adviser, which also automatically selected models and generated text reports of its findings. Startups including BeyondCore, Nutonian and even Ayasdi all promise varying degrees of this functionality as well.

As sexy as it is to talk about automating the data scientist job, though, it’s a bit early to suggest any software will eliminate the need for such employees any time soon. Even if projects like Automatic Statistician or commercial tools make it possible for relative laypersons to run machine learning models and uncover patterns, that’s just a step or two down what’s often a much longer path of turning insights into real value or, possibly, products.

7 Comments

Stats

It’s so hard to become a statistician in my country; all my efforts have come to nothing…
Please, Google, don’t erase big data scientists’ jobs!

Leland Wilkinson

I am the author of Skytree Adviser. I wrote the program prior to joining Skytree. It consists of about 75,000 lines of Java and 25,000 lines of XML. Adviser does not optimize a goodness-of-fit criterion or use Bayesian methods to do model selection. It is not a machine learning program. Instead, it uses the same heuristics statisticians use (residual diagnostics, etc.) and will occasionally fit models that have higher prediction error than a blind optimizer because the assumptions are met better by the selected model. For example, asking Adviser to predict a variable from a set of predictors may result in OLS, nonlinear regression, (multinomial) logistic regression, Poisson regression, zero-inflated Poisson, negative binomial regression, etc., depending on inferences Adviser makes concerning the data. Adviser uses robust statistical methods (biweights, least-median-of-squares, etc.) to test assumptions and will generate cautionary text when it thinks the assumptions are not met.

I joined Skytree last year and we planned to expand Adviser to the machine learning space using Skytree technology. I transferred Adviser to Skytree in exchange for the opportunity to launch it through their website. Adviser generated several important new bookings and considerable excitement.

Because Adviser was not a machine learning program and was not designed to handle Big Data, Skytree decided to remove Adviser from its website and discontinued its availability. That is why this article uses the word “defunct” to describe Adviser. It is not defunct, it is not beta; it is a production desktop application.

Leland Wilkinson
http://www.cs.uic.edu/~wilkinson

Steve Ardire

Use startup automated AI tools rather than Google for obvious reasons

Dan Hughes

Or typical Google smoke and mirrors distraction. Leading to nothing.

Alex

Not so much distraction, as measured risk-taking in investing in new technologies. It would be rather silly to expect world-shattering effects from every company or product that receives large investments from noteworthy investors. Why is this any different? What makes Google so special?
