An MLB team is apparently doing in-game graph analysis


According to a post in the Economist’s Science and Technology blog, at least one Major League Baseball team has purchased a Cray Urika graph-processing appliance, apparently with the goal of analyzing game data in real time to help the manager make decisions. This would be a pretty big step in the world of sports analytics, although not an entirely surprising one.

As Gigaom has covered a couple of times over the past two months, as recently as last week, the world of sports is undergoing a transformation because of all the data now being captured about each game and each player. The now-cliched Moneyball use case of analyzing players’ statistics to build better rosters has morphed into everything from analyzing data about downs and distances in football to tracking, literally, every movement of every player throughout entire basketball games.

Until now, however, we haven’t heard a whole lot about that data actually affecting in-game decisions — it’s still early days for really big data analysis in sports, and risk-averse managers and coaches don’t want to be the ones betting the future on what a machine thinks.

Still, if something like this is going to happen, it only makes sense that it would happen in baseball. Baseball has always been a stats-heavy game, and you can’t listen to a game broadcast without hearing announcers, who are decidedly not statisticians, cite someone’s batting average against left-handers or E.R.A. with runners in scoring position to justify a player substitution. In baseball, perhaps because of its slow pace and largely individual nature (i.e., the only two players who really matter at any given time are the pitcher and batter), teams pay a lot of attention to situations.

Graph analysis, a technique closely tied to machine learning that powers things like ranking algorithms at Google and Facebook, online recommendations and even artificial intelligence systems for medical diagnosis, takes these simple situational calculations to a much larger scale. Decisions that used to take into account just a couple of factors could now take into account thousands, even those such as weather or noise levels that don’t appear in the box score. Because baseball statisticians now track so much more data, it’s conceivable that a team doing in-game graph processing could bring data to bear on facets of the game that haven’t previously been tied to it.
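For a rough sense of what graph-based ranking looks like in practice, here is a minimal Python sketch of a PageRank-style calculation, the kind of algorithm associated with Google's link ranking. Nothing here reflects what any team or the Cray appliance actually runs: the matchup data, the player names and the `pagerank` helper are all hypothetical, and a real system would work with vastly more nodes and factors.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """PageRank by power iteration over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank sitting on nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical data: an edge (pitcher, batter) means the batter got a hit
# off that pitcher, so "credit" flows from pitcher to batter.
matchups = [
    ("pitcher_a", "batter_x"),
    ("pitcher_b", "batter_x"),
    ("pitcher_c", "batter_x"),
    ("pitcher_a", "batter_y"),
]

ranks = pagerank(matchups)
for player, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{player}: {score:.3f}")
```

In this toy graph the batter with hits off the most pitchers ends up with the highest score. The same idea extends to weighted edges and many more kinds of nodes, which is where specialized graph hardware would presumably earn its keep.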

We’ll cover a couple of interesting intersections of sports and data at our Structure Data conference next week in New York, with talks from McLaren Applied Technologies’ Geoff McGrath and Krossover’s Vasu Kulkarni. Attendees will also get a sense of how what’s happening in sports is actually happening everywhere: It’s easier than ever to measure anything we want, and it’s easier than ever to analyze that data, which means no field of human endeavor is safe from the effects of big data.

Feature image courtesy of Shutterstock user Bill Florence.
