Hmmm…Software That Predicts If You Will Do Crime & Time


The Florida State Department of Juvenile Justice says it will use predictive analytics software from IBM to foretell which of its juvenile offenders are likely to return to crime. The software, made by the SPSS division that Big Blue purchased last year, will replace Excel spreadsheets analyzed by employees. The software can look at far more data inputs and potentially handle more juvenile offenders faster than the older methods, and presumably the ability to incorporate more data points could lead to better results. Those deemed likely to re-offend are given specialized treatment.

The UK Ministry of Justice also uses IBM’s predictive software on its criminal population, to see which ones pose a greater threat to public safety upon release. IBM clearly plans to take SPSS beyond its former domain of market researchers and scientists and apply it to where the big money is — homeland security in these frightening times.

Deepak Advani, vice president of predictive analytics at IBM, said, “Predictive analytics gives government organizations worldwide a highly-sophisticated and intelligent source to create safer communities by identifying, predicting, responding to and preventing criminal activities. It gives the criminal justice system the ability to draw upon the wealth of data available to detect patterns, make reliable projections and then take the appropriate action in real time to combat crime and protect citizens.”

Is anyone else getting “Minority Report” flashbacks? I’m a little concerned as we evaluate our laws protecting citizen and corporate electronic communications (GigaOM Pro sub req’d), that we now have the tools to establish a reliable and cheap surveillance society. With the scale and flexibility of cloud computing, better data management flows and the infrastructure to run many of these queries, governments and private companies are going to have the resources to predict not only market trends and supply chain needs, but also behavior. IBM actually plans to marry its SPSS software to a scaled-out architecture to offer a data-analytics cloud.

Combine good software and the cloud, and the scanning of older data for predictive analysis could soon start incorporating real-time data. Given that someone has already been arrested after making comments on his Twitter feed and the police regularly scour Facebook pages looking for suspects and threats, it’s not so far-fetched.

Image courtesy of Flickr user AlanCleaver_2000


May I buy a copy of this software? I’m trying to date men again, and want to reduce the probability of having to defend against a serious domestic violence type attack. While I’m more than strong enough to absorb what’s done to me, having an intimate partner hate me that intensely is really disappointing, and waiting for an attack to come but not knowing if it will gets really old really fast. On the whole my preference is to not be involved with the kind of man who wants to use me as a punching bag. So given that I have multiple men interested in dating me, it sounds like a good idea to run them all through such a piece of software and pick the ones who are least likely to spike my drink or grab me by the hair when I’m not expecting it. I rather dislike being attacked and I’m not a huge fan of that in general, but interacting with men who don’t hurt me on purpose is very nice. The problem is that it’s impossible to tell which is which just by looking at them or talking to them.


j4e6s9, thanks for the thoughtful comments, but I have to disagree. First, no one is advocating pre-emptive arrests or forced treatment of children deemed “likely to re-offend.” And no one is or should be claiming that this type of data crunching is anything but theoretical. Hence terms such as “likely to.” What this should do is examine risk factors in a community, try to pinpoint which risk factors children who are already involved in the criminal justice system have, and then recommend what kind of services, help, or treatment these children, and possibly their communities, should have available to them. Its most likely effect will be to decrease the discrimination that minorities suffer based on the false assumption that if you are a minority (or male, or poor, etc.) you are much more likely to commit crimes. In our community we did a “spreadsheet” type analysis that showed that race and poverty level meant nothing compared to risk factors such as having a parent who was incarcerated or poor school attendance. This helped us to break down problems into smaller elements that can be more easily addressed. For instance, we increased efforts to encourage school attendance and dropped the number of chronic truants by 18% in one year. Some of these measures were very broad, but we also targeted individual truants and got even better results with them. And by targeting – I mean we helped them and their families and they now attend school more regularly.


“I’m a little concerned …”

Get very concerned. The idea of 1) using past and/or present acts and conditions (re records and self-report) to “predict” the “probability” of future crimes by felons (juvenile or adult) and 2) administering differential treatment based on that prediction is rotten to the core from both a scientific and legal standpoint.

(Non-mathematically) As every fledgling statistics student is taught – or should be taught – “predicting” beyond the “range” of the available data (here, in time, past to present attitudes and behavior) on which the “prediction” is based is not accepted as a valid procedure in predictive statistics. Even within the range of that available data, there is an “error term” associated with it. That is, there is only the probability of a behavior happening, not certainty. Predicting beyond that range for an individual involves the assumption that past behavior will continue into the future in a more-or-less linear fashion (an extremely dubious assumption with youth transitioning into adulthood, though somewhat less dubious with adult repeat offenders), which would produce a large error term. Added uncertainty of prediction would come from having to use the past behavior of others – not that individual. (There is no “three-strikes rule” for statisticians – one strike and they’re out – but in this case there appears to be first, extrapolation beyond the available data on an individual, second, assuming linear continuation of past behavior into the future, and third, using data from others to establish the error of prediction for the given individual in the future. He’s out, he’s out, he’s out.)
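To make the extrapolation point concrete, here is a minimal sketch, not anything taken from the IBM software. It fits an ordinary least-squares line to a handful of made-up observations and computes the textbook standard error of a new prediction. The function name `prediction_std_error`, the ages, and the scores are all hypothetical. The formula also assumes the linear relationship keeps holding outside the observed range, so if anything it understates the real uncertainty.

```python
import math

def prediction_std_error(xs, ys, x_new):
    """Standard error of predicting one new observation at x_new,
    using simple (one-variable) least-squares regression."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Spread of the predictor around its mean, and its co-variation with y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance: the "error term" present even inside the data range.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    # The (x_new - x_mean)^2 term grows as x_new moves away from the data.
    return math.sqrt(s2 * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx))

# Hypothetical data: an invented "risk score" recorded at ages 12 to 17.
ages = [12, 13, 14, 15, 16, 17]
scores = [2.0, 2.5, 2.1, 3.0, 2.8, 3.4]

inside = prediction_std_error(ages, scores, 14.5)   # within the observed range
outside = prediction_std_error(ages, scores, 25)    # far beyond the observed range

print(f"std error at age 14.5: {inside:.3f}")
print(f"std error at age 25:   {outside:.3f}")
```

The error is smallest at the center of the observed ages and grows steadily as the prediction point moves away from them. This only illustrates the first objection (extrapolation); it says nothing about the second and third.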

If not apparent from the above, using such predictions as a basis for “differential treatment” stinks from a legal standpoint. In effect, an individual is being judged and treated for his anticipated future crimes rather than for deeds done and, worse, that judgment is being made on the basis of dubious assumptions and probabilities established by the behavior of others, not themselves.

Now, I’m all for protecting society from dangerous individuals and treating the incarcerated, to the extent possible, in order to rehabilitate them. However, this goes beyond the pale and appears to be nothing more than pseudo-science being used as a cover for acting on poorly validated preconceptions and prejudices. One could do as well – within the law and rights of the individual – by steeply escalating the incarceration time for each successive crime. That is, get the repeat offenders off the streets based on their past and present behaviors, not their poorly predicted future behaviors – an injustice and potential threat to us all, and quit trying to parole them as soon as possible to save “costs” which, in any case, is a false economy.


You’re making a lot of assumptions about what this software will do and how it will do it that don’t appear in this article. For example, you don’t know that it will predict “in a more-or-less linear fashion” or that it doesn’t take account of changes in behavior of youth transitioning into adulthood. It may not; maybe it’s crap software. You can’t tell based on this article. You have no basis for those statements.

The numbers tattooed on Jewish wrists during WW2 were developed by IBM, just fyi.


Well, I think the point is that business is by definition amoral, and everyone has dirt on their hands. It was pretty obvious. Sorry you missed it.


Sounds like IBM is still doing the same sort of stuff they did in World War 2.


A differentiating aspect of IBM’s new forays into analytics is that they are actively developing algorithms and technologies for scale-out platforms. The current popular choices are a tradeoff between extreme scalability and analytical capability; the kinds of analytics a platform like Hadoop can do today are very limited because the computer science doesn’t support it. IBM is making deep analytics at very high scales possible which opens up a lot of markets that had largely been the domain of science fiction until now.

