Blog Post

Why data should be our guiding light on public policy

With the advent of open data and new, powerful methods for analyzing it, we’re learning a lot that could challenge longstanding beliefs on public policy. Politicians, social workers and other civil servants have always had data, of course; they just never had as much and could never do with it what they can today. They should listen to what the computers tell them.

What’s possible

Recent HIV research from Brown University is a great example of what’s possible. Researchers formulated a computer model based on numerous factors relating to drug use, sexual activity and the medical aspects of HIV infection. To validate it, they calibrated the model until it could reproduce known HIV infection rates in New York City from 1992 to 2002. They ran the model thousands of times on a supercomputer.

Credit: Brandon Marshall/Brown University

They found that the rate of HIV infection among New York City injection drug users will be 2.1 per 1,000 by 2040 if current programs are left in place. Expanding needle exchange programs would decrease that rate by 34 percent; expanding HIV testing would result in only a 12 percent reduction. However, a comprehensive approach that includes these two programs as well as two others regarding the administration of medicine and antiretroviral therapy would drop the rate by more than 60 percent, to 0.8 per 1,000.
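For readers who want to check those numbers, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted above and is not the researchers' model. The function names are made up for illustration, and the per-program rates it prints are implied values that assume each percentage applies to the 2.1-per-1,000 baseline; the study summary above does not report them directly.

```python
# Figures quoted in the post (infection rates per 1,000 injection drug users by 2040).
BASELINE_RATE = 2.1       # if current programs are left in place
COMPREHENSIVE_RATE = 0.8  # with all four programs combined

# Single-program reductions quoted in the post, as fractions of the baseline.
# Assumption: each percentage is measured against BASELINE_RATE.
REDUCTIONS = {"needle exchange": 0.34, "HIV testing": 0.12}


def rate_after(baseline: float, reduction: float) -> float:
    """Rate that remains after cutting the baseline by a fractional reduction."""
    return baseline * (1.0 - reduction)


def percent_reduction(baseline: float, new_rate: float) -> float:
    """Percentage drop from the baseline to a new rate."""
    return 100.0 * (baseline - new_rate) / baseline


if __name__ == "__main__":
    for program, reduction in REDUCTIONS.items():
        implied = rate_after(BASELINE_RATE, reduction)
        print(f"{program}: about {implied:.2f} per 1,000 (implied, not reported)")
    drop = percent_reduction(BASELINE_RATE, COMPREHENSIVE_RATE)
    print(f"comprehensive approach: about {drop:.0f} percent lower")
```

Run as a script, this prints roughly 1.39 and 1.85 per 1,000 for the two single programs, and a drop of about 62 percent for the comprehensive approach, which is consistent with the "more than 60 percent" figure.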

Assuming their model is accurate, that’s a significant reduction — getting HIV rates among drug injectors near zero — and it’s all thanks to access to lots of data and lots of computing power. Recently, another group of researchers in Europe developed a computer model that found a strong correlation between web censorship and high violence rates during times of social unrest — a timely finding given the current state of world affairs.

Last week, I explained how Xerox is working to help Los Angeles and other cities get a better view of their traffic so they can try to make life more efficient and less congested for citizens, while simultaneously reducing pollution and optimizing budgetary resources. To achieve these goals, Xerox and other companies in this space are gathering data from everywhere — cars, mass-transit systems, traffic sensors, cell phones, weather databases — and developing complex machine learning models to determine how everything is connected.

Of course, these are just a handful of examples of what researchers and others are working on with regard to data. Pick an area of public concern — climate change, smart grid, crime rates, genetics, whatever —  and you’ll find someone with mountains of data running some seriously complex algorithms to make sense of it.

Anyone can do it

However, as anyone who reads GigaOM regularly probably knows, decision-makers don’t need in-house supercomputers or data scientists on staff to inform their policies with data (although the latter wouldn’t be a bad idea). All they really need is an internet connection. Data sets are available everywhere you look, including at data marketplaces such as Factual and Infochimps, at, and even increasingly on news sites such as the Guardian (see disclosure). Thanks to cloud computing, the resources necessary to analyze this data are cheap and plentiful.

And with increasingly prevalent cloud services targeting low- to mid-level users who want to run some relatively simple analyses, there’s no excuse for politicians and others not to inform their decisions with — nay, base them on — data. Last week, with company at my house and two toddlers running around, I was able to sit down with my laptop and generate a predictive model for gun-related homicide rates using a service called BigML and data from the Guardian‘s Datablog. It’s nowhere near Brown’s model, but I was able to do it while sitting on my couch.

Lazy politicians need not even get their hands dirty with raw data because chances are some journalist or bureaucrat has already analyzed it for them. Data on gun ownership in the United States versus the rest of the world is everywhere this week, as is, already, data on the spike in gun sales after last week’s shootings in Colorado.

The Nevada state legislator I recently heard on the radio struggling to defend his proposed tax on junk food would have benefited from reading this study from the USDA. It’s the top result on Google when searching “junk food cheaper than healthy food.” There’s also this interesting study on the effectiveness of Mayor Bloomberg’s giant soda ban in New York.

Why we should listen to the data

Look at the state of the world right now. Droughts, deficits, civil wars, obesity epidemics. A skeptic would argue that the old methods of public policy decision-making, driven largely by political and economic concerns, haven’t worked out too well. Why not give data a chance to take the lead? In the wake of the Great Recession, smart businesses certainly have.

It’s a simple proposition: Choose an important issue, find relevant data on it, analyze the data (or trust someone else’s analysis), and go from there. It’s an objective starting viewpoint about whether something might actually work, political pressures be damned. Who knows, a brave politician who plants a stake not on the left or the right, but with data analysis, might end up looking like a hero in the end.

Disclosure: Guardian News and Media Ltd., the parent company of the Guardian newspaper, is an investor in the parent company of this blog, Giga Omni Media.

Feature image courtesy of Shutterstock user MikeE.

10 Responses to “Why data should be our guiding light on public policy”

  1. Bradley Smith

    “And with increasingly prevalent cloud services targeting low- to mid-level users who want to run some relatively simple analyses, there’s no excuse for politicians and others not to inform their decisions with — nay, base them on — data” . Have a look at BusinessOptics which is an online modelling tool with embedded machine learning to enable anyone to leverage machine learning on their data without any programming skills and then integrate this into more complex visual forecasting and decision models. Have a look at for a video that demonstrates this.

  2. Derrick, thanks for posting this. Chicago now has a CTO and Chief Data officer purposefully to begin to allocate limited resources more effectively. Sure, headlines and public sentiment don’t tend to align as the first comments reflect, but that doesn’t stop others from trying to grab a little bandwidth to show and share different relationships.
    I like the open source model, for example the one used by astronomers leveraging the collective intelligence and dedication of millions of amateur astronomers to supplement the academics’ efforts with limited time on the big telescopes (this was recently on NOVA).

    The question is how and among whom to share it, and whether speaking truth to power will bring about change?

  3. Anne Rynearson

    Much data at the city level is still only available by request and, even then, not in an open format. Better policy requires opening more data to use by analysts and developers. Code for America’s Open Impact campaign provides resources to residents and city leaders looking to bring such open data policies to their communities.

  4. Anne Rynearson

    Better policy requires opening more public data to public use. Much data at the city level is still only available through request and, even then, often not available in an open format that can be analyzed or used by developers. That’s why Code for America created the Open Impact campaign to provide resources for residents and city leaders to use when promoting open data and open gov in their communities.

  5. Anonymous

    I was put in charge of a Government “Evidence Based Policy” initiative to do just this. Sadly, after a few months I learned that all the politicians really want is “Policy Based Evidence” – i.e., the politician creates a policy that supports their party or donors, then turns to data analysis only to find a way of making the policy appear to be in the public interest.

    Until voters start voting on something deeper than headlines, sound bites, and pre-packaged ideology, I fear this won’t change.

    • Derrick Harris

      Thanks for the comment. I have no doubt that’s the case, and you see it happen all the time during election season. Or just picking a data point and using it as evidence without any context.