The Guardian’s data journalism is cool, but it can take three weeks to make


Credit: Jordan Novet

The headline and body of this story were corrected at 11:10 p.m. with a more accurate description of the typical time it takes to deploy Guardian journalist Feilding Cage’s data visualizations. Also, Guardian Datablog Editor Simon Rogers was incorrectly described as Cage’s boss, and that reference has been removed.

Once he finds a suitable topic, Feilding Cage, a New York-based developer and journalist for The Guardian, can easily spend three weeks generating the source information and designing a visualization for what’s become known as data journalism. The results bring understanding and reader engagement to topics that are otherwise discussed with a lot of words or static numbers. Readers can and do play around with the information, share it widely and discuss it for long periods after it appears online.


The Guardian’s interactive guide to gay rights in the United States


Along with The Guardian, a few other news organizations, such as the Chicago Tribune, the Los Angeles Times and ProPublica, have been putting an emphasis on data-driven reporting and visualizations, apps and even games in the past few years. (Check out the Data Journalism Handbook for more information on this sort of work.)

Data journalism and visualization stand out for the verification and occasional gray-area explanations that journalists provide. Cage, for example, accompanied his interactive visualization of gay rights in the United States with a blog post explaining his methodology and disclosing his assumptions.


Screenshot from Zoomdata’s big data analytics iPad app


It would be neat to find a happy medium for enterprises that want original insights that every employee can see, use and act on, but that don’t take three weeks to generate. That’s especially true because the return on investment for work like Cage’s is hard to identify, although it’s possible the content could indirectly generate revenue by driving users to content they have to pay for.

Bridging the gap might be a matter of finding the perfect data scientist for the company. Or it might be a matter of time before the kind of work Cage does is automated. A computer already can write an earnings story, although it might be a few years before computers put wordsmiths out of business.

Maybe it just doesn’t make sense to cross data journalism visualizations with big data analytics apps. But I, for one, would like to play with such a tool.

Entrepreneurs from companies that work with and make visualizations from big data, such as Quid, will speak at the GigaOM Structure:Data conference on March 20-21 in New York.

Disclosure: The Guardian is an investor in Giga Omni Media, which publishes GigaOM.


BI expert

See this GigaOm article for something that solves this problem: The animated briefing is of course not identical to the infographics on The Guardian, but it serves the same purpose of making analysis engaging. The automated analysis of arbitrary data takes minutes rather than 3 weeks. This of course still doesn’t address jonathanstray’s point that the data first needs to be available before analysis can be done.


Data journalism and analytics often operate in very different contexts.

First, just getting the data is often a major challenge for journalists. I bet there was no database of gay rights legislation neatly broken down by state and category that Feilding could use. In other cases the data has to be acquired from governments by FOIA request, or the journalist has to convince a commercial provider to make it available.

Then, because the type of data changes with every story, it is not possible to use the same visualization technique each time. It seems unlikely that a single app will cover all of the different scenarios encountered in journalism.

Journalism also has an emphasis on publication and communication. The “user” for journalism usually has no background knowledge or special expertise, while an in-house analyst might be very skilled and familiar with how to interpret the company’s specific data. Communication with a naive audience, on multiple platforms, is a major design challenge.

Finally, in journalism there is almost never an ROI for any given story. The production of journalism has always been heavily subsidized, e.g. by advertisers, or in The Guardian’s case by their U.K. trust. Journalists aren’t in it for the cash, but for the effects on society.
