3 Comments

Summary:

Media outlets such as the Guardian take a long time to produce data-backed reports and visualizations, while big data analytics apps move fast but lack a human touch. Is there a happy medium?

photo: Jordan Novet

The headline and body of this story were corrected at 11:10 p.m. with a more accurate description of the typical period of time for the deployment of Guardian journalist Feilding Cage’s data visualizations. Also, Guardian Datablog Editor Simon Rogers was incorrectly described as Cage’s boss, and that reference has been removed.

Once he finds a suitable topic, Feilding Cage, a New York-based developer and journalist for The Guardian, can easily spend three weeks generating the source information and designing a visualization for what’s become known as data journalism. The results bring understanding and reader engagement to topics that are otherwise discussed with a lot of words or static numbers. Readers can and do play around with the information, share it widely and discuss it for long periods after it appears online.

The Guardian’s interactive guide to gay rights in the United States

Cage is one of a handful of Guardian journalists who generate reports that say new things about topics that pop up in the news or are just plain old interesting. Cage and Simon Rogers, editor of The Guardian Datablog and Data Store, spoke about their work at the Strata conference in Santa Clara, Calif., on Tuesday.

Along with The Guardian, a few other news organizations have been putting an emphasis on data-driven reporting and visualizations, apps and even games in the past few years, such as the Chicago Tribune, the Los Angeles Times and ProPublica. (Check out the Data Journalism Handbook for more information on this sort of work.)

Data journalism and visualization stand out for the verification and occasional gray-area explanations that journalists provide. Cage, for example, accompanied his interactive visualization of gay rights in the United States with a blog post explaining his methodology and disclosing his assumptions.

Screenshot from Zoomdata’s big data analytics iPad app

It’s certainly one way to say something fresh with data, but it’s time-consuming when compared with big data analytics apps such as Zoomdata, which provide users with real-time information they can compare against Hadoop-processed historical data. (That company, which my colleague Derrick Harris covered last year, released the beta version of its iPad app on Tuesday.)

It would be neat to find a happy medium for enterprises that want original insights that every employee can see, use and act on, but that don’t take three weeks to generate. That’s especially true because the return on investment for work like Cage’s is hard to identify, although it’s possible the content could indirectly generate revenue by driving users to content they have to pay for.

Bridging the gap might be a matter of finding the perfect data scientist for the company. Or it might be a matter of time before the kind of work Cage does is automated. A computer already can write an earnings story, although it might be a few years before computers put wordsmiths out of business.

Maybe it just doesn’t make sense to cross data journalism visualizations with big data analytics apps. But I, for one, would like to play with such a tool.

Entrepreneurs from companies that work with and make visualizations from big data, such as Quid, will speak at the GigaOM Structure:Data conference on March 20-21 in New York.

Disclosure: The Guardian is an investor in Giga Omni Media, which publishes GigaOM.

  1. Data journalism would require “computer science” to realize that it first needs a shared bar code style feed.
    http://iiscn.wordpress.com/about/

  2. Data journalism and analytics often operate in very different contexts.

    First, just getting the data is often a major challenge for journalists. I bet there was no database of gay rights legislation neatly broken down by state and category that Feilding could use. In other cases the data has to be acquired from governments by FOIA request, or the journalist has to convince a commercial provider to make it available.

    Then, because the type of data changes with every story, it is not possible to use the same visualization technique each time. It seems unlikely that a single app will cover all of the different scenarios encountered in journalism.

    Journalism also has an emphasis on publication and communication. The “user” for journalism usually has no background knowledge or special expertise, while an in-house analyst might be very skilled and familiar with how to interpret the company’s specific data. Communication with a naive audience, on multiple platforms, is a major design challenge.

    Finally, in journalism there is almost never an ROI for any given story. The production of journalism has always been heavily subsidized, e.g. by advertisers, or in The Guardian’s case by their U.K. trust. Journalists aren’t in it for the cash, but for the effects on society.

  3. See this GigaOm article for something that solves this problem: http://gigaom.com/2012/11/20/a-startup-asks-what-if-you-didnt-have-to-analyze-data-at-all/. The animated briefing is of course not identical to the infographics on The Guardian, but it serves the same purpose of making analysis engaging. The automated analysis of arbitrary data takes minutes rather than three weeks. This of course still doesn’t address jonathanstray’s point that the data first needs to be available before analysis can be done.

Comments have been disabled for this post