
Summary:

A pair of New York Times researchers have been poring over the newspaper’s data, looking for a way to understand the way influence plays out online. The work shows how organizations are looking to mine their data to find ways to improve their operations.


High up on the 28th floor of the New York Times, a pair of researchers have been poring over the newspaper’s data, looking to understand the way influence plays out online. What Mark Hansen, a UCLA statistics professor on sabbatical, and Jer Thorp, a data artist in residence at the Times, have found is that stories take on a life of their own, which can be mapped and visualized in some startlingly beautiful ways. The work, still “crazy” preliminary, shows how organizations are looking to mine their data to find ways to improve their operations. And it also shows the challenges that lie ahead in trying to turn the data into clear actions.

Hansen and Thorp, who talked at a TimesOpen event last night, took two weeks of August data from the paper, looking at how stories were shared through the Times’ site, Bit.ly and Twitter. The pair built a tool that allowed them to see the life of a story, from where it first began as a URL tweeted by the Times to being retweeted and shared again and again. The tool can render a simple timeline, a wheel with spokes or a radar view showing spikes of tweets. But it can also go 3-D, creating a funnel that expands over time as stories keep getting shared.

By visualizing the data, Hansen and Thorp were able to isolate “cascades,” chains of events that extend the life of a story, and to identify who has the influence online to keep it going. For example, a column by Paul Krugman inspired modest sharing but took off when Tim O’Reilly, founder of O’Reilly Media, retweeted it. In other cases, like the story of the flight attendant who escaped down the plane’s slide, the cascades are more dynamic and complicated.
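To make the idea of a cascade concrete, here is a minimal sketch of how downstream sharing could be counted. It is not the tool Hansen and Thorp built. The event format, the user names and the function names are all hypothetical, and the sketch assumes each user shares a story once and that every share can be traced to exactly one earlier share.

```python
from collections import defaultdict

# Hypothetical share events: (user, parent_user, minutes_after_publication).
# parent_user is None for the original tweet of the story URL.
events = [
    ("nytimes", None, 0),
    ("alice", "nytimes", 5),
    ("bob", "nytimes", 12),
    ("influencer", "alice", 40),
    ("carol", "influencer", 42),
    ("dave", "influencer", 45),
    ("erin", "carol", 50),
]

def build_children(events):
    """Map each user to the users who shared the story directly from them."""
    children = defaultdict(list)
    for user, parent, _ in events:
        if parent is not None:
            children[parent].append(user)
    return children

def cascade_size(user, children):
    """Count every downstream share that traces back to this user's share."""
    return sum(1 + cascade_size(child, children)
               for child in children.get(user, []))

def rank_by_cascade(events):
    """Return (user, downstream share count) pairs, largest cascade first."""
    children = build_children(events)
    sizes = {user: cascade_size(user, children) for user, _, _ in events}
    return sorted(sizes.items(), key=lambda pair: pair[1], reverse=True)

for user, size in rank_by_cascade(events):
    print(user, size)
```

In this made-up data, the account labeled "influencer" joins 40 minutes in yet accounts for three of the six downstream shares, the kind of late jump the Krugman example describes.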

While it’s still quite early, Hansen said the next steps will be to make the project handle both real-time and archived information. The hope is that the Times can suss out which factors can affect a story’s life, whether it’s the section it’s in or the time it’s released. But this is where the tough part begins. It’s not enough to get the data; now the paper has to ask the right questions of it. As Michael Driscoll, founder of Dataspora and co-founder of Metamarkets (see disclosure below), said in a previous story, analytics is the key to tapping the potential of big data. The ingesting and visualization of data are critical elements, but analysis is where companies make their money.

Think of using data as a three-step process. One has to have the data, then ask the data the right questions, then act upon the information. But with more data available to people, the number of questions that can be asked expands. It’s kind of like suddenly going from photographs to moving pictures. There’s more information for our brains to process, which makes the experience richer. Now, with more data and cheaper, more powerful computing, our metrics can move from a still snapshot in time to a moving picture of business health and activity. But where in that moving picture should businesses look? People will have to rethink the metrics they used in the snapshot era and find new focal points for the moving picture era of data. That act of finding out what new questions to ask, not mere analytics, will help separate the winners from the losers.

That’s the challenge for the New York Times, which, like many traditional media companies, is trying to revive revenues as its core audience shifts to digital. It’s great for the Times to have data to look at, especially for identifying and perhaps eventually targeting key influencers who can make the paper relevant in the Twitterverse. But it has to quickly take the next step and turn that data and all those beautiful charts into business decisions that can affect the bottom line.

Disclosure: Metamarkets is backed by True Ventures, a venture capital firm that is an investor in the parent company of this blog, Giga Omni Media. Om Malik, founder of Giga Omni Media, is also a venture partner at True.


  1. And it’s not just NYT Times doing this….

    sense-making on streams ( from Jeff Jonas who gave best preso at DeFrag this past week )
    1) Evaluate new information against previous information … as it arrives.
    2) Determine if what is being observed is relevant.
    3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening.
    4) Do this with sufficient accuracy and scale

    1. Great stuff from Jonas who knows his stuff.

  2. most companies will fail at this because

    1. their data is scattered all over the place in different formats

    2. their analytical capabilities are seriously lacking, most “analyst” doing basic grunt work, turning out canned reports or just telling the CEO what he want to hear

    3. most “analyst” don’t have the freedom to go looking for opportunities in the data, more often some exec will say “this is what I believe, go out and prove my view point”, insights should come from data not vice versa

    4. very few executives are either brave or stupid enough to act upon insights that goes against their beliefs, especially if it proves them wrong in the first place

    5. most companies do not have an effective measuring, testing or post mortem procedure for action on said analysis, theories are all great and dandy, but until you can demonstrate actual cause and effect, its all blueberries

    6. most companies will sweep failures under the rug instead of doing rigorous analysis, as people leave and new people come in, these failures will be repeated

    exception companies deal and overcome the above

    data and analysis is only as effective as the people and processes you use and the effective actions you can generate

    1. Well said. Analysis is hard work and only getting harder.

  3. Don’t look for the right questions. Ask them all. No question of data is a bad question.

    Key is to have an end game. What is the desired outcome (the win). Line up all the questions and let them race (in real-time) to the finish line.

    No one knows the winner before they start the race. Let all the questions try to “qualify.”

  4. Just a minor point – it’s “TimesOpen”, not “TimesOnline”

  5. TimesOpen 2.0: Big Data Wrap-Up – NYTimes.com Wednesday, November 24, 2010

    [...] You can read more about Jer and Mark’s work on GigaOM: New York Times Looks for Answers in Data. [...]
