
The Crowdsourcing Challenge: How Sites Are Handling The Sarah Palin E-Mails

Today at 9 a.m. Alaska time (1 PM EST), the Alaska state government released some 24,199 e-mails between Sarah Palin, Todd Palin and Alaska public officials from Palin’s days as governor. The e-mails were on paper and in boxes, still needing to be scanned. Sites across the web, from the New York Times (NYSE: NYT) and the Washington Post (NYSE: WPO) to Mother Jones and MSNBC, swung into action. Details from the e-mails hit Twitter within minutes and the first batch was scanned within half an hour. Already, we’ve learned that the Palins wanted to install a tanning bed in the governor’s mansion in Juneau.

But there’s something else — beyond the salacious nuggets — that’s notable about the Palin e-mails.

News organizations have plenty of opportunities these days to share massive amounts of data and information with their online readers, thanks in part to efforts like WikiLeaks. But they also have to figure out how to present it: how much to just dump vs. curate, how much to collaborate with readers on the “production” of the content, and what technologies to draw on to make the information more useful and presentable. Here’s how some news organizations are dealing with those questions as they try to max out on Palin traffic today.

First of all, it’s worth noting the e-mails were released as pieces of paper. Each news organization that had requested them received them in a total of six boxes, weighing about 250 pounds, and paid $725 for printing. Companies had the option of either picking up the boxes in Juneau this morning or paying to have them shipped to their newsrooms. (The Anchorage Daily News actually has a slideshow of photos of various reporters managing the boxes.) So before the crowdsourced analysis could even begin, news outlets had to scurry to actually get the e-mails up onto their websites.

» The New York Times has the e-mails searchable and organized by date along an interactive timeline. The site also groups the e-mails by conversation the way Google (NSDQ: GOOG) does. Each e-mail is accompanied by a comment form allowing readers to point out things of interest: “Summarize what you see and tell us the e-mail date and page number. Include your name and e-mail address so we can notify you if we include your observation in an article.” Readers can also tag individual e-mails as “nothing of interest here.” At the NYT’s politics blog The Caucus, there’s a liveblog of e-mail findings but no contributions from readers yet.

» The Washington Post had originally announced that it would be inviting just 100 readers to “analyze, contextualize, and research those e-mails right alongside Post reporters over the days following the release” but has since changed its MO: “We have had a strong response to our crowdsourcing call-out on the Palin e-mails. We’ve reconsidered our approach and now would like to invite comments and annotations from any interested readers.” WaPo is now posting the e-mails in sections by date. It appears to be using the same software as the Times, Document Cloud, to allow visitors to read the e-mails, but it hasn’t built in automatic commenting features or a timeline, and it’s difficult to figure out from the site where readers are supposed to comment. The newspaper has also launched a Twitter hashtag for the Palin e-mails, #palinemails.

» The LA Times also appears to be using Document Cloud software but the e-mails are randomly organized so far and only available for some dates.

» MSNBC.com, Mother Jones, and ProPublica worked with analytics and investigational research company Crivella West to launch a searchable database of all the new e-mails, and the 2,544 e-mails previously released in February 2010. The database is here. Readers can see the e-mails as PDFs or search for phrases within them but can’t comment on them through Crivella West.

» The Guardian is taking a more randomized approach: Readers click on “Show me an unread e-mail” and one pops up (well, it’s supposed to, but we actually got an error page when we tried; too much traffic?). When an e-mail does appear correctly, readers can tag it. An infographic shows the progress so far.

» CNN said it will make the e-mails available “over the weekend.”

What else have you noticed about the ways that websites are handling the Sarah Palin e-mails? Have you been using the crowdsourcing tools they’re providing? Let us know in the comments.

3 Responses to “The Crowdsourcing Challenge: How Sites Are Handling The Sarah Palin E-Mails”

  1. mark simon

    Thanks Laura, it is interesting to see, and I do look forward to seeing where this will go. Your story was well read and I had a couple of folks email me on my comments. Both disagreed with my belief that there was not a public service in this effort.

    We are just opening in the US and reader feedback involvement is new for us, as in Hong Kong the laws/political scene are much different.   

    I have become a big fan of paidcontent, as writing is solid and story picks good. 

    thanks 

  2. Hey Mark, Thanks for bringing up the marketing angle of this, that is a really good point and one that I’d like to follow up on in another post. It’s interesting to see how the websites are promoting the e-mails (or not) and advertising the fact that they need “help” going through them. On some levels it’s a “stunt,” as you put it, but on the other hand, I hope we’ll be seeing a lot more of this–news organizations opening up primary documents for readers to analyze, and then using some of what the readers find in follow-up articles (and citing those readers). It’s marketing in a way, but it’s also just a smart response to the ways that reporting is changing. And it’s a great way to pull individual commenters into articles (including articles in the print form of these publications) without letting the articles devolve into a sort of messy and opinionated free-for-all like you see in the comments sections on websites.

    I take your point that there’s a question about whether the Palin e-mails should be used as source material for this, but she’s an incredibly well-known public figure who’s been extremely active in social media herself, some of the e-mails are redacted and it honestly seems (to me!) like an ideal approach for this kind of thing. And if it turns out that nothing of much real substance is found in these e-mails, well, might as well give the readers the opportunity to see the source materials themselves. They may end up reading those e-mails and then agreeing with exactly what you wrote above.

    It’s all interesting, huh? I’m going to continue to follow stories like this and watch for the ways that news orgs are enlisting reader help/support/reporting in the digital era, so please keep letting me know what you think and if you have ideas. laura @ paidcontent dot org

  3. Laura, I am no Palin fan.   But this is a marketing stunt, and one that is ill thought out.  First of all 30,000 quarter page emails could be gone through by 5 reporters in two days. You make 250 pounds of paper sound large, that is just silly. This is not an overwhelming stash of documents. 

    Also, if after two days we find out the governor asked about a tanning bed, then whoa!!!!, let’s go dig up that Truman treasure trove when he oversaw the White House remodel. The nerve of that guy with a bowling alley, steam room, movie theater, and get ready for this, a dumbwaiter. Again, I have been in at least ten Gov mansions and I know of state first ladies who spent more on new napkins than this tanning bed.

    My real objection here is that the NYT, WASHPOST, and others are opening up a news effort to what no sane person thinks is anything but a circus for Palin haters.   We put Sarah Palin in an animation pole dancing, and have negatively portrayed her in animations a dozen times.  But that is satire.  

    I just don’t know what this is in terms of journalism, as bloggers and left wing activist groups will have access to these emails, so there is no public service in this “crowd lynching” approach. And the whole let’s “get her” is a bit much.

    You guys reported that NYT reporters complained that Bill Keller’s rants against Huffpost were causing them problems reporting on the tech/net sector. Wouldn’t one on a political beat worry about this? Or do they just not care?