The Crowdsourcing Challenge: How Sites Are Handling The Sarah Palin E-Mails

Today at 9 a.m. Alaska time (1 p.m. ET), the Alaska state government released 24,199 e-mails between Sarah Palin, Todd Palin and Alaska public officials from Palin’s days as governor. The e-mails arrived on paper, in boxes, and still needed to be scanned. Sites across the web, from the New York Times (NYSE: NYT) and the Washington Post (NYSE: WPO) to Mother Jones and MSNBC, swung into action. Details from the e-mails hit Twitter within minutes and the first batch was scanned within half an hour. Already, we’ve learned that the Palins wanted to install a tanning bed in the governor’s mansion in Juneau.

But there’s something else — beyond the salacious nuggets — that’s notable about the Palin e-mails.

News organizations have plenty of opportunities these days to share massive amounts of data and information with their online readers, thanks in part to efforts like WikiLeaks. But they also have to figure out how to present it: how much to just dump vs. curate, how much to collaborate with readers on the “production” of the content, and what technologies to draw on to make the information more useful and presentable. Here’s how some news organizations are dealing with those questions as they try to max out on Palin traffic today.

First of all, it’s worth noting the e-mails were released as pieces of paper. Each news organization that had requested them received them in a total of six boxes, weighing about 250 pounds, and paid $725 for printing. Companies had the option of either picking up the boxes in Juneau this morning or paying to have them shipped to their newsrooms. (The Anchorage Daily News actually has a slideshow of photos of various reporters managing the boxes.) So before the crowdsourced analysis could even begin, news outlets had to scurry to actually get the e-mails up onto their websites.

» The New York Times has made the e-mails searchable and organized them by date along an interactive timeline. The site also groups the e-mails by conversation the way Google (NSDQ: GOOG) does. Each e-mail is accompanied by a comment form allowing readers to point out things of interest: “Summarize what you see and tell us the e-mail date and page number. Include your name and e-mail address so we can notify you if we include your observation in an article.” They can also tag individual e-mails as “nothing of interest here.” At the NYT’s politics blog The Caucus, there’s a liveblog of e-mail findings but no contributions from readers yet.

» The Washington Post had originally announced that it would be inviting just 100 readers to “analyze, contextualize, and research those e-mails right alongside Post reporters over the days following the release” but has since changed its MO: “We have had a strong response to our crowdsourcing call-out on the Palin e-mails. We’ve reconsidered our approach and now would like to invite comments and annotations from any interested readers.” WaPo is now posting the e-mails in sections by date. It appears to be using the same software as the Times, DocumentCloud, to allow visitors to read the e-mails, but it hasn’t built in automatic commenting features or a timeline, and it’s difficult to figure out from the site where readers are supposed to comment. The newspaper has also launched a Twitter hashtag for the Palin e-mails, #palinemails.

» The LA Times also appears to be using DocumentCloud software, but so far the e-mails are in no particular order and are only available for some dates.

» MSNBC, Mother Jones, and ProPublica worked with analytics and investigational research company Crivella West to launch a searchable database of all the new e-mails, plus the 2,544 e-mails previously released in February 2010. The database is here. Readers can see the e-mails as PDFs or search for phrases within them but can’t comment on them through Crivella West.

» The Guardian is taking a more randomized approach: Readers click on “Show me an unread e-mail” and one pops up (well, it’s supposed to, but we actually got an error page when we tried; too much traffic?). When an e-mail does appear correctly, readers can tag it. An infographic shows the progress so far.

» CNN said it will make the e-mails available “over the weekend.”

What else have you noticed about the ways that websites are handling the Sarah Palin e-mails? Have you been using the crowdsourcing tools they’re providing? Let us know in the comments.