Summary:

With AOL, Demand Media and Yahoo all investing heavily in creating huge networks, “content farms” are clearly here to stay. But how far can they go? A team of journalists and computer scientists is conducting an experiment to see if the news can be completely automated.

In the media industry right now, there are few more terrifying words than “content farm” — a term used to refer to sites that thrive by producing vast amounts of cheap, disposable content. From a traditional media business perspective, everything about content farms seems terrible: Low-paid staff churn out low-quality content that’s often devoid of useful information, and typically aimed not at enchanting readers but at scooping as many of Google’s advertising dollars as possible.

It’s big money: Demand Media went public recently, and AOL is leading the charge with its low-cost Seed.com service and, arguably, its $315 million buyout of the Huffington Post. Journalists roll their eyes at this stuff, but content farms, for better or worse, are hard to ignore right now, driving down prices and flooding the market with stories.

But what if you took the content farm model even further — by crowdsourcing everything? That’s the objective of My Boss Is a Robot, an experimental exercise to see how automated the process of journalism can become. It’s being overseen by two San Francisco-based journalists, Jim Giles and MacGregor Campbell, who want to understand what automation and crowdsourcing mean for journalism in the long term.

Specifically, the duo want to see if they can “create an automated system for producing quality journalism using Mechanical Turk’s army of untrained workers.” This involves breaking down the familiar job of journalism — in this case, producing a 500-word report on a relatively straightforward scientific paper — into little pieces, and then getting those pieces completed through Amazon’s Mechanical Turk service.

“We were interested in what you might call crowdsourcing 2.0,” Giles — who has written for outlets including New Scientist, the Economist and the New York Times — told me. “Rather than do simple, one-off tasks on Mechanical Turk, could you break down a big task into lots of small ones? We kind of feel like people are close to doing it already.”

Working alongside a team from Carnegie Mellon, they are trying to develop systems to test whether it’s possible. One job might involve reading a few paragraphs of information to determine the most important fact about a research paper. Another would be identifying researchers who might be interested in similar areas, then picking the most appropriate ones, then emailing them to ask for comment.

Of course, it’s all theoretical right now, an exploration of what is possible even if it seems improbable, but there’s a precedent.

Aside from actual content farms, CMU’s Niki Kittur — who is leading the technical side of the process — has already done work to automate the writing of Wikipedia entries by breaking them down into 30 or so tasks that can produce passable content for around $3 a time. Giles suggests it will take anything up to 100 tasks, potentially producing work that is “at least an order of magnitude cheaper than a professional journalist.”

Although the experiment is meant to comment on the troubles of content farming as much as the ills of journalism, Giles says he’s already had some negative responses from reporters who are unsettled by the prospect or worry that it will “put them out of a job.” Nobody knows whether it will work or not, but it’s almost certain to hold a mirror up to both sides of the content farm argument: saying something about the mindless and exploitative nature of low-cost content, while also commenting on the humdrum and (yes) robotic nature of so-called “churnalism.”

“I am scared of the way some people might use it. Demand Media have already automated some parts of the journalism process, for example,” he says. “But if you imagine that it works, and what we’re producing is passable, then you’ve got to start asking yourself questions. If it’s as good as those other 500-word stories, then what does that say about them? If it can be automated, then someone is going to do it, so perhaps we should concentrate on doing something better.”

Of course, there are caveats. There’s lots of trial and error, and they are only looking at simple, uncontroversial reporting. There are also likely to be plenty of pitfalls along the way, and as such, Giles is looking for feedback on the My Boss Is a Robot blog or via Twitter so that he can document reactions and understand what challenges the scheme poses. But the real proof of the pudding will come in a couple of months, when the system is ready to run for real — and can be compared against a real live journalist. “When we get to the end, we’ll do a blind test,” he says. “And then I’ll receive my notice the next day.”

  1. This is very fascinating. If indeed this is able to work, it would save a lot of the “headache” caused by the content farms. But on the other hand, I can see why journalists across the world would not be for this idea. This seems like it could easily blow up and have journalists everywhere searching for a new job in an already tough job market.

  2. Ultimately the real issue is whether advertisers will pay to be on this content, or how the site is being monetized.

    Take Demand Media: their goal is owning the first page of Google. Will Google give them the traffic, and will the revenue tool allow this to happen forever?

    Example:

    Owning the first page of Google results.

    Google “How to Renew an Expired Passport”: they have multiple sites they own or partner with to own the first page.

    Search result #1 eHow/Demand Media
    Search result #2 eHow/Demand Media
    Search result #6 USA Today/Demand Media
    Search result #10 Trails/Demand Media

  3. Are you accepting comments?

  4. It is interesting to see that we try to automate the human out of our process. We believe the model will go the other way and be driven up the food chain to more cost-effective professional content. It is the personal touch that comes through over time. Think of when music could first be synthesized. It started with everyone saying that musicians were no longer needed. This didn’t play out. The same will be true of journalism, as well as video production.

  5. Robot News « Wir sprechen Online. Tuesday, February 8, 2011

    [...] My Boss is a Robot: scientific experiment outsources the editorial process to Amazon Mechanical Turk; http://eicker.at/RobotNews [...]

  6. Mr. Toole, you make a very good point with your music example. But some of the “new age” music has in fact got to the point where it’s not instruments playing anymore but just sounds and electric beats, as seen in dubstep, electro, and even rap genres. I think it all comes down to stepping stones. Many things evolve over time. Music may just be in the midst of its transformation, and if “My Boss Is a Robot” actually works, we could be saying the same for journalism as well.

  7. FredCavazza.net > Le retour de la revanche du contenu Friday, February 18, 2011

    [...] An interesting model that has won over many clients but still leaves one skeptical: Content Farms 2.0, Can Robots Help Write the News?. In any case, the search engines in question are not fooled and are actively working to [...]
