Blog Post

Is this plagiarism? A new web extension can help answer that question

Suspicious about the origins of an article you’re reading online? A new browser extension and website, Churnalism U.S., claims to help detect plagiarism by comparing web content to Wikipedia and a database of press releases.

Churnalism was built by the Sunlight Foundation, a Washington, DC-based nonprofit that aims to make government more transparent and accountable, and Media Standards Trust, a U.K.-based nonprofit that advocates for transparency in news. The organizations previously created a U.K. version of Churnalism that compares web content to articles from the U.K. national press and the BBC.

“Here at Sunlight, we’re increasingly interested in tracking not just the flow of money in politics, but the flow of ideas, whether in legislation or floor speeches or news articles,” Sunlight Labs director Tom Lee said in a statement. “When we learned of what Media Standards Trust developed, it seemed natural for us to help them bring it to the U.S. news consumer.”

Churnalism U.S. is available as a browser extension for Chrome, Mozilla Firefox and Internet Explorer, or users can simply paste a URL into the Churnalism U.S. website. The service then highlights possible similarities between the article and source material from Wikipedia and press releases. On its blog, Churnalism explains a little more about how the technology works to detect plagiarism. The database of press releases includes PR Newswire, PR NewsWeb, MarketWire, EurekAlert, Congressional Leadership and press releases from the White House, trade organizations, Fortune 500 companies and nonprofit research institutes and think tanks.

Because Churnalism U.S. searches only Wikipedia and press releases, it doesn’t detect “classic” forms of plagiarism, in which an author copies another author’s original content from somewhere else on the web or from a printed work. Churnalism doesn’t pick up (yes, I checked) Atlantic writer Nate Thayer’s failure to credit his sources, for example, or Jonah Lehrer’s self-borrowing. For that, you’ll have to use a paid tool like Turnitin. But Churnalism plans to open up its API soon so that users can add more sources.

4 Responses to “Is this plagiarism? A new web extension can help answer that question”

  1. Bob LeDrew

    The argument could be made that lifting content directly from news releases is not plagiarism. The idea of news releases is to inform journalists of an event or an action by an organization.

    There is a decades-long tradition of news outlets using news releases with either minimal rewrites or as written in their stories, both print and broadcast.

    If plagiarism is the unauthorized and uncredited use of another’s writing, then the use of a news release in a story seems to me to fail the “unauthorized” part of that test.

    What about taking a CEO quote from a news release and running it directly? Is that plagiarism?

    • Bob, that is a good point (for example, the quotation from the Sunlight Foundation’s Tom Lee that I included in the story came from the Sunlight Foundation press release).
      There is a little more on how this works at Churnalism’s blog:

      “Once we have a list of which press releases share text with a given news article we have to analyze whether that shared text is meaningful. This is where the Churnalism web frontend takes over. We remove fragments that are mostly long proper nouns (such as ‘the President of the United States of America’). We then measure how many characters overlap and how close together the shared passages are, relative to the document lengths. A 3,000 word news article that shares two sentences with a press release is less interesting than a 1,000 article that shares two paragraphs. Similarly two articles of the same length that share the same two sentences with a press release aren’t always churning the press release to the same degree. We boil this down into the ‘density’ of the shared text in the two documents as a measure of how likely the text was simply copy/pasted and then slightly edited.”

      FWIW, I ran this very paidContent story through Churnalism, and the Lee quote doesn’t trigger any flags.

    • Trace Cohen

      Coming from a PR/journalism background, I love this site!

      Yes, in PR we don’t mind if our news release is copied and pasted word for word, but it would be nice if they actually took the time to write a real story. It’s not plagiarism, since we want to get the news out; it’s just something we accept now with shorter deadlines. It’s a decade-long tradition because of the broken media business models: no fault of the writers, just of their business as a whole.
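
For readers curious about the “density” idea quoted from Churnalism’s blog in the thread above, here is a rough Python sketch of how such a score might be computed. It is not Churnalism’s actual code: the function names, the capitalization test standing in for proper-noun detection, the five-word minimum, and the choice to multiply coverage by closeness are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def shared_passages(article_words, release_words, min_words=5):
    """Return (start_in_article, length) for each run of words found in both texts."""
    matcher = SequenceMatcher(None, article_words, release_words, autojunk=False)
    return [(m.a, m.size) for m in matcher.get_matching_blocks() if m.size >= min_words]


def mostly_proper_nouns(words, threshold=0.5):
    """Crude stand-in (assumption): count capitalized words as proper nouns."""
    capitalized = sum(1 for w in words if w[:1].isupper())
    return capitalized / len(words) >= threshold


def density(article, release, min_words=5):
    """Hypothetical score in [0, 1]; higher suggests copy/paste with light edits."""
    a, r = article.split(), release.split()
    passages = [
        (start, size)
        for start, size in shared_passages(a, r, min_words)
        if not mostly_proper_nouns(a[start:start + size])
    ]
    if not passages:
        return 0.0
    # How many characters overlap, relative to the article's length.
    shared_chars = sum(len(" ".join(a[s:s + n])) for s, n in passages)
    coverage = shared_chars / len(article)
    # How close together the shared passages sit: shared words divided by
    # the span from the first shared word to the last.
    first = passages[0][0]
    last_end = passages[-1][0] + passages[-1][1]
    closeness = sum(n for _, n in passages) / (last_end - first)
    return coverage * closeness
```

Under these assumptions, a short article that is mostly press release scores close to 1, while a long article that borrows the same passage scores much lower, which matches the 3,000-word versus 1,000-word comparison in the quote.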