
The Downside of User-generated Content

It may not be the nicest-sounding phrase, but “user-generated content” has come to be the term we use for everything from Flickr photos and YouTube videos to blog comments, Twitter posts and reviews on Yelp or Amazon. Building a business that relies on content from your users sounds like a great idea, and in many cases has turned out to be exactly that — it’s usually cheaper than professionally produced content, for one thing (depending on what costs you include, of course). And many users care more deeply about the content they generate themselves than they do about the stuff that comes from the pros, which means deeper levels of engagement.

The problem with user-generated content, however, is the loss of control it involves. We’ve seen it play out in a hundred different ways, virtually everywhere that digital content appears, from the fight many news outlets are having over blog comments (which I wrote about on my personal blog this past weekend) to the issues that YouTube has had with repeated uploading of copyright-infringing videos. How do you get your users to generate the kind of content you want them to produce instead of the kind they want to produce?

Two recent examples of the downside of user-generated content come from Amazon and Yelp. The latter has been hit by repeated lawsuits from businesses alleging that negative reviews of their companies or retail outlets appeared after they turned down an offer from Yelp to advertise with the service. Yelp’s response, in part, has been that users post negative reviews for their own reasons, and that this is simply a function of how the service works (the Yelp blog has an official response to the lawsuits).

Likewise, Amazon has come under criticism recently for reviews of books that appear to have been posted by users who didn’t even read the book in question. As prominent investment writer and market strategist Barry Ritholtz described in a recent blog post, negative reviews of Michael Lewis’s latest book appeared before the book was even available on the market. “Considering the 1 star ratings/complaints about the Kindle edition were posted BEFORE THE BOOK was even released, they are utterly absurd,” he wrote. “Amazon needs to step up and delete these non-reviews of books. At the very least, they should not count in the book’s star ratings.”

In the Lewis case, the negative reviews appeared to be aimed at the publisher of the book, as retaliation for the fact that the book wasn’t released in digital format for the Kindle at the same time as the hardcover version. Ritholtz and other observers say these types of reviews occur routinely. They don’t seem to have any real purpose apart from simply registering a protest, since (as Ritholtz points out) the behavior they’re criticizing is that of the publisher, not the author (though it’s the author who is harmed by a negative review).

The loss of control involved with user-generated content has its humorous side as well, of course, something that is best illustrated by another popular Amazon phenomenon: namely, the bizarrely hilarious comments that seem to spring to life on the most prosaic product reviews, including a cheesy T-shirt with a picture of three wolves on it (“unfortunately I already had this exact picture tattooed on my chest, but this shirt is very useful in colder weather”) and a jug of Tuscan Whole Milk (“I always find it important to taste milk using high-quality stemware — this is milk deserving of something better than a Flintstones plastic tumbler”).

For companies like Amazon and Yelp (and Facebook and YouTube and Twitter and plenty of others), user-generated content has created the mother of all catch-22s: It causes them untold amounts of misery and headaches on a daily basis, and yet without those users and the content they produce, many of these companies would be severely diminished — and in some cases wouldn’t exist at all. For better or worse, they are married to their users, and divorce is not an option.

Post and thumbnail photos courtesy of Flickr user James Cridland


30 Responses to “The Downside of User-generated Content”

  1. Oh man, this is a tough one. Maybe along with the content we should have mandatory intelligence-quotient tests and bias tests to judge the validity of comments.

    User-generated content is sketchy if you want a valid opinion, but I do think you’re right: you can always find the diamonds in the rough, some genuine, well-thought-out, passionate information from someone who actually knows what they’re talking about and is actually involved in the subject.

  2. You raise a lot of good points about user-generated content. Although many, if not most, comments on media websites are stupid, ignorant, or -ist (fill in the blank) posts from trolls and flamers, sometimes someone comes along and writes an articulate, intelligent, and relevant response that enriches the original content of the page.

    YouTube is smart to bump up upvoted comments, which are usually of higher quality.

  3. Great points raised here about the risks of user-generated content. When those risks are managed cost-effectively, the benefits of user-generated content far exceed them. To manage the risks cost-effectively, I believe the key is a combination of “sticks” (risk-management guidelines executed by the moderators) and “carrots” (the right reward system to motivate users to contribute useful and relevant content). Take the example of online communities dedicated to customer support (e.g., the Intuit and Dell communities): it is a combination of moderators who are employees and incentives that reward users for their participation (e.g., different expert designations) that motivates ongoing, useful participation.

  4. It’s worth considering how Wikipedia has managed user-generated content. Of course, Wikipedia is less opinion-based than the likes of Amazon and Yelp, but their “strategy” seems to be playing out very well.

    From my understanding, they have a small team of pseudo-moderators and a base of 350,000+ volunteer writers. Of course there are stragglers as well. The concept of having a moderator to determine if content is appropriate or accurate could remedy some problems; in the case of the bad reviews of a book before it’s released, a moderator could have solved that problem instantly.

    On another note, one has to commend Amazon’s efforts at keeping reviewing content strong. Their Vine program and “Top (fill in number here) Reviewer” identification promote good writers and reviewers.

      • Maybe I’m just having problems with authority (not that I ever would), but moderators are only good at organizing “common views” (good or bad) or “facts.” I just sometimes have my own ideas about what is what, depending on context.

  5. I think it’s a technology problem with the comment systems.
    I don’t want somebody else moderating for me; that can only work as well as we (the moderator and I) share a common context. I, for one, would block the “oh, that’s so good” or “that’s just crap” comments, as well as the fanboy arguments on the subject that get repeated over and over, while a moderator working for people new to the subject, or for the ego of the writer, might let them in.

    The main problem is how we structure text: boolean logic is not a good tool for organizing ambiguous text. I personally prefer flow analysis, actually TTL. For example, the comment from SkimlinksJenny just yells “sales pitch,” or at least that is what my system tells me. The problem is that it takes an incredible amount of (parallel) CPU power to analyze text that way, and Google and Microsoft are stuck in 19th/20th-century thinking, chasing faster boolean algorithms (keyword sequences) to solve that problem. But with the current trend of more cores in every CPU generation, there’s hope.

    In Amazon’s case it’s pretty clear the raters are not talking about the book or the author; even their crappy system should be able to handle that. Come on, distinguishing between an author rating, a book rating, and a publisher rating can be done with boolean logic pretty easily.
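    [Ed.: a minimal sketch of the kind of rule-based target classification this comment describes, under the assumption that a handful of keywords can separate publisher complaints (Kindle editions, release timing) from comments about the author or the book itself. The keyword lists and the function name are invented for illustration, not taken from any real review system.]

    ```python
    import re

    # Hypothetical keyword lists -- invented for this sketch.
    PUBLISHER_TERMS = {"kindle", "publisher", "edition", "release", "price", "format"}
    AUTHOR_TERMS = {"author", "writer"}

    def classify_review(text: str) -> str:
        """Guess whether a review targets the publisher, the author, or the book."""
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & PUBLISHER_TERMS:
            return "publisher"  # e.g., protests about a missing Kindle edition
        if words & AUTHOR_TERMS:
            return "author"
        return "book"

    print(classify_review("One star until the Kindle edition is released!"))
    print(classify_review("A wonderful book, couldn't put it down."))
    ```

    Even this toy version shows the limits of the boolean approach: a review that praises the author while protesting the Kindle delay would land in a single bucket, which is why the human-validation point raised in the reply below still stands.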

    • Anthony

      The problem with algorithms checking written text has nothing to do with computing horsepower, but rather with the reality that there will still be a need for humans to constantly double-check the results. Spelling errors, grammar errors, slang, and idioms may all trip up the algorithms. Without constant human sampling and validation of the results, there is no other way to tell how well the algorithms are working, unless one wants to wait for angry feedback from users (like the disappearing Yelp reviews).
      We do a considerable amount of software testing, and while we make heavy use of automation to run the tests and analyze the results, at the end of the day it is humans who have to review and approve the results. No customer wants to hear excuses about automation when a problem occurs on site; they blame the people, not the computers.

      • Oh really:

        Simple example: I see a tree.

        Is ‘a’ an abstraction learned by the system, or is it “assigned”? What if I change that to ‘I see the tree’? Is new programming required, or has your system learned the abstractions and can it distinguish between them?
        Now suppose we are in a programming-office setting, talking about structures. Is “tree” a CS description of an abstraction, or the tree outside the window?

        The CPU power doesn’t go into analyzing words; it goes into analyzing abstractions, which the system has to build and learn on its own in order to “understand” the problem at hand. The problem is learning, not programming every possible combination of every possible usage of a word. As long as we are focusing on programming, we will get stuck with what you describe. “I see a trees,” “I see the trees”: now what?
        Rewrite? Why shouldn’t a system be able to learn to handle this?

  6. I’ve been running a couple of WordPress sites where I allow users to submit relevant content. Once it’s reviewed by our staff, it gets published; if there is a quality or copyright issue, we try to resolve it with the user. I’ve been looking for ways to minimize the review effort involved without losing control. I’d welcome your suggestions on this.

  7. Certainly paying to allow content to be posted is common. Ideally you want to give your site’s users the biggest voice you can, so any moderation has to be done very selectively, and with an explanation of why it happened. Transparency! (Yes, it bears repeating!)

  8. We faced this problem at WineAlign. We designed our business around the fact that users would want to contribute and share reviews and ratings of wines, that these would vary widely in quality, and that they would often be influenced by the wine consumption itself (i.e., “This is the best wine I have ever had (hic)!”). Our solution was that if the user pays for the service, then we ensure that reviews from professional wine critics are weighted much more heavily than general user reviews when creating wine rankings, thereby mitigating the UGC effect while still encouraging its creation.

    I’m not sure if any other business has approached the problem this way.

  9. The issue with Yelp is the lack of transparency in the process of vetting the reviews. Good ones come down, and there’s no transparent explanation. In social media, transparency is the key.

    • I agree, Brian — I think part of the problem is that Yelp has an algorithm that removes posts it thinks are spam or otherwise malicious, but it doesn’t really explain that very well when it happens.

  10. I would add to the Amazon examples. Most helpful review: “Solved Global Warming Locally.” It’s not a problem of user-generated content; it’s a problem of trying to sell a plain Ethernet cable for $999. The product just gets the reviews it deserves, and that shows the power of user content: honesty.

  11. This is nothing new, Mathew. When the term “user-generated content” encompasses blog comments, video uploads, and reader reviews, it can be misconstrued to mean anything. The problem is not with the users; it’s with the publishers and their responsibility to moderate the content. It’s not about “user-generated content” being a necessary evil for Facebook and Twitter and YouTube; it’s about those companies figuring out a way to moderate bad content out of the conversation so it doesn’t ruin the experience for other readers and users. You rarely hear complaints about comments, mainly because they are all moderated. Yelp and Amazon need to pay attention to moderation, or people won’t trust what they read there.

  12. Terrific post! There’s another model for harnessing user-generated content that not only spurs trust in the content, but creates extra value that people will pay for. It is Angie’s List, where people rate local service businesses, from plumbers to doctors. You have to be a member (from $20-50 a year) to rate a business or use the list. More than a million people have joined in over 200 cities, so they are pulling in roughly $30 million from membership alone, plus ad revenue that is restricted to those companies with the highest ratings. I spoke to members who told me they are happy to pay, because they trust user reviews that they know come from people who are committed to supporting and using the list. They appreciate having a pay barrier. Angie’s List has also created algorithms and a small team of people to look for and investigate suspicious reviews and weed out people trying to scam the ratings.
    It seems we are so focused on free review sites that we may be missing the value created when people have to commit, with a few dollars, to joining a community and maintaining the integrity and value of their collectively created content. I’m looking at new models for the news business, and I think membership — real membership that expects something from members, not just donations — has high potential for sustaining news organizations.

  13. There are always unintended consequences, but trying to place false comments (as Yelp is accused of doing) reflects very badly on a company; it brings their credibility down, so in a way it is a self-inflicted wound.

      • Scottix

        Of course they are going to deny it; it is all about reputation. The problem is the lack of evidence, and I think there are too many unknowns. Maybe it wasn’t Yelp, and the businesses were simply targeted and down-rated. Yelp can bring great publicity, and it can also bring negative publicity. Fair or not, there should be an opt-out option; businesses need some control over their promotion.