Cpedia Founder Says Errors Are "Intentional"

Cpedia, an attempt to create automated encyclopedia-style articles from search results, was recently launched to less-than-enthusiastic reviews (including one from me). The encyclopedia was created by Cuil, a search engine that also got a less-than-positive response from users and reviewers when it launched in 2008. You might think that after the rhetorical beating Cuil (pronounced “cool”) took when it emerged into the world, founder Tom Costello might have developed a thick skin when it comes to criticism. But you would be wrong. In a long blog post responding to the bad reviews for Cpedia, the Cuil CEO — who created the search engine with his wife, former Google executive Anna Patterson — lashed out at his critics, calling them “vituperative” and “haters.”

Costello suggested that most of the criticisms came from writers who searched for their own names, but just aren’t that noteworthy, saying:

“Cpedia does very badly with people who write much more on the web than people write about them. Given the 1 billion people on the web one might think this unlikely, but it happens. When we try to summarize the information mentioning these people, we run into a problem. Almost none of it is about them. It’s about random things they have opined on. Dave Parrack, Farhad Manjoo, Louis Gray, I’m talking about you.”

The other complaint (which was the central point of my post) was that the entries simply didn’t make any sense, even when they were about someone well-known enough that there was plenty of information to pull together. In response, the Cpedia founder launched into a bizarre description of how the Christian Brothers who taught him Irish when he was a child used to beat him with straps until he got his vocabulary right, and how his Irish was technically correct but had no “blas.” That’s apparently an Irish term for the polish that players of the Irish sport of hurling get on their sticks after playing for a long time (I’m not sure that’s correct though — Wikipedia says the top of the hurling stick is called the “bas,” and an Irish dictionary says the word “blas” means “taste”).

Costello also says that what Cpedia is doing is *not* trying to pull together all the information about a topic and make sense of it. Instead, he says, it’s trying to find the undiscovered, unique pieces of information, such as the fact that a VC he was meeting with apparently “has a tendency to over-imbibe.” Because the encyclopedia’s engine removes duplication, “unique ideas have more chance of coming to the top,” he says. And finally, Costello says that Cpedia “has errors” and that this is “intentional,” because “we have tried to be inclusive, and dredge to the bottom of the web.”

So if what you’re looking for is an automated encyclopedia entry that doesn’t make sense of things, intentionally has errors, and dredges the bottom of the web, then Cpedia has got what you need.

Some commenters on our post and on Twitter said that criticizing Cpedia was unfair, and that, as Hunch co-founder Chris Dixon put it, the company was trying to solve an interesting problem. And there’s no question that trying to turn search results into automated, encyclopedia-style articles is a hard problem. Will Cpedia get better and eventually solve that problem? Perhaps. But it’s a long way away from that right now.

Post and thumbnail photos courtesy of Flickr user acordova

8 Responses to “Cpedia Founder Says Errors Are "Intentional"”

  1. This is the key message in Tom’s post: “When people search the web for information, a lot of times the first few results do not contain all the information there is about the subject. Almost no one can continue through all the other pages, because they are almost all regurgitations of the same material, with perhaps a few extra nuggets. Cpedia processes all the pages about a topic, and extracts the unique ideas.”

    Cpedia is not about collecting and organizing knowledge — it’s about speeding the discovery process for the user.


    Google’s search results are cluttered with so much “brand value” crap that you often have to change your queries or scroll through pages of database-served, keyword-injected drivel from all of Google’s favorite sites before you get to the unique content.

    There’s a lesson to be learned here on both sides of the equation: first, Cuil needs to figure out what its real value proposition is and put that forward; second, Cuil really is providing value that Google just doesn’t seem to have figured out.

    Cpedia is nothing to Wikipedia because Wikipedia’s “facts” are constantly changing; you cannot trust what it says. At least Cpedia’s facts won’t change until the unique content on the Web changes. Cpedia is about what is actually on the Web; Wikipedia is about being Alice in WebWonderland with no narrator to help you figure out why everything looks psychedelic.

    If only they had announced Cpedia this way in the first place.

  2. Let me get this right: you do not speak Irish but you presume to criticise the guy’s knowledge of his own language based on what you can find with a search engine? And the guy has a PhD from Stanford in computer science and he works on search?

    Criticise the search results all you like but that is offensive and unworthy.

    In any case, it seems you’ve missed the point. My curiosity was aroused by your post so I went to read the original blog post. It addresses the issues pretty fairly.

    Your headline and interpretation are dishonest. The article doesn’t say that “errors are intentional” in the sense of deliberate, but that some are unavoidable in a maximally inclusive search and that, in effect, a tradeoff is involved.

    It’s the 2nd deliberate misreading I’ve read today. Here’s the other:

    Are you a graduate of the Sarah Palin school of journalism? Of any school of journalism?

  3. The interesting thing from my point of view is that I do think the duplication of search results is a particular problem with search engines that has not been adequately resolved yet. Sure, Google presents 15 pages of results, but how many people go beyond the first three or four when looking for a relevant and useful link? More likely is rewording the search to try and get more targeted results.
    It would be great if his system for condensing results to make options more unique actually worked – unfortunately I agree with the consensus that Cpedia hasn’t actually found a useful way yet to make search results that much more useful than they currently are.

  4. I’m amused to be listed as a “hater”. I said CPedia was to Wikipedia as Cuil was to Google. The results are ridiculous. Do vanity searches fail? Of course. But so does everything else. It’s a mess.