Jimmy Wales, the founder of not-for-profit Wikipedia and for-profit, San Mateo, Calif.-based Wikia, is one of a growing number of people who are uneasy about the control Google has over search. And he is doing something about it. His company, Wikia, last week bought the distributed crawler Grub from LookSmart and plans to make it available as open source. Not that LookSmart was really using it anyway, and the company also did ad business with Wikia.

Wales’ bet: just as Linux became a migraine for the monopolist of the last generation, open-source search tools will keep companies like Google honest. It is not an easy task, for Google is firmly embedded in our digital lives.

“Search is part of the fundamental infrastructure of the Internet. And, it is currently broken,” Wales said back in December 2006, when Wikia launched the Search Wikia effort. “Why is it broken? It is broken for the same reason that proprietary software is always broken: lack of freedom, lack of community, lack of accountability, lack of transparency.”

Wales launched Search Wikia earlier this year, and the Grub acquisition is part of that strategy. (You can run Grub on your Windows or Linux-based PC, either in the background or as a screensaver.) Following the announcement, we spoke with Wales, who explained that with Grub and other tools such as Lucene, open-source indexing software, innovation around search can thrive.

Marrying these search results with the human context provided by Wikia wikis could make the final search results actually useful once again. Grub, Lucene and Nutch (a web crawler based on Lucene) are the powder and spark of the open search revolution.

Grub is by no means the final move; it should be viewed as the first concrete step in a long-term strategy. Jeremie Miller, inventor of Jabber and the XMPP protocol, who is CTO of Wikia and is leading the Search Wikia effort, gave a talk at OSCON about the architecture of open-source search. Miller pointed out that monolithic search can be broken into three components, and that interested parties could implement one or more of them.

The three components are factories, which crawl and present content; collectors, which rate and rank content from multiple sources; and brokers, which direct user queries to the collectors or factories. Miller believes that this is a five-year process. Grub is one of the many components that will be needed to build a truly open-source search infrastructure. The biggest hindrance to any search start-up taking on Google (or Microsoft, Ask or Yahoo, for that matter) is the high cost of infrastructure.
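
As a rough illustration of the split Miller describes, here is a minimal Python sketch. The class names follow his three terms, but everything else (the `Result` record, the term-count scoring, the topic-based routing and the example URLs) is an assumption made for illustration, not Search Wikia's actual design.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Result:
    url: str
    score: float
    source: str  # name of the factory that produced this result


class Factory:
    """Holds content from one source and returns matches for a query."""

    def __init__(self, name: str, pages: Dict[str, str]):
        self.name = name
        self.pages = pages  # url -> page text, standing in for crawled content

    def search(self, query: str) -> List[Result]:
        terms = query.lower().split()
        results = []
        for url, text in self.pages.items():
            words = text.lower().split()
            # Hypothetical scoring: count how often the query terms appear.
            hits = sum(words.count(term) for term in terms)
            if hits:
                results.append(Result(url, float(hits), self.name))
        return results


class Collector:
    """Merges and ranks results from several factories."""

    def __init__(self, factories: List[Factory]):
        self.factories = factories

    def search(self, query: str) -> List[Result]:
        merged = [r for f in self.factories for r in f.search(query)]
        # Highest score first; ties broken by URL for a stable order.
        return sorted(merged, key=lambda r: (-r.score, r.url))


class Broker:
    """Routes a user query to a collector chosen by topic."""

    def __init__(self, collectors: Dict[str, Collector], default: str):
        self.collectors = collectors
        self.default = default

    def query(self, text: str, topic: str = "") -> List[Result]:
        # Unknown or empty topics fall back to the default collector.
        collector = self.collectors.get(topic, self.collectors[self.default])
        return collector.search(text)


# Example wiring: two factories feed one collector behind a broker.
wiki = Factory("wiki", {
    "http://example.org/a": "open source search engine",
    "http://example.org/b": "cooking recipes",
})
crawl = Factory("crawl", {"http://example.com/c": "search search index"})
broker = Broker({"general": Collector([wiki, crawl])}, default="general")
results = broker.query("search")
```

The point of the separation is that each piece can be run by a different party: anyone could operate a factory over their own content, while a collector or broker only needs the shared `search` interface to use it.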

Sure, Amazon’s EC2 service has helped, but it isn’t enough. Google, thanks to its money machine, has been able to build an infrastructure that lets it crawl, index and show results at a faster pace. Even if a start-up comes up with a better algorithm, it still needs to sink millions into infrastructure just to get into the business and offer an experience as fast as the one most people associate with Google.

Grub, on the other hand, is a way to build a massive, distributed, user-contributed processing network. Another nascent but promising open-source P2P search engine is Yacy, which comes out of Germany. (Also check out Faroo, a German P2P search start-up.)

Can it work?

Wales faces an uphill climb. First, he has to ensure that enough people are using Grub and, more importantly, are hacking enhancements to the software. At the same time, he has to address other concerns, as pointed out by this commentator on Search Engine Land and other blogs.

While Google might be impossible to beat in a full-frontal assault, it is vulnerable to smaller, more focused attacks. While Linux may not have been able to kill Microsoft, it has stolen opportunities from the OS giant. It has been particularly effective in Internet infrastructure (data centers).

Open-source search can do precisely the same: take away opportunities from large search engines. Perhaps, as with Linux, we will see a shift away from Google, and venture capitalists, long scared by the prospect of competing with Google, will loosen their purse strings.

If Linux ended up spawning devices as diverse as TiVo and mobile phones, open-source search can lead to many more specialized search engines, also called vertical search engines. Today, building a good vertical search engine costs millions of dollars. However, building and operating a vertical search engine is not for the faint of heart.

In an interview with Fast Company magazine earlier this year, Wales quipped:

“The other thing we’re looking to is some of the second-tier search companies,” he admits. “We’ve talked to–I can’t say who–different people, asking, would they be better off participating in a project that helps quality search results to become a commodity?”

Put another way, Wales is hoping for death by a thousand cuts for the search incumbents.

More @ Resource Shelf.

  1. Great article, very exciting stuff. You should fix the link to Grub, however. It’s grub.org not grub.oom. Also, looks like you forgot to close the link tag after nutch.org because it goes on for several paragraphs.

  2. Andrew Parker Monday, July 30, 2007

    You left an anchor tag open for multiple paragraphs.

  3. Wow. I agree a change to that extent will take time, but with the movement towards open source “everything” the internet might be ready to handle this. The real question is how far are hackers and software developers willing to go to really make this mean anything at all against giants like google?

  4. Andrew,

    thanks for the catch. fixed it. something went wrong when posting in wordpress from my blog editor.

  5. Just when I blast you, you come out with a good, albeit dated, article.


  6. You need an editor. The grammar mistakes make this barely readable.

  7. The big issue with Wikia in particular is how many users will it find who are willing to contribute, considering that it is a for-profit entity. I for one, contributed heavily to Wikipedia, but will not touch Wikia.

    On the other hand, Grub looks really interesting, and should be fantastic. However, I think Google won’t be too worried, since it is good enough for most people, and it is far too entrenched in our collective mindset.

  8. [...] Wikipedia founder Jimmy Wales would like you to help him build the revenue for his new “for profit” venture Wikia. Wikia has acquired the distributed crawler Grub from LookSmart and Wales plans to make it open source. He’d like to invite the community to line his coffers. [...]

  9. Google vs Jimmy Wales & Open Source Search

    This story has been submitted to Stirrdup. Your support can help it become hot.

  10. I bet Google is laughing hysterically now. Jimmy’s biting off more than he can chew. From the outside everybody’s a genius.

    “You need an editor. The grammar mistakes make this barely readable.” He needs to learn how to write. In the short deadline world of blogging, the bloggers need to know grammar, know when to look things up in dictionaries (hyphen? open? closed?), and know the AP or NYT style manual.

