It's All In The [Search] Packaging


Back in the 1990s, searching for something on the web was like looking for a needle in a haystack. Then Google (GOOG) came along, and it was as if someone had handed us a magnet with which finding the good (and the relevant) became downright easy.

The post-bust social media boom, however, is bringing an end to the good times. The explosion of content on the Internet is making searching for information difficult once again. It is one of the reasons we are seeing a sharp increase in the number of companies (vertical search engines and wiki-based collaborations, for example) that want to help us find the information for which we’re looking. This trend, let’s call it smart content aggregation, is something I wrote about back in March for Business 2.0.

These companies are trying to insert themselves into the proverbial three-page paradigm that has been popularized by Google: page one is the search box; page two, the search results page; and finally, page three, the final destination that holds the coveted information. It’s the “second page” that’s getting crowded, and it’s forcing average web surfers to look for simpler options.

One option is Jason Calacanis’ Mahalo, which uses wiki software to create specialized topic pages, aggregated by editors, that quickly point searchers to the best web resources available. (Of course, from an advertiser’s perspective, these topics are also among the most sought-after “keywords.”) He isn’t alone in trying to insert his company between the searcher and the original content source; Yahoo (YHOO) has been experimenting with aggregation mashups as well.

Earlier this week, Kosmix, a Mountain View, Calif.-based startup that is taking an algorithmic approach to aggregating content, launched two beta versions of specialized topic-based pages, RightAutos and RightTrips; it also formally launched its health-focused property, RightHealth. The company builds specialized start pages on topics within topics (such as osteoarthritis in its health section) that include everything from videos to news to special reports, all supported by advertising. Kosmix claims its beta version of RightHealth currently gets more than 2.5 million visits and generates 9 million searches a month.

Wikia, another company that is aggregating content, is now doing about 250 million page views. Wikipedia, the biggest such operation (and a not-for-profit org), has become the 9th-largest site on the web in terms of unique visitors, according to comScore.

These traffic trends reflect a desire on the part of web surfers to find information smartly aggregated for them. But those surfers still want to start with Google. Hitwise tells us that in August, 47.2 percent of Wikipedia’s traffic came from Google, up 8 percent over the same month last year. Meanwhile, Mahalo, which launched in May, saw 53.3 percent of its traffic come from Google in August, a 49-percent jump from July. For the same month-to-month timeframe, Google sent over 8.37 percent of Wikia’s traffic.

While the jury is out on the success of the aforementioned startups, the need for smart aggregators is only going to increase as more content starts to come online. As I wrote in the March issue of Business 2.0:

Hyperaggregation is simply a way to do in the new-media world what old media has done for centuries: neatly package information. The value of a newspaper, after all, is not the information inside as much as the carefully considered layout of the front page. At a glance you can see what’s important. Smart new companies are finally figuring out how to do this online, where there’s too much content and not enough packaging.

The aggregation is going to pose a challenge for some of the traditional content sources. A lack of finely tuned information sources is one of the reasons content publishers get “wasted” clicks. The aggregators can take away some of that sloth by making seek-and-search more efficient. Wikipedia’s pages, for example, are incredibly detailed and often include enough information to obviate the need to search any further. That should be a scarier prospect than Google adding news wires to Google News and subsequently taking a bite out of newspapers’ online traffic.


Greg Linden

Om, on your quote from your article in Business 2.0 magazine:

“At a glance you can see what’s important. Smart new companies are finally figuring out how to do this online, where there’s too much content and not enough packaging.”

I think one question to ask is whether the one-size-fits-all model necessary for off-line publications like newspapers makes sense for online aggregators of news.

Online, there is an opportunity to create a different package for each individual, personalizing what is important. In new media, information could be packaged on a one-to-one basis.

Andrew Thomas

Mitchell Quinn said: “not to mention so called “consumer review” sites such as Ciao, Kelkoo, Pocket-lint, etc, which provide poor quality information on the hope of a sales hit.”

Absolutely. Why doesn’t Google just ban these useless market comparison sites from its search results? They’re just another layer of crap we have to wade through before getting to what we want.

Gary Prosser

The challenge for software tool developers in search is emulating human judgement. In the meantime it’s interesting that (Open Directory), which is useful but limited, is still ranked 471 (by Alexa). Goog could usefully think about encouraging and cooperating with vertical search sites that have a strong human, therefore contextualised data, element.

Lydia Porter

Back in the 1980s, searching for information was something you needed a specialist for. Someone who was trained and experienced in not only knowing how to interrogate specialist databases, but how to identify the most reliable and valid results. When Google came along, most people decided it was easy enough to do the interrogation and validation themselves. You want to take a chance on unreliable, biased, inaccurate information, try Wikipedia. You want aggregation? Try a library. Maybe what is required is the specialist services of a trained human being again. Try a librarian.

Don Jones

Goog has simply established itself as the technologically best solution for the greatest number of people. It doesn’t preclude vertical sites from specializing to provide even more relevant results, say for venture capital, than Goog’s 63,900,000 results…

Brent Hopkins

One important consideration which some of the other commenters have sort of hinted at is exclusion of unwanted information or results. Something as simple as a button to “exclude this site from future results” would go a long way toward eliminating the dross. Google’s Web History feature would seem to position them as a natural leader in this area. I have even suggested to them on several occasions that a simple user-ranking of results and particularly viewed results would be very powerful. Apparently that doesn’t fit their business model as an advertising company. Or maybe it’s a dumb idea, but I would like to see someone give it a try.

Mitchell Quinn

It’s not the explosion of information which has made searching difficult, it’s the explosion of web bottom feeders, such as link aggregating sites with automatically generated pages which are nothing more than scraped scraps of text and associated advertising, not to mention so called “consumer review” sites such as Ciao, Kelkoo, Pocket-lint, etc, which provide poor quality information on the hope of a sales hit. In general I have no problem finding the information I am searching for unless my item of interest happens to fall into the hit zone of aforementioned sites, in which case I have to manually trawl through pages of garbage from companies who provide nothing of value to anyone.


Oh boy.
All these people throwing around terms without a clear definition.
human brain != boolean computer
human information = data in context
human context = related (data,events,knowledge, emotions …)
[ very simplified, but I don’t want to break it down to Consciousness]

In other words if you don’t know what I’m working on, you provide data to me. Which I put into context, thereby creating information.

Now let’s plan a business trip to Mountain View.
I create a meeting in my Calendar with a location in Mountain View. The system creates a Workspace on my Computer under which all data for this meeting will be aggregated and preserved from now on. If I get flight confirmation the system checks the date and relates it to the meeting and ….
In other words if I go on my system to a specific Workspace all information for this context is there. Now if I share this with a back end server wouldn’t the search be much more accurate?

Come on now, Microsoft could build this in 2 years if they would use their dead brains. Google will struggle, since they have no local context. But they are working on it. And the rest …?

Ok I will get a coffee now, getting far too agitated.


It depends on what stage of Google you’re talking about. To begin with, it was certainly a revolution in terms of search. However, as people learned to manipulate it, and as it seems to focus more heavily on the advertising that keeps it in business, the quality of its results has gone down.


Working for a vertical B2B search/directory site, I know that there are times when our site will have information available about a company that doesn’t even have a website. The role of specialized or vertical search sites will continue to grow as 2.0 content and apps take root.

surya narayana saripalli

Will there be a digital divide in providing separate servers for Web 2.0 and Web 3.0 technologies?
The answer is pending, as some more navigational [air] [personal in transit in traffic] and personalized web pages are going to jam the Net.

Abhay Shete

Have been a regular reader of this blog. Wikipedia has become a great site to access “information” rather than the “search-find-click paradigm” as you mentioned. We are a three-person startup based in Pune, India. We have been working on an “information-engine” which delivers “information” to the end users, rather than links to web pages. Currently Wikipedia is manually driven; we have created a system which automates and rapidly accelerates the creation of information. To give an analogy, Wikipedia is like the early automotive industry, when the assembly line was manually operated. Our system is the equivalent of a modern automated assembly line. The role of humans changes to setting parameters and inspecting results.

We have also used interesting tools to present this information to the end users. We just had an internal alpha release and would love to share some search results and further details with you.


Julio Franco

I say it won’t (replace Google). The reason Wikipedia is so useful is that it’s a central place for “everything”, just like Google is, and the only way they are able to handle it is with heavy user input.

If you try to aggregate content on your own (even with a huge staff), you end up with a clone. Then if you have various dedicated portals for niche topics, how is that different from the current Internet?

We already have quality sites offering niche content, and those who plan to stay in the longer run better adapt for increased user interaction and input. Isn’t that what we are looking at and still heading towards with the next generation of the Web?

Om Malik

I agree with you 100%. I think it is one of the more efficient ways of finding information. I am just wondering if this actually will replace the search-find-click paradigm we are used to now thanks to Google.


“Wikipedia’s pages… are incredibly detailed and often include enough information that obviate the need to search any further.” More often, I find that the Wikipedia page provides a better set of links than a Google search.
