
Back in the 1990s, searching for something on the web was like looking for a needle in a haystack. Then Google (GOOG) came along, and it was as if someone had handed us a magnet with which finding the good (and the relevant) became downright easy.

The post-bust social media boom, however, is bringing an end to the good times. The explosion of content on the Internet is making searching for information difficult once again. It is one of the reasons we are seeing a sharp increase in the number of companies (vertical search engines and wiki-based collaborations, for example) that want to help us find the information for which we’re looking. This trend, which I’ll call smart content aggregation, is something I wrote about back in March for Business 2.0.

These companies are trying to insert themselves into the proverbial three-page paradigm that has been popularized by Google: page one is the search box; page two, the search results page; and finally, page three, the final destination that holds the coveted information. It’s the “second page” that’s getting crowded, and it’s forcing average web surfers to look for simpler options.

One option is Jason Calacanis’ Mahalo, which uses wiki software to create specialized topic pages, aggregated by editors, that quickly point searchers to the best web resources available. (Of course, from an advertiser’s perspective, these topics are also among the most sought-after “keywords.”) He isn’t alone in trying to insert his company between the searcher and the original content source; Yahoo (YHOO) has been experimenting with aggregation mashups as well.

Earlier this week, Kosmix, a Mountain View, Calif.-based startup that is taking an algorithmic approach to aggregating content, launched two beta versions of specialized topic-based pages, RightAutos and RightTrips; it also formally launched its health-focused property, RightHealth. The company builds specialized start pages on topics within topics (such as osteoarthritis in its health section) that include everything from videos to news to special reports, all supported by advertising. Kosmix claims its beta version of RightHealth currently gets more than 2.5 million visits and generates 9 million searches a month.

Wikia, another company that is aggregating content, is now doing about 250 million page views. Wikipedia, the biggest such operation (and a not-for-profit org), has become the 9th-largest site on the web in terms of unique visitors, according to comScore.

These traffic trends reflect a desire on the part of web surfers to find information smartly aggregated for them. But those surfers still want to start with Google. Hitwise tells us that in August, 47.2 percent of Wikipedia’s traffic came from Google, up 8 percent over the same month last year. Meanwhile, Mahalo, which launched in May, saw 53.3 percent of its traffic come from Google in August, a 49-percent jump from July. For the same month-to-month timeframe, Google sent over 8.37 percent of Wikia’s traffic.

While the jury is out on the success of the aforementioned startups, the need for smart aggregators is only going to increase as more content starts to come online. As I wrote in the March issue of Business 2.0:

Hyperaggregation is simply a way to do in the new-media world what old media has done for centuries: neatly package information. The value of a newspaper, after all, is not the information inside as much as the carefully considered layout of the front page. At a glance you can see what’s important. Smart new companies are finally figuring out how to do this online, where there’s too much content and not enough packaging.

This aggregation is going to pose a challenge for some of the traditional content sources. A lack of finely tuned information sources is one of the reasons content publishers get “wasted” clicks. The aggregators can take away some of that sloth by making seek-and-search more efficient. Wikipedia’s pages, for example, are incredibly detailed and often include enough information to obviate the need to search any further. That should be a scarier prospect than Google adding news wires to Google News and subsequently taking a bite out of newspapers’ online traffic.


  1. “Wikipedia’s pages… are incredibly detailed and often include enough information that obviate the need to search any further.” More often, I find that the Wikipedia page provides a better set of links than a Google search.

  2. I agree with you 100%. I think it is one of the more efficient ways of finding information. I am just wondering if this actually will replace the search-find-click paradigm we are used to now thanks to google.

  3. I say it won’t (replace Google). The reason Wikipedia is so useful is that it’s a central place for “everything”, just like Google is, and the only way they are able to handle it is with heavy user input.

    If you try to aggregate content on your own (even with a huge staff), you end up with an About.com clone. Then if you have various dedicated portals for niche topics, how is that different from the current Internet?

    We already have quality sites offering niche content, and those who plan to stay in the longer run better adapt for increased user interaction and input. Isn’t that what we are looking at and still heading towards with the next generation of the Web?

  4. Hello,
    Have been a regular reader of this blog. Wikipedia has become a great site to access “information” rather than the “search-find-click paradigm” as you mentioned. We are a three-person startup based in Pune, India. We have been working on an “information-engine” which delivers “information” to the end users, rather than links to web pages. Currently Wikipedia is manually driven; we have created a system which automates and rapidly accelerates the creation of information. To give an analogy, Wikipedia is like the early automotive industry, when the assembly line was manually operated. Our system is the equivalent of a modern automated assembly line. The role of humans changes to setting parameters and inspecting results.

    We have also used interesting tools to present this information to the end users. We just had an internal alpha release and would love to share some search results and further details with you.


  5. Second Brain – Organize Everything in Your Personal Internet Library Friday, September 14, 2007

    The time for smart aggregators

    We think Google and the rest of the search crowd will be the most important traffic source for Second Brain. We are in the business of aggregating our users internet content into a personal library – a true vertical integrator for the individual. Makin…

  6. surya narayana saripalli Friday, September 14, 2007

    Will there be a digital divide in providing separate servers for Web 2.0 and Web 3.0 technologies?
    The answer is pending, as some more navigational [air] [personal in transit in traffic] and personalized web pages are going to jam the Net.

  7. See – no one believed me when I said Google broke the internet.

  8. Working for a vertical B2B search/directory site, I know that there are times when our site will have information available about a company that doesn’t even have a website. The role of specialized or vertical search sites will continue to grow as 2.0 content and apps take root.

  9. It depends on what stage of Google you’re talking about. To begin with it was certainly a revolution in terms of search. However, as people have learned to manipulate it, and as it seems to focus more heavily on the advertising that keeps it in business, the quality of its results has gone down.

  10. Oh boy.
    All these people throwing around terms without a clear definition.
    human brain != boolean computer
    human information = data in context
    human context = related (data,events,knowledge, emotions …)
    [ very simplified, but I don't want to break it down to Consciousness]

    In other words if you don’t know what I’m working on, you provide data to me. Which I put into context, thereby creating information.

    Now let’s plan a business trip to Mountain View.
    I create a meeting in my Calendar with a location in Mountain View. The system creates a Workspace on my Computer under which all data for this meeting will be aggregated and preserved from now on. If I get flight confirmation the system checks the date and relates it to the meeting and ….
    In other words, if I go on my system to a specific Workspace, all information for this context is there. Now if I share this with a back-end server, wouldn’t the search be much more accurate?

    Come on now, Microsoft could build this in 2 years if they would use their dead brains. Google will struggle, since they have no local context. But they are working on it. And the rest …?

    Ok, I will get a coffee now, getting far too agitated.
