By now most of us are familiar with Google’s PageRank algorithm, or at least the principle behind it, whereby a web page is ranked based on who else is linking to it. One key aspect of blogs is that, while a few cover just about everything under the sun, most blogs have specific areas of focus, be it art, news, politics or what have you. Such information is potentially valuable in the context of search because a blog can announce its areas of focus — keywords, in effect — that can be taken into account by search engines, which would then know what topics a specific site tends to cover.
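
For readers who want the principle in miniature: PageRank’s core is an iterative computation in which each page’s score is fed by the scores of the pages linking to it. The toy graph and damping factor below follow the standard published formulation, purely as a sketch; they have nothing to do with Google’s production system.

    # Toy PageRank over three pages; links maps each page to the pages it links to.
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    damping = 0.85
    n = len(links)
    rank = {page: 1.0 / n for page in links}

    for _ in range(50):  # power iteration; 50 rounds is plenty for a toy graph
        new_rank = {page: (1 - damping) / n for page in links}
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank

    print(rank)  # "c" ranks highest: both "a" and "b" link to it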

Using existing meta tags within HTML, it would be pretty easy to create a de facto standard in which tags are used to place a blog, as well as individual posts within it, into categories or sets. For example, I used to publish a site, Telephony Design, that was specifically about telecom products and services, which would have been tagged with keywords like telecom, telecommunications, telephone systems, phone, etc.
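
To make that concrete, here is a minimal sketch of how a crawler could read such self-description from a page header, using the long-standing meta keywords convention. The Python below is purely illustrative; the class name and the sample header are mine, not part of any proposed standard or existing engine.

    from html.parser import HTMLParser

    class SelfDescriptionExtractor(HTMLParser):
        """Collect the comma-separated keywords a site uses to describe itself."""

        def __init__(self):
            super().__init__()
            self.keywords = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
                self.keywords = [k.strip().lower()
                                 for k in (attrs.get("content") or "").split(",")]

    # The kind of header Telephony Design might have carried:
    header = '<meta name="keywords" content="telecom, telecommunications, telephone systems, phone">'
    extractor = SelfDescriptionExtractor()
    extractor.feed(header)
    print(extractor.keywords)
    # ['telecom', 'telecommunications', 'telephone systems', 'phone']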

Imagine if most blog- and site-hosting services asked you to self-describe your site with up to a few dozen keywords. Of course, you can already do this with tags, but it’s unclear to what extent search engines use this information.

In this scenario, when searching via a search engine that recognizes these tags, you could issue a query like “gore (climate)” to get search results that are weighted by links from sites and blogs that describe themselves as climate-related. This isn’t the same as saying “gore AND climate,” because someone who blogs at a climate site might write something about Al Gore that isn’t, strictly speaking, climate-related. Essentially, this is a way of searching for a topic as ranked by the people (primarily bloggers) who declare it among their areas of focus.
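
As a sketch of what an engine might do with such a query, the toy index below weights a link only when the linking site declares the requested topic. Every name and number here is a hypothetical stand-in for a real link index.

    # Hypothetical index: the topics each site declares about itself.
    site_topics = {
        "climateblog.example": {"climate", "energy", "policy"},
        "tabloid.example": {"celebrities", "movies"},
    }

    # Hypothetical inbound links to pages matching the keyword "gore":
    # (linking site, target page, link weight)
    inbound_links = [
        ("climateblog.example", "al-gore-speech", 1.0),
        ("tabloid.example", "al-gore-speech", 1.0),
        ("tabloid.example", "horror-movie-gore", 3.0),
    ]

    def topic_weighted_score(page, topic):
        """Sum link weight, counting only links from sites that declare the topic."""
        return sum(weight for site, target, weight in inbound_links
                   if target == page and topic in site_topics.get(site, set()))

    # "gore (climate)": the climate blog's link counts, the tabloid's don't.
    print(topic_weighted_score("al-gore-speech", "climate"))     # 1.0
    print(topic_weighted_score("horror-movie-gore", "climate"))  # 0.0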

This isn’t a new idea, of course, since “meta keyword=” has been with us since the earliest days of the web. The trick is to create a subtle variation on search query syntax in which you’re asking, in effect, “Find X within sites that are usually about Y.” It’s a kind of poor man’s approach to the semantic web, but if enough sites and blogs used it, and popular search engines introduced a simple way to filter or weight search results based on it, the method should work.

An important point is that you’re not using the “meta” tag to emphasize a keyword, so your site isn’t more likely to show up if I do a search on “climate.” Instead, what the tag says is that you usually blog about “climate,” among other things. The actual keyword search is based on content elsewhere in the page; the meta tag merely describes a limited set of keywords the content is usually about. Another important point is that if spiders only recognize a limited number of these tags, maybe 20 or so per domain, it will be difficult to spam search engines by stuffing hundreds of tags into a page header.
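
A sketch of that anti-stuffing rule: a spider honoring only the first 20 tags per domain, so piling hundreds of keywords into a header gains nothing. The cap of 20 is simply the figure suggested above, not anything a real engine enforces.

    MAX_TAGS_PER_DOMAIN = 20  # the suggested cap, not any engine's actual policy

    def accepted_tags(declared_tags):
        """Normalize and deduplicate tags, honoring at most the first 20."""
        accepted = []
        for tag in declared_tags:
            tag = tag.strip().lower()
            if tag and tag not in accepted:
                accepted.append(tag)
            if len(accepted) == MAX_TAGS_PER_DOMAIN:
                break
        return accepted

    # Stuffing 500 keywords yields the same 20 the spider would have kept anyway.
    print(len(accepted_tags(f"keyword{i}" for i in range(500))))  # 20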

Is this a Google killer? Hardly. It seems like the kind of thing that could be added to existing search engines, Google included, pretty easily. This might seem like a trivial thing, but it should make search a lot smarter without burdening webmasters with the need to comply with an overly complex semantic web approach. This is also a simple and easily learned query style, so just as users have learned to combine keywords to improve search accuracy, they can use this approach to narrow search results by the type of source, in what amounts to a kind of fuzzy boolean search.

When it came to web services, REST won out over SOAP because of its simplicity, and I think the same thing could happen here. After all, this is something even a novice webmaster could do in a minute — all that’s needed are a few lines of HTML.

Comments

  1. I thought spam was a major problem, and the reason the major search engines all started ignoring such tags in the first place. Even if you limit the number of tags, a blog can still broadcast that it is about something it is not. A more effective approach, though one that involves more work for the search engine, would be to extract that kind of information from the content of the pages it crawls. It could then deduce a website’s overall category with a far reduced risk of gaming, SEO, etc.

  2. Cool idea, but I am not sure it will work. Most search engines (especially Google) don’t treat meta keywords as important at all; in fact, Google almost entirely ignores them. The reason is that they are too easy to use for spamming search results. Any time you have information that is displayed to search engines but not to users, it is open to abuse and search engine spam. Unfortunately, because not everyone in the world is 100% honest, I am not sure this will work that well.

  3. Hey Mike,

    Can you verify that Google doesn’t look at meta keywords? I’ve heard that meta tags are not effective, but I’ve also heard that nobody really knows for certain. So why not use them? If you have evidence that meta tags are an ineffective method of SEO, please let me know.

  4. Blogs are a better way of doing context-related search, but there’s a problem: how does a crawler come to know when a blog is updated? How frequently would the crawler have to revisit the blog to get the latest information? And will blog owners feel that crawlers are eating their bandwidth?

Comments have been disabled for this post