If you have email, you get unsolicited email messages called spam. What you might not have realized, however, is that web sites get spammed too, with bogus articles, comments, or discussion board entries that are intended to add links to another web site rather than further the discussion.
In the world of weblogs ("blogs," for short), spam is reaching epidemic proportions, and fighting blog spammers has become a serious effort for the online community. For some bloggers, it's a matter of survival, as they drown in the dozens or even hundreds of bogus comments that software applications add every night.
The problem is that you don't want to block everyone from posting comments, because the dialogue that ensues from an interesting weblog posting makes blogs compelling reading. On my weblogs, I have comments added to articles that are months or even years old; other articles have garnered 50 or more comments.
It's the same dilemma as with email, isn't it? You want to filter out all the spam without accidentally blocking a legitimate message.
Rather than try to build blacklists, blockers, filters, or other mechanisms into weblog software, however, in January 2005 a group of bloggers proposed removing the value of links from blogs to third-party sites instead. Critically, this required the participation of the major search engines, and as of this writing Google, Yahoo!, and MSN have all signed on and now support the resulting "nofollow" link attribute.
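To make the idea concrete, here is a minimal sketch of how weblog software might mark every link in a visitor's comment with rel="nofollow" before displaying it. The function name add_nofollow and the regular expressions are illustrative assumptions, not code from any particular blogging package, and a production system would more likely use a real HTML parser than pattern matching.

```python
import re

# Hypothetical, simplified patterns: they assume reasonably well-formed
# comment HTML and no ">" characters inside attribute values.
ANCHOR_RE = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
REL_RE = re.compile(r"""\s+rel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every opening <a> tag."""

    def rewrite(match: re.Match) -> str:
        # Drop any rel attribute the commenter supplied, so it cannot
        # be used to override the nofollow marker.
        attrs = REL_RE.sub("", match.group(1))
        return f'<a{attrs} rel="nofollow">'

    return ANCHOR_RE.sub(rewrite, comment_html)


if __name__ == "__main__":
    comment = 'Nice post! Visit <a href="http://example.com/">my site</a>.'
    print(add_nofollow(comment))
```

The link still works for a human reader who clicks it; the attribute is only a signal to participating search engines that the link should not count in the target site's favor.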
Why Do Spammers Want Links from Blogs?
If you're not a search engine maven, you might not be aware that one of the key criteria a search engine like Google uses to decide which match to list as #1 versus #434 for a given search is how many other sites link to the matching site. If a site shows up higher in the search results because it has more inbound links from other sites, it should be no surprise whatsoever that a major goal for a web site owner is to create as many inbound links as possible.
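As a rough illustration of why inbound links are worth chasing, the sketch below ranks sites purely by how many distinct other sites link to them. This is a deliberately naive stand-in, not how Google or any real search engine computes its rankings, and the site names and the functions count_inbound_links and rank_sites are made up for the example.

```python
from collections import Counter


def count_inbound_links(links):
    """Count distinct linking sites per target.

    links is an iterable of (source_site, target_site) pairs.
    """
    # Count each linking site once per target and ignore self-links.
    unique = {(src, dst) for src, dst in links if src != dst}
    return Counter(dst for _, dst in unique)


def rank_sites(links):
    """Return target sites ordered by inbound link count, highest first."""
    counts = count_inbound_links(links)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts, key=lambda site: (-counts[site], site))


if __name__ == "__main__":
    links = [
        ("blog-a", "good-site"),
        ("blog-b", "good-site"),
        ("blog-a", "spam-site"),
    ]
    print(rank_sites(links))

    # Bogus comments planted on two more blogs push the spam site ahead.
    links += [("blog-b", "spam-site"), ("blog-c", "spam-site")]
    print(rank_sites(links))
```

Under this toy scoring, every bogus comment that lands on a new blog adds one to the spammer's count, which is exactly the incentive the rest of this section describes.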
In the world of Google, the importance of your site is referred to as its PageRank, though the full picture is more complex than that: PageRank refers to the popularity of your site, while search results show the pages on various sites that match a given search query.
Legitimate sites accomplish high PageRank by having great content, compelling writing, a witty or unusual perspective, or lots of friends. That's the promise of the vox populi foundation of the World Wide Web and modern search engine results. But if you're building a porn site, a gambling site, or something else that isn't likely to inspire people to link to your site, a tool that automatically adds links to your site from other sites by injecting bogus comments is going to be very interesting.
Like distributors of other unpopular materials, these spammers shrug when asked how they could do something so antisocial, and genuinely couldn't care less about the weblogs that they're polluting. It's all about them and their inbound links, and everything else is secondary to that magical #1 spot on Google.