Saturday, August 28, 2010

The language hole in Blogger's comment spam filtering

This comment evaded Blogger's comment spam filters:
На нашем Видео каталоге вы можете Посмотреть Видеосюжеты, захотите на Простой видео сайт
It contained an embedded link.

The language is Russian [1]. Google Translate gives us:
On our video directory, you can watch video projects, you want for a simple video site
Blogger needs to treat non-blog-language posts as spam, or at least require review with auto-translation to the native language. This one should have been caught.
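
To make the idea concrete, here's a rough sketch of the kind of check I have in mind. This is not how Blogger works; the function names (dominant_script, needs_review) and the blog_script parameter are made up for illustration. It uses the writing script as a crude stand-in for language, so it would catch Cyrillic on an English blog but not, say, French.

```python
import unicodedata
from collections import Counter

def dominant_script(text):
    """Return the most common script among the letters in text, or None."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # Unicode character names begin with the script, e.g.
            # "CYRILLIC SMALL LETTER A" or "LATIN CAPITAL LETTER B".
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def needs_review(comment, blog_script="LATIN"):
    """Hypothetical rule: hold a comment whose main script differs from the blog's."""
    script = dominant_script(comment)
    return script is not None and script != blog_script
```

Run against the comment above, needs_review flags it, while an ordinary English comment passes through. A real filter would want proper language detection, but even this much would have held the comment for review.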

If I get another one like this (presumably later today) I'll have to turn on authenticated commenting until Google catches up.

[1] Why would anyone bother to post spam in a language the blog's readers cannot read? Well, for one thing, Google's search engine can "read" the link, and so the source gets link karma. It's also a way to find vulnerable blog targets to further exploit. I'm sure there are other benefits, such as very foolish people curiously clicking on the link; those people likely have vulnerable machines that are easy to pwn.

