Comment Spammers 1, Me 0
DOWNTOWN LOS ANGELES -- I don't like interrupting for technical stuff here, but I have to let you know that the comment spam that I'm sure you see over in the recent comments is about to drive me crazy. It's been especially steady over the past few days -- I've knocked off 600+ comments in the last three. I'm going to keep knocking it down as soon as I see it, and I'm going to work on implementing some sort of technical means of fighting it, but in the meantime just bear with me.
If you care about the tech bit of things, I'll discuss that after the jump.
The spams are coming from a wide range of IP addresses, so I don't think any sort of IP filtering is going to cut it. I've thought about just doing simple wordlist censorship, but that feels like a very inelegant approach to me.
My current thought is some sort of Bayesian filter that gets injected into the comment system and scores each message to decide what status it should be set to. I've been giving comment spam a status of 0 over the past few days (invisible, as opposed to the normal 1). That means I have a base set of 200 good comments and 600 spam comments to train a filter with. Since I'm not running something off the shelf like MT or WordPress there's no just plugging in a pre-built system, but my code's fairly straightforward, so it shouldn't be that hard to hook something in and use it for scoring.
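For the curious, here's a rough sketch of what that kind of scoring could look like. It is not the code running on this site: it assumes a plain naive Bayes classifier with add-one smoothing, and the names (`NaiveBayesFilter`, `train`, `status_for`) and the 0.9 threshold are made up for illustration. The only thing borrowed from the real system is the status convention, 0 for spam and 1 for a normal comment.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the comment and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayesFilter:
    """Hypothetical comment scorer: status 0 = spam, status 1 = normal."""

    def __init__(self):
        self.word_counts = {0: Counter(), 1: Counter()}  # status -> token counts
        self.doc_counts = {0: 0, 1: 0}                   # status -> comments seen

    def train(self, text, status):
        """Record one already-classified comment."""
        self.doc_counts[status] += 1
        self.word_counts[status].update(tokenize(text))

    def spam_probability(self, text):
        """Return the estimated probability that the comment is spam."""
        vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        log_scores = {}
        for status in (0, 1):
            # Smoothed log prior for this class.
            score = math.log((self.doc_counts[status] + 1) / (total_docs + 2))
            total_tokens = sum(self.word_counts[status].values())
            # The extra +1 leaves room for tokens never seen in training.
            denom = total_tokens + len(vocab) + 1
            for token in tokens:
                score += math.log((self.word_counts[status][token] + 1) / denom)
            log_scores[status] = score
        # Normalize in log space to avoid underflow on long comments.
        top = max(log_scores.values())
        weights = {s: math.exp(v - top) for s, v in log_scores.items()}
        return weights[0] / (weights[0] + weights[1])

    def status_for(self, text, threshold=0.9):
        """Pick the status to store: 0 only if spam looks very likely."""
        return 0 if self.spam_probability(text) >= threshold else 1
```

Training would mean calling `train(text, status)` once for each of the roughly 800 comments already marked, then calling `status_for(text)` on each new comment before saving it. The high threshold is a guess at a sane default: letting a spam through seems better than hiding a real comment.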
That's a task for tomorrow, though. Right now I'm off to enjoy my bed.
Comments
Ok here is what I think would be good...
Force registration. If they don't want to register then their comment wasn't that important anyway. It will cut down on the amount of comments that you get but hey oh well. You could also use the typepad registration that many people have.
Comments don't show up until they have been approved by you. You can have a whitelist of people who are known goods and who you don't have to approve.
Make sure your comment links have the rel="nofollow" attribute so that spammers get no Google love from them.
Start a posse to hunt down and kill spammers.
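On the nofollow suggestion, a minimal sketch of how a comment link could be rendered with that attribute. The function name `comment_link` is hypothetical, and this assumes the comment system builds its links from a URL and a display label; it is an illustration, not this site's code.

```python
from html import escape


def comment_link(url, label):
    """Render a commenter-supplied link with rel="nofollow" (hypothetical helper)."""
    # Escape both values so a comment can't inject its own markup or attributes.
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(label)
    )
```

This doesn't stop spam from being posted. It only removes the search-ranking payoff for the links that get through.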
i haven't had a problem with comment spam, but have had a problem with trackback spam. what helped for me was implementing blackhole checks against the unconfirmed.dsbl.org and sbl-xbl.spamhaus.org blackhole lists using the posting host and hostnames of URLs in the entry, using the chongqed list, and using a small list of blacklisted words and phrases.
# on Aug.08.2005 AT 07:17 AM
eecue: I hate registration, though. I won't leave comments on sites that require me to do a typepad registration. I hate to make people do things that I won't do myself. Setting comments to default to a zero status and then requiring moderation would be easy and effective, but it's work for me. I'd rather do work once and end up with a nice tech solution. I do need to write a nofollow plugin for comments.
jim: The problem is that the sites I see from last night, for instance, don't appear in any of the blacklists right now.
I'm interested to see what the Bayesian approach might look like. I think it could have some promise. More on that later. -e;
# on Aug.08.2005 AT 07:46 AM
e: I have a solution for this. It's a bit of a kluge but the effort level is low, which makes it appealing. If I see you tonight we can talk, or I'll email you.
John
# on Aug.09.2005 AT 04:46 PM
Beautiful... I'm all for solutions. Tonight, though, I'm in Chicago, staying at the Doubletree thanks to United. Ironically, it's now pouring out and lightning, so if they had delayed my flight a bit instead of cancelling they maybe could have blamed the weather. -e;
# on Aug.09.2005 AT 08:40 PM
Well Mr. Treasurer
Call me when you get in to town
See you soon
J
# on Aug.09.2005 AT 11:03 PM
changing the subject slightly...does anyone have an update on the status of completing the 6 mile gap of the Long Beach (710) freeway?
# on Aug.10.2005 AT 10:25 AM
Ronnie, look here to answer your question:
http://forum.skyscraperpage.com/showthread.php?s=73c9717201abe1bc9e20416d5860f0b5&threadid=84011
# on Aug.10.2005 AT 01:40 PM