If You Hate Form Spam, This Is the Site for You
This is a project I’ve been gathering data on for over a year, and I’m now launching it to the public for the first time. It’s called F.S. (or Form Spammer) Base, and it is an automated collection of all the comment spam I have received through contact forms over the past several months, based on a very simple but surprisingly effective “algorithm” that I developed. As far as I know, only one other person has had this idea, and I’m not sure whether he has a site for the data. If he does, I haven’t seen it.
Please note that this database contains only form comments that I have received, and does not pertain to email, web forum, or other types of spam…yet (never say never, though).
I have more than 2,400 attempts logged in the database at present, and more are added constantly. As you can see, there are quite a few beauties in there.
I have a few questions I’d like to ask anyone still reading this blog in order to improve upon the concept:
1) What format, if any, would you like me to present this data in (besides the obvious HTML formatting already used) in order to import it into your own spam databases?
2) Does a microformat for this already exist? (I could look, but it might save me some time if someone already knows.) If not, I can create my own, no big deal.
3) My host, Sectorlink, has given me permission to use this data and present it openly as long as I don’t display any pornographic images or link directly to any of the sites themselves, since the intent is to fight spam, not to encourage illicit behavior. I don’t think this is an issue, since I’m running Server.HTMLEncode on each spam attempt so that it shows up as the raw code rather than the rendered output, but does anyone see anything that would be obviously pornographic? I’m not talking about the text itself, but about obvious links to porn sites and lewd images.
4) Does anyone see any false positives?
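For anyone wondering what the encoding step in question 3 buys, here is a rough Python analogue. It is not the ASP code running on the site: `html.escape` is only standing in for Server.HTMLEncode, and the function name and sample spam string are made up for illustration.

```python
import html


def encode_for_display(spam_text: str) -> str:
    """Escape HTML special characters so a browser shows the markup as text.

    Hypothetical helper: html.escape stands in for ASP's Server.HTMLEncode.
    """
    return html.escape(spam_text, quote=True)


# Made-up spam sample; example.invalid is a reserved, non-resolving domain.
sample = '<a href="http://example.invalid/">cheap pills</a>'
encoded = encode_for_display(sample)
print(encoded)
```

Once the angle brackets and quotes are turned into entities, the browser has no tag left to interpret, so an encoded anchor or image tag should appear as visible source text rather than as a clickable link or a loaded image.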
Any other thoughts would be appreciated.
