Defensio Spam Filter: Follow-up
Posted by Jon Lee in Reviews, tags: comments, Defensio, spam-protection
This is a follow-up to a previous post where I took a sneak peek at the new Defensio spam filter, which is still in beta. After using the filter for exactly one month, here are the final statistics:
- Total number of comments: 8,675
  This translates to roughly one comment every 5 minutes.
- Number of legitimate comments: 520
  This means only 6% of all comments were actual comments and not spam.
- False negatives: 55
  Of the 8,155 spam comments, 55 got through the filter. This means only 0.7% of spam messages went undetected.
- False positives: 109
  Of the 520 legitimate comments, 109 were incorrectly flagged as spam. This means 21% of legitimate comments were treated as spam.
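For anyone who wants to check the arithmetic, here is a small sketch that recomputes the percentages from the raw counts above. The variable names are my own, and the "one month" is assumed to be 30 days; nothing here comes from Defensio itself.

```python
# Raw counts from the statistics above.
total_comments = 8675
legitimate = 520
false_negatives = 55   # spam that got through the filter
false_positives = 109  # legitimate comments flagged as spam

# Everything that is not legitimate is spam.
spam = total_comments - legitimate

# Assumption: the month of testing is counted as 30 days.
minutes_in_month = 30 * 24 * 60

minutes_per_comment = minutes_in_month / total_comments
legitimate_share = legitimate / total_comments
false_negative_rate = false_negatives / spam        # share of spam that was missed
false_positive_rate = false_positives / legitimate  # share of real comments flagged

print(f"Spam comments: {spam}")
print(f"One comment every {minutes_per_comment:.1f} minutes")
print(f"Legitimate share: {legitimate_share:.1%}")
print(f"False negative rate: {false_negative_rate:.1%}")
print(f"False positive rate: {false_positive_rate:.1%}")
```

Note that the two error rates use different denominators: false negatives are measured against all spam, while false positives are measured against all legitimate comments.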
At a glance, these figures seem fairly impressive except for the last one. Defensio was rather good at keeping spam out, but it wasn’t as good at recognizing legitimate comments. A couple of things explain this.
Short comments
On some of my posts, like my MicroSD Giveaway, many of the comments were simply entries for the giveaway and didn’t contain much content. For example, a common comment was “I’m in”. A short, seemingly “meaningless” comment like that is liable to be flagged as spam.
Early Bugs
Of the 109 legitimate comments wrongly marked as spam, I would say perhaps half came within the first week. After that initial week, Defensio seemed to get a better idea of what actually constitutes a spam comment and improved accordingly. It is indeed learning, and these past few days showed only one or two false positives.
The Best Feature
The best feature of Defensio is by far the spaminess rating. Even if I have over 700 comments in the quarantine, I only need to look at the first few comments (those with the lowest spaminess rating) to be confident that there are no false positives in the bunch. Currently, it seems as if anything over 50% spaminess is put into quarantine, and of all the false positives, all but one or two had a spaminess rating over 60%. I’d say the filter works pretty well!
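To illustrate why the rating saves so much time, here is a rough sketch of the idea: sort the quarantine by spaminess and only eyeball the lowest-rated entries. This is not Defensio's code or API; the `QuarantinedComment` class, the `review_queue` function, and the sample comments are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuarantinedComment:
    # Hypothetical record; field names are assumptions, not Defensio's schema.
    author: str
    body: str
    spaminess: float  # 0.0 (clearly legitimate) to 1.0 (clearly spam)


def review_queue(comments, limit=10):
    """Return the `limit` quarantined comments with the lowest spaminess."""
    return sorted(comments, key=lambda c: c.spaminess)[:limit]


# Made-up sample data, not real comments from this blog.
quarantine = [
    QuarantinedComment("bot1", "Buy cheap pills now", 0.99),
    QuarantinedComment("reader", "I'm in", 0.55),
    QuarantinedComment("bot2", "Free ringtones here", 0.97),
    QuarantinedComment("visitor", "Nice post, thanks", 0.62),
]

# Only the least spammy entries need a manual look.
for comment in review_queue(quarantine, limit=2):
    print(f"{comment.spaminess:.0%}  {comment.author}: {comment.body}")
```

The point is that any false positives should sit at the low end of the sorted list, so the hundreds of high-spaminess entries never need to be read.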
Note: I’ve been without Internet access for the past few days, and probably will be for the next few days as well, hence the lack of posts (I’m posting this from the public library).