Create your own spam filter extensions

by Mads Kristensen 13. December 2007 00:06

My last post about comment spam fighting resulted in a lot of e-mails from readers asking how to create their own spam fighting logic in BlogEngine.NET 1.3. So I decided to show a simple extension that listens for certain bad words and filters on those. If a comment contains one of the predefined words it is considered spam.

The extension


// Namespaces needed by the extension (BlogEngine.NET 1.3)
using System;
using System.Collections.Specialized;
using System.ComponentModel;
using BlogEngine.Core;
using BlogEngine.Core.Web.Controls;

[Extension("Filters comments containing bad words", "1.0", "Mads Kristensen")]
public class BadWordFilter
{
  // Constructor
  public BadWordFilter()
  {
    // Add the event handler for the AddingComment event
    Post.AddingComment += new EventHandler<CancelEventArgs>(Post_AddingComment);
  }

  // The collection of bad words, stored in upper case
  private static readonly StringCollection BAD_WORDS = AddBadWords();

  // Add bad words to the collection
  private static StringCollection AddBadWords()
  {
    StringCollection col = new StringCollection();
    col.Add("VIAGRA");
    col.Add("CASINO");
    col.Add("MORTGAGE");

    return col;
  }

  // Handle the AddingComment event
  private void Post_AddingComment(object sender, CancelEventArgs e)
  {
    // The sender is the comment that is about to be added
    Comment comment = (Comment)sender;

    // Upper-case the body so the comparison is case-insensitive
    string body = comment.Content.ToUpperInvariant();

    // Search for bad words in the comment body
    foreach (string word in BAD_WORDS)
    {
      if (body.Contains(word))
      {
        // Cancel the comment and raise the SpamAttack event
        e.Cancel = true;
        Comment.OnSpamAttack();
        break;
      }
    }
  }
}

The problem with an extension that filters based on bad words is that if you have a blog about medicine then Viagra probably isn’t a bad word. Therefore this type of spam fighting is left out of the release, but is offered as a separate download where you are able to define your own bad words.
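To illustrate the idea of defining your own bad words, here is a minimal sketch of the matching logic with the word list passed in rather than hard-coded. The class `BadWordMatcher`, its method names, and the comma-separated setting format are all assumptions made for this example. They are not part of BlogEngine.NET or of the download.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of BlogEngine.NET.
public static class BadWordMatcher
{
    // Parses a comma-separated setting such as "viagra, casino" into
    // a list of trimmed, upper-cased words without empty entries or duplicates.
    public static List<string> ParseWordList(string setting)
    {
        List<string> words = new List<string>();
        if (string.IsNullOrEmpty(setting))
            return words;

        foreach (string part in setting.Split(','))
        {
            string word = part.Trim().ToUpperInvariant();
            if (word.Length > 0 && !words.Contains(word))
                words.Add(word);
        }
        return words;
    }

    // Returns true if the body contains any of the words as a
    // case-insensitive substring, like the extension above does.
    public static bool ContainsBadWord(string body, IEnumerable<string> badWords)
    {
        if (string.IsNullOrEmpty(body))
            return false;

        string upper = body.ToUpperInvariant();
        foreach (string word in badWords)
        {
            if (!string.IsNullOrEmpty(word) && upper.Contains(word.ToUpperInvariant()))
                return true;
        }
        return false;
    }
}
```

With a list like this, a medical blog could simply leave "viagra" out of its setting, and the event handler would call `ContainsBadWord` instead of looping over a fixed collection.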

Download BadWordFilter.zip (743 bytes)


BlogEngine | Security

Comments

12/13/2007 1:41:44 AM #

dJ phuturecybersonique

The problem with spam comments is that there may be variations of the bad words themselves, e.g. using 1 in place of I. It can get unnecessarily cumbersome to maintain such lists over time.

Just a thought: any plans on integrating akismet.net (www.codeplex.com/.../View.aspx) into BE as an extension, Mads?

dJ phuturecybersonique Malaysia |
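One way to soften the variation problem raised above is to normalize the comment text before comparing it with the word list. The sketch below is only an illustration: the `CommentNormalizer` class and its substitution table are assumptions made for this example, and the table is far from complete.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of BlogEngine.NET.
public static class CommentNormalizer
{
    // Illustrative look-alike substitutions only; a real table would need tuning.
    private static readonly Dictionary<char, char> Substitutions =
        new Dictionary<char, char>
        {
            { '1', 'I' }, { '!', 'I' }, { '0', 'O' }, { '3', 'E' },
            { '4', 'A' }, { '@', 'A' }, { '5', 'S' }, { '$', 'S' }
        };

    // Upper-cases the text and maps look-alike characters to letters,
    // so that "v1agra" and "VIAGRA" compare as equal.
    public static string Normalize(string text)
    {
        if (string.IsNullOrEmpty(text))
            return string.Empty;

        StringBuilder sb = new StringBuilder(text.Length);
        foreach (char c in text.ToUpperInvariant())
        {
            char mapped;
            sb.Append(Substitutions.TryGetValue(c, out mapped) ? mapped : c);
        }
        return sb.ToString();
    }
}
```

The trade-off is that genuine digits are rewritten as well (a year such as 2010 becomes "2OIO"), so the normalized text should only be used for matching and never stored or displayed.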

12/13/2007 1:48:55 AM #

Mads Kristensen

I agree that it is cumbersome, but this was just an example of how to hook into the comment spam events to create a filter of your own.

Akismet could easily be made a plug-in, but I think I'll leave it to the community to write it. Then I'll make sure to link to it from the BlogEngine website.

Mads Kristensen Denmark |

12/13/2007 2:05:51 AM #

Erboristeria

Very interesting project. I use WordPress and Drupal, and I want to try this system soon for my next project.
Good job, and thanks for sharing this product.

Erboristeria Italy |

12/13/2007 3:33:06 AM #

rtur.net

Looks like a great candidate for the extension manager: it can be disabled by default and the list maintained by the blogger through the admin tool. And I'm thinking about borrowing the string collection concept. It looks much cleaner than the old-fashioned array of strings I normally use.

rtur.net United States |

12/13/2007 4:55:45 AM #

Justin Etheredge

In BlogEngine.net 1.2 what is the appropriate method for taking care of that comment, being that the Post.AddingComment event only uses EventArgs and not CancelEventArgs?

Justin Etheredge United States |

12/13/2007 4:58:25 AM #

Mads Kristensen

@Justin,

Since you cannot cancel the event like in 1.3, you can always end the response by calling HttpContext.Current.Response.End(), which stops further processing and thereby prevents the comment from being added.

Mads Kristensen Denmark |

12/13/2007 5:35:59 AM #

Adam

I may be missing something, but it seems like this could block legitimate comments as well. What if someone in your comments references "All of that viagra spam", or comments that they cannot buy a new PC because of their high mortgage payments. Since it only takes one word, it seems like the danger of this would be high.

How would you warn users of this action? Tell them not to use the same words as spammers? How would they know what those words were?

Adam United States |
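The false-positive concern above could be reduced by matching whole words only and by requiring more than one hit before treating a comment as spam. The following is a minimal sketch under those assumptions. The `SpamScorer` class, its methods, and the threshold idea are hypothetical and are not how the extension in the post works.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of BlogEngine.NET.
public static class SpamScorer
{
    // Counts how many of the listed words occur in the body as whole words,
    // ignoring case. Assumes the list itself contains no duplicates.
    public static int CountMatchedWords(string body, IEnumerable<string> badWords)
    {
        int count = 0;
        if (string.IsNullOrEmpty(body))
            return count;

        foreach (string word in badWords)
        {
            if (string.IsNullOrEmpty(word))
                continue;

            // \b anchors the match at word boundaries
            string pattern = @"\b" + Regex.Escape(word) + @"\b";
            if (Regex.IsMatch(body, pattern,
                    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant))
            {
                count++;
            }
        }
        return count;
    }

    // Flags the comment only when at least `threshold` listed words occur.
    public static bool LooksLikeSpam(string body, IEnumerable<string> badWords, int threshold)
    {
        return CountMatchedWords(body, badWords) >= threshold;
    }
}
```

With a threshold of 2, a comment that mentions "All of that viagra spam" would pass, while one advertising both Viagra and a casino would be flagged. It is still a blunt instrument, and the threshold would need tuning per blog.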

12/13/2007 5:41:44 AM #

Justin Etheredge

Here is that Akismet extension, give it a shot.

www.codethinked.com/.../...r-BlogEnginenet-12.aspx

Justin Etheredge United States |

12/13/2007 6:01:14 AM #

Mads Kristensen

@Adam
It's just an example of how to use the BlogEngine API for spam fighting purposes. I'm not gonna use this example.

@Justin
You are fast :) I've added a link from the BlogEngine website's extensions page at http://dotnetblogengine.net/page/extensions.aspx

Mads Kristensen Denmark |

12/13/2007 8:04:01 AM #

Justin Etheredge

Thanks! And I love the Santa hat on your pics.

Justin Etheredge United States |

12/15/2007 4:56:19 AM #

Cristiano

A long time ago, I wrote a C# procedure to replace all bad words (like f**k and so on) with a set of asterisks (****), while waiting for a new release of dotnetblogengine with dedicated events to intercept.
Now it's time to rewrite this procedure for this new scenario.
Thanks a lot, Mads, for your excellent work ...

Cristiano Italy |
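The masking approach described above, replacing bad words with asterisks instead of rejecting the comment, might look roughly like this. The `BadWordMasker` class is a hypothetical name for this example, and the sketch leaves out how it would be hooked into the BlogEngine.NET events.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of BlogEngine.NET.
public static class BadWordMasker
{
    // Replaces every whole-word, case-insensitive occurrence of a listed
    // word with the same number of asterisks.
    public static string Mask(string text, IEnumerable<string> badWords)
    {
        if (string.IsNullOrEmpty(text))
            return text;

        string result = text;
        foreach (string word in badWords)
        {
            if (string.IsNullOrEmpty(word))
                continue;

            string pattern = @"\b" + Regex.Escape(word) + @"\b";
            result = Regex.Replace(
                result,
                pattern,
                m => new string('*', m.Length),
                RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
        }
        return result;
    }
}
```

Keeping the asterisk count equal to the word length preserves the layout of the comment, at the cost of hinting at which word was removed.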

1/27/2008 5:46:38 PM #

pingback

Pingback from tanvon.com

tanvon my e-life | BlogEngine.NET extension SpamFighter

tanvon.com |

1/27/2008 5:52:56 PM #

tanvon malik

Hi
Great work. I just added the ability to manage the words through the extension page, so it is easy to add or remove the words that mark a comment as spam.
www.tanvon.com/.../

tanvon malik Islamic Republic of Pakistan |
