Create your own spam filter extensions

Dec 12, 2007

My last post about comment spam fighting resulted in a lot of e-mails from readers asking how to create their own spam fighting logic in BlogEngine.NET 1.3. So I decided to show a simple extension that listens for certain bad words and filters on those. If a comment contains one of the predefined words it is considered spam.

The extension


using System;
using System.Collections.Specialized;
using System.ComponentModel;
using BlogEngine.Core;
using BlogEngine.Core.Web.Controls;

[Extension("Filters comments containing bad words", "1.0", "Mads Kristensen")]
public class BadWordFilter
{
  // Constructor
  public BadWordFilter()
  {
    // Add the event handler for the AddingComment event
    Post.AddingComment += new EventHandler<CancelEventArgs>(Post_AddingComment);
  }

  // The collection of bad words (upper case, to match the upper-cased comment body)
  private static readonly StringCollection BAD_WORDS = AddBadWords();

  // Add bad words to the collection
  private static StringCollection AddBadWords()
  {
    StringCollection col = new StringCollection();
    col.Add("VIAGRA");
    col.Add("CASINO");
    col.Add("MORTGAGE");

    return col;
  }

  // Handle the AddingComment event
  private void Post_AddingComment(object sender, CancelEventArgs e)
  {
    Comment comment = (Comment)sender;
    string body = comment.Content.ToUpperInvariant();

    // Search for bad words in the comment body
    foreach (string word in BAD_WORDS)
    {
      if (body.Contains(word))
      {
        // Cancel the comment and raise the SpamAttack event
        e.Cancel = true;
        Comment.OnSpamAttack();
        break;
      }
    }
  }
}

The problem with an extension that filters based on bad words is that if you have a blog about medicine then Viagra probably isn’t a bad word. Therefore this type of spam fighting is left out of the release, but is offered as a separate download where you are able to define your own bad words.
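
The idea of a user-defined word list can be sketched roughly as follows. This is only an illustration, not the code in the download: the BadWordList class and its Parse and ContainsAny methods are hypothetical names, and the comma- or line-separated input format is an assumption. It keeps the same substring check as the extension above, but compares case-insensitively instead of upper-casing the comment body.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of BlogEngine.NET or BadWordFilter.zip.
public static class BadWordList
{
  // Assumed input format: words separated by commas or line breaks.
  public static List<string> Parse(string raw)
  {
    List<string> words = new List<string>();
    if (raw == null) return words;

    // Track words already seen, ignoring case, to drop duplicates.
    HashSet<string> seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    char[] separators = new char[] { ',', '\n', '\r' };

    foreach (string part in raw.Split(separators, StringSplitOptions.RemoveEmptyEntries))
    {
      string word = part.Trim();
      if (word.Length > 0 && seen.Add(word))
        words.Add(word);
    }
    return words;
  }

  // Returns true if the body contains any listed word as a substring (case-insensitive).
  public static bool ContainsAny(string body, IEnumerable<string> words)
  {
    if (string.IsNullOrEmpty(body)) return false;

    foreach (string word in words)
    {
      if (body.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
        return true;
    }
    return false;
  }
}
```

Inside Post_AddingComment, the loop over BAD_WORDS could then be replaced by a single call such as BadWordList.ContainsAny(comment.Content, words), where words is whatever list the blog owner has configured.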

Download BadWordFilter.zip (743 bytes)

Comments (12) -

dJ phuturecybersonique Malaysia
12/12/2007 4:41:44 PM #

The problem with spam comments is that there may be variations of the bad words themselves, e.g. using 1 in place of I. It can get unnecessarily cumbersome to maintain such lists over time.

Just a thought: any plans on integrating akismet.net (www.codeplex.com/.../View.aspx) into BE as an extension, Mads?

Mads Kristensen Denmark
12/12/2007 4:48:55 PM #

I agree that it is cumbersome, but this was just an example of how to hook into the comment spam events to create a filter of your own.

Akismet could easily be made a plug-in, but I think I'll leave it to the community to write it. Then I'll make sure to link to it from the BlogEngine website.

Erboristeria Italy
12/12/2007 5:05:51 PM #

Very interesting project. I use WordPress and Drupal, and I want to try this system for my next project.
Good job, and thanks for sharing this product.

rtur.net United States
12/12/2007 6:33:06 PM #

Looks like a great candidate for the extension manager: it can be disabled by default and the list maintained by the blogger through the admin tool. And I'm thinking about borrowing the string collection concept. It looks much cleaner than the old-fashioned array of strings I normally use.

Justin Etheredge United States
12/12/2007 7:55:45 PM #

In BlogEngine.net 1.2 what is the appropriate method for taking care of that comment, being that the Post.AddingComment event only uses EventArgs and not CancelEventArgs?

Mads Kristensen Denmark
12/12/2007 7:58:25 PM #

@Justin,

Since you cannot cancel the event like in 1.3, you can always end the response by calling HttpContext.Current.Response.End(), which will stop further processing and thereby prevent the comment from being added.

Adam United States
12/12/2007 8:35:59 PM #

I may be missing something, but it seems like this could block legitimate comments as well. What if someone in your comments references "All of that viagra spam", or comments that they cannot buy a new PC because of their high mortgage payments. Since it only takes one word, it seems like the danger of this would be high.

How would you warn users of this action? Tell them not to use the same words as spammers? How would they know what those words were?

Justin Etheredge United States
12/12/2007 8:41:44 PM #

Here is that Akismet extension, give it a shot.

www.codethinked.com/.../...r-BlogEnginenet-12.aspx

Mads Kristensen Denmark
12/12/2007 9:01:14 PM #

@Adam
It's just an example of how to use the BlogEngine API for spam fighting purposes. I'm not gonna use this example.

@Justin
You are fast :) I've added a link from the BlogEngine website's extensions page at http://dotnetblogengine.net/page/extensions.aspx

Justin Etheredge United States
12/12/2007 11:04:01 PM #

Thanks! And I love the Santa hat on your pics.

Cristiano Italy
12/14/2007 7:56:19 PM #

A long time ago, I wrote a C# procedure to replace all bad words (like f**k and others) with a set of asterisks (****), while waiting for a new release with dedicated events to intercept in dotnetblogengine.
Now it's time to rewrite this procedure for this new scenario.
Thanks a lot, Mads, for your excellent work ...

tanvon malik Islamic Republic of Pakistan
1/27/2008 8:52:56 AM #

Hi,
great work. I just added functionality to this extension for adding the words through the extension page, so it is easy to add or remove the words that mark a comment as spam.
www.tanvon.com/.../

About the author

Mads Kristensen

Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.