Spam proof your website using an HttpModule

Oct 31, 2006

Every time an e-mail address is written on a website, spam robots can collect it and abuse it. If you have a website (e.g. a blog or forum) that displays users' e-mail addresses, it would be a nice service to mask them from the spam robots.

The safest way to display an e-mail address is to break it up and convert it to something like “name at company dot com”. However, there are a lot of problems involved with that approach. It is difficult to read and you can’t make it into a hyperlink like “mailto:name at company dot com”. If you want to make it into a hyperlink, the best way would be to use a JavaScript function similar to this:

function SendMail(name, company, domain)
{
  // Assemble the address at click time so it never appears whole in the markup.
  var link = 'mai' + 'lto:' + name + '@' + company + '.' + domain;
  window.location.replace(link);
}

Then call that function from a hyperlink like this:

<a href="JavaScript:SendMail('name', 'company', 'domain');void(0)">name at company dot com</a>

That makes the address pretty difficult for a spam robot to parse, because it never appears in one piece in the markup.

Another approach is to encode the characters as numeric HTML character references (for example &#64; for @). These are perfectly readable for all browsers, but can prove more difficult, though not impossible, for robots to parse. A robot can simply decode every character reference in the HTML document into clear text, which will expose the e-mail addresses. But if we mix clear text and encoded characters, it becomes more difficult for the robot. That’s what the following HttpModule does.
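
The mixing idea can be sketched in a few lines of JavaScript. This is an illustration only, not the module's code: mixEncode and its injectable random parameter are hypothetical names, and the sketch assumes decimal character references (&#64; for @).

```javascript
// Hypothetical helper: encode a random subset of characters as decimal
// character references (&#NNN;) and leave the rest as plain text.
// `random` is injectable so the result can be made deterministic.
function mixEncode(value, random) {
  random = random || Math.random;
  var out = '';
  for (var i = 0; i < value.length; i++) {
    if (random() < 0.5) {
      out += '&#' + value.charCodeAt(i) + ';';
    } else {
      out += value.charAt(i);
    }
  }
  return out;
}

// Example with a fixed pattern instead of real randomness:
// every other character is encoded, starting with the first.
var flip = false;
function alternate() {
  flip = !flip;
  return flip ? 0 : 1;
}
console.log(mixEncode('a@b.c', alternate)); // prints "&#97;@&#98;.&#99;"
```

With the default Math.random, each call produces a different mix, so the same address does not show up as one fixed encoded string across pages.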

HttpModule

The module replaces all e-mail addresses on your website with a mix of clear text and encoded characters. It turns this:

<a href="mailto:name@company.com">name@company.com</a>

into this:

<a href="&#109;ai&#108;to:&#110;am&#101;&#64;c&#111;&#109;p&#97;n&#121;&#46;c&#111;m">
n&#97;&#109;e&#64;&#99;&#111;m&#112;any.c&#111;&#109;</a>
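
A browser resolves these references back into the original characters before it renders the link. The sketch below mimics that step in JavaScript to check that the encoded markup above still reads as the original address. decodeNumericReferences is a hypothetical helper written for this illustration; it is not part of the module and it ignores named entities such as &amp;.

```javascript
// Hypothetical helper: resolve decimal (&#109;) and hexadecimal (&#x6D;)
// numeric character references, roughly as a browser does when parsing.
function decodeNumericReferences(html) {
  return html.replace(/&#([xX][0-9a-fA-F]+|[0-9]+);/g, function (whole, code) {
    var isHex = code.charAt(0) === 'x' || code.charAt(0) === 'X';
    var codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    // Leave out-of-range values untouched instead of throwing.
    if (codePoint > 0x10FFFF) {
      return whole;
    }
    return String.fromCodePoint(codePoint);
  });
}

var href = '&#109;ai&#108;to:&#110;am&#101;&#64;c&#111;&#109;p&#97;n&#121;&#46;c&#111;m';
console.log(decodeNumericReferences(href)); // prints "mailto:name@company.com"
```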

It uses the System.Random class to decide which characters stay in clear text and which are encoded. The primary methods in the module are the ones that use a regular expression to find the clear-text addresses and replace them.

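// Requires: using System; using System.Text; using System.Text.RegularExpressions;

// The hyphen sits last in the character class so it is a literal, not a range,
// and the dot before the top-level domain is escaped so it only matches ".".
private static Regex _Regex = new Regex("(mailto:|)(\\w+[a-zA-Z0-9._-]*)@(\\w+)\\.(\\w+)");
private static Random _Random = new Random();

private static string EncodeEmails(string html)
{
  // Regex.Replace encodes each match where it occurs, so repeated or
  // overlapping addresses are each encoded exactly once.
  return _Regex.Replace(html, delegate(Match match) { return Encode(match.Value); });
}

private static string Encode(string value)
{
  StringBuilder sb = new StringBuilder();

  // System.Random is not thread-safe and this instance is shared by all requests.
  lock (_Random)
  {
    for (int i = 0; i < value.Length; i++)
    {
      // Encode roughly half of the characters as decimal character references.
      if (_Random.Next(2) == 1)
        sb.AppendFormat("&#{0};", Convert.ToInt32(value[i]));
      else
        sb.Append(value[i]);
    }
  }

  return sb.ToString();
}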

Implementation

You can add this module to any existing web application without breaking any code. Download the EmailSpamModule.cs below and place it in the App_Code folder on your website. Then add the following to the web.config:

<httpModules>
  <add type="EmailSpamModule" name="EmailSpamModule" />
</httpModules>

Even though the module makes it much more difficult to decode any e-mail address, it is still my advice that you use the JavaScript method if possible. If you're lazy or don't get paid by the hour, go for the module.

Download

EmailSpamModule.zip (1,16 KB)


Comments (4) -

NinjaCross
11/2/2006 3:08:43 PM #

Really nice implementation, thanks so much :)

Mads Kristensen
11/2/2006 3:26:12 PM #

Thanks NinjaCross, i'm pleased to hear that.

dan
11/2/2006 4:29:38 PM #

yes u are right here with your article but some web site give u the option to choose if you want your email to be public or not. so in this way u cannot stop spammers.
   very interesting your article.

Chris Blankenship United States
12/11/2007 3:30:01 AM #

This looks like an excellent addition, but I am unable to get this to work on my site. The screen just sits there waiting for something to happen.

Could my problems be related to using the IIS service included with Visual Studio 2005 instead of an actual IIS web server?


About the author

Mads Kristensen

Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.


Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.