Remove HTML comments at runtime

Dec 6, 2006

I’ve been playing a bit with regular expressions lately and needed some useful tasks to practice on. So today I wanted to write a little method that strips HTML comments from an ASP.NET webpage at runtime. The practical use of the exercise is somewhat limited for most developers, but some websites have so many comments that it might just save a decent number of bytes in the response stream.

The problem with this exercise is that a lot of JavaScript uses HTML comments to hide its workings from older browsers. Stripping those comments would leave the script tags empty. That’s why I made a rule saying that every script has to implement the HTML comments correctly. Some don’t, so you have to change them yourself.

This is how JavaScript is wrongly commented, which also breaks my regex.

<script type="text/javascript">
<!--
  function Name()
  {
  }
-->
</script>

The commenting should look like this, which is also the right way to do it.

<script type="text/javascript">
<!--//
  function Name()
  {
  }
//-->
</script>

The regular expression is very simple, and all you need to do is add the following code to your webpage, user control or master page.

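using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.UI; // HtmlTextWriter lives here

// Matches "<!-- ... -->" (with a space inside each delimiter) plus any trailing line breaks.
// Singleline lets "." match newlines, so comments spanning several lines are removed too.
private static readonly Regex _Regex = new Regex("((<!-- )((?!<!-- ).)*( -->))(\\r\\n)*", RegexOptions.Singleline);

protected override void Render(HtmlTextWriter writer)
{
  // Render the page into a buffer first so the complete markup can be filtered.
  using (StringWriter sw = new StringWriter())
  {
    using (HtmlTextWriter htmlWriter = new HtmlTextWriter(sw))
    {
      base.Render(htmlWriter);
      // Write the buffered markup to the real response with the comments stripped out.
      writer.Write(_Regex.Replace(sw.ToString(), string.Empty));
    }
  }
}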
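
To see what the pattern does and does not touch, here is a small sketch that runs outside of ASP.NET. The class name CommentStripperDemo and the sample markup are made up for illustration; only the pattern itself is taken from the code above. Note that the pattern only matches comments that have a space after `<!--` and before `-->`, which is why a script block wrapped in `<!--//` and `//-->` should survive.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original post.
public static class CommentStripperDemo
{
    // Same pattern as the _Regex field in the post.
    private static readonly Regex CommentRegex = new Regex(
        "((<!-- )((?!<!-- ).)*( -->))(\\r\\n)*", RegexOptions.Singleline);

    // Removes every match of the pattern from the given markup.
    public static string Strip(string html)
    {
        return CommentRegex.Replace(html, string.Empty);
    }

    // Made-up markup: one ordinary comment and one script block
    // that follows the "<!--//" ... "//-->" rule.
    public static string Sample()
    {
        return "<p>Hello</p>\r\n" +
               "<!-- a note for developers -->\r\n" +
               "<script type=\"text/javascript\">\r\n" +
               "<!--//\r\n" +
               "  function Name() { }\r\n" +
               "//-->\r\n" +
               "</script>";
    }
}
```

Calling `CommentStripperDemo.Strip(CommentStripperDemo.Sample())` should drop the developer note together with its trailing line break and leave the script block alone.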

Maybe not the most useful stuff I've ever written, but fine for learning. The only thing that bugs me is the JavaScript rule.
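
One way the JavaScript rule might be avoided is to let the regex match whole script blocks as well and then put them back untouched. The sketch below is not the code from this post: the class name ScriptAwareCommentStripper and the pattern are assumptions for illustration. It relies on the `Regex.Replace` overload that takes a `MatchEvaluator`, and unlike the pattern above it does not require spaces inside the comment delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
public static class ScriptAwareCommentStripper
{
    // First alternative: a complete script block, captured in the "script" group.
    // Second alternative: an HTML comment plus any trailing line breaks.
    // Script blocks are tried first, so comments inside them are never matched on their own.
    private static readonly Regex Pattern = new Regex(
        @"(?<script><script\b[^>]*>.*?</script>)|<!--.*?-->(\r\n)*",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    public static string Strip(string html)
    {
        // Keep script blocks exactly as they are; replace comments with nothing.
        return Pattern.Replace(html,
            m => m.Groups["script"].Success ? m.Value : string.Empty);
    }
}
```

With this approach a script that uses the plain `<!--` and `-->` form is left intact. One caveat: the simple comment alternative would also remove conditional comments such as `<!--[if IE]>`, so it is only a starting point.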


Comments (5)

Eber Irigoyen
12/7/2006 12:48:41 AM #

I think you could add a rule, still using regular expressions, to filter out those script tags (got an error when I entered the actual tags)

Mads Kristensen
12/7/2006 7:27:45 AM #

I already tried a regex rule for the JavaScript, but couldn't make it work. That's what bugs me. If anyone is able to write the regex that embeds some sort of JavaScript rule, please let me know.

Ingmar Hoogendoorn
12/8/2006 7:14:44 AM #

Why not use server side comments in the HTML?

<%-- This is my comment --%>

See weblogs.asp.net/.../...ents-with-ASP.NET-2.0-.aspx

Mads Kristensen
12/8/2006 7:35:18 AM #

Ingmar, you are right. This post was more of something I did to become better at regular expressions.

Marcos
12/14/2006 7:42:32 PM #

Hi Mads,

You can add

RegexOptions.Compiled | RegexOptions.Singleline

to the constructor of the Regex. This improves performance a lot because it generates code at runtime to parse and replace the string.

Keep up the good work.
Cheers
Marcos


About the author

Mads Kristensen

Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.

