OK, this is not new; I've written about it a few times in the past. Removing whitespace is a tricky discipline, and what works differs from site to site. At least that was what I thought until very recently.

For some unexplained reason I started working on a simple little method that removes whitespace in a way that works on all websites without breaking any HTML. Maybe not so unexplained, since I've written about it so many times that it would seem I have a secret obsession.

Obsession or not, here is the code I ended up with after a few hours of hacking. Just copy the code onto your base page or master page and watch the magic.

[code:c#]

// Requires: using System.Text.RegularExpressions; using System.Web.UI;

// Matches whitespace (including line breaks) between a closing ">" and the next "<".
private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
// Matches a line break followed by indentation or further blank lines.
private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);
 
/// <summary>
/// Renders the page into a buffer, strips the whitespace and writes
/// the result to the real output.
/// </summary>
/// <param name="writer">The <see cref="T:System.Web.UI.HtmlTextWriter"></see> that receives the page content.</param>
protected override void Render(HtmlTextWriter writer)
{
  // Render into an in-memory writer first so the markup can be post-processed.
  using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
  {
    base.Render(htmlwriter);
    string html = htmlwriter.InnerWriter.ToString();
 
    // Collapse whitespace between tags to a single space, then drop line breaks and indentation.
    html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
    html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
 
    writer.Write(html.Trim());
  }
}

[/code]

Remember that whitespace removal speeds up rendering, especially in IE, and reduces the overall weight of your page.

Comments

Comment by Miron Abramson

Hi Mads,
Good for us that you have such obsessions ;-)
Thanks for sharing.
Why don't you put this code in one of the modules that are already in the site?

Miron

Comment by Fredrik

I like the idea, but would love to see some stats on this - how does the increased rendering time server-side compare to the time saved client-side?

Comment by spybot

I'm using:

System.IO.StringWriter stringWriter = new System.IO.StringWriter();
System.Web.UI.HtmlTextWriter htmlWriter = new System.Web.UI.HtmlTextWriter(stringWriter);
base.Render(htmlWriter);
System.Text.StringBuilder htmlData = new System.Text.StringBuilder(stringWriter.ToString());

.. move postback controls down the page

// remove whitespace (the replacements must target htmlData, the StringBuilder declared above)
htmlData.Replace(" ", String.Empty);
htmlData.Replace("\t", String.Empty);
htmlData.Replace("\r\n", String.Empty);

writer.Write(htmlData.ToString());

..

Do you think Regex is faster than plain string replacement?

Comment by michael

Having issues with AJAX postback. For now I put in a condition to not do anything for Request["HTTP_X_MICROSOFTAJAX"] == null and seems to work. Might be nice to do the same for AJAX returns though - prolly some regex magic needed.

michael
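
The check described above could be sketched roughly like this. It assumes that ASP.NET AJAX partial postbacks carry the request header `X-MicrosoftAjax: Delta=true` (the header behind the `HTTP_X_MICROSOFTAJAX` server variable mentioned in the comment). The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;

public static class AjaxRequestCheck
{
    // Returns true when the request headers look like an ASP.NET AJAX
    // partial postback. Assumption: such requests send
    // "X-MicrosoftAjax: Delta=true".
    public static bool IsPartialPostBack(NameValueCollection headers)
    {
        if (headers == null) return false;

        string value = headers["X-MicrosoftAjax"];
        return value != null
            && value.IndexOf("Delta=true", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

In the Render override one could call `AjaxRequestCheck.IsPartialPostBack(Request.Headers)` first and simply fall back to `base.Render(writer)` when it returns true. When a ScriptManager is on the page, `ScriptManager.GetCurrent(Page).IsInAsyncPostBack` answers the same question without touching the headers.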

Comment by Dactivo

This piece of code isn't working for me in textareas where you include more than one line break:

private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);

Right now I am using only the first regex, but it would be great to have a piece of code that solves this without affecting the content in textareas.
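
One possible way around the textarea problem, as a rough sketch and not part of the original code: run a single regex whose first alternative matches a whole `<textarea>` or `<pre>` element, and hand those matches back unchanged. The class name `WhitespaceFilter` and the group names are invented here, and regex-based matching of HTML elements is only an approximation that assumes the elements are not nested.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceFilter
{
    // Alternatives, tried in this order at each position:
    //   keep : a complete <textarea> or <pre> element (returned untouched)
    //   tags : whitespace sitting between ">" and "<" (becomes one space)
    //   else : a line break followed by more whitespace (removed)
    private static readonly Regex Pattern = new Regex(
        @"(?<keep><(?<tag>textarea|pre)\b[^>]*>.*?</\k<tag>\s*>)" +
        @"|(?<tags>(?<=>)\s+(?=<))" +
        @"|\n\s+",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Minify(string html)
    {
        return Pattern.Replace(html, delegate(Match m)
        {
            if (m.Groups["keep"].Success) return m.Value; // protected block
            if (m.Groups["tags"].Success) return " ";     // like "> <" in the post
            return string.Empty;                          // line break + indentation
        }).Trim();
    }
}
```

The lookbehind and lookahead in the `tags` alternative are there so that the `<` that opens a textarea is not consumed by the whitespace match; otherwise the protected block would never get a chance to match.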

Comment by huobazi

But what if my page contains a JavaScript comment,
such as

<script type="javascript">
// here is a line comment.
var myComment = "a line comment";
alert(myComment);
</script>

When the "\n" is removed,

it is changed to

<script type="javascript">// here is a line comment. var myComment = "a line comment"; alert(myComment);</script>

So... a JavaScript error, because everything after the // is now part of the comment.

How can I avoid this?
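
A simple, hedged workaround for the line comment problem is to keep one line break per line and only strip the indentation and blank lines around it. This is a different trade-off from the code in the post: it saves slightly less, but a `//` comment still ends where it should. The class name `LineBreakCollapser` is made up for this sketch.

```csharp
using System.Text.RegularExpressions;

public static class LineBreakCollapser
{
    // A line break together with any whitespace before and after it,
    // which also swallows blank lines and "\r".
    private static readonly Regex LineBreakRun =
        new Regex(@"\s*\n\s*", RegexOptions.Compiled);

    public static string Collapse(string html)
    {
        // Replace each run with a single "\n" instead of removing it,
        // so JavaScript line comments stay on their own line.
        return LineBreakRun.Replace(html, "\n").Trim();
    }
}
```

Note that this still collapses blank lines everywhere, including inside textareas, so it addresses the script problem only.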

Comment by alex

Hi,

I would really like to use this, but I have one problem, and since I think you are the guru I will explain it to you:

I need to remove tabs, whitespace and line breaks, but only for the text outside the <report></report> tags (this is non-standard, I think, but we use it on a specific platform to generate reports).

So the content:

[quote]
< p align = "center" > Hello World!!< /p >
Follows is the report :
<report>
Hello everybody the balance is : 250.00 usd

You can use it until tomorrow

Regards
</report>[/quote]

This should finish like this:
[quote]
<p align="center">Hello World!!</p>Follows is the report:<report>
Hello everybody the balance is : 250.00 usd

You can use it until tomorrow

Regards </report>
[/quote]

I hope you can answer me, Mads, or someone else with expertise.

Thank you very much!

alex
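
A possible approach to the `<report>` case, sketched under a few assumptions: the report blocks are not nested, everything inside them should be kept verbatim, and outside them line breaks are dropped while runs of spaces and tabs shrink to a single space. It does not tidy the spaces inside tags such as `< p align = "center" >`. The class name `ReportAwareFilter` is hypothetical.

```csharp
using System.Text;
using System.Text.RegularExpressions;

public static class ReportAwareFilter
{
    // The capturing group makes Regex.Split return the <report> blocks too.
    private static readonly Regex ReportBlock = new Regex(
        @"(<report\b[^>]*>.*?</report\s*>)",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    private static readonly Regex LineBreaks =
        new Regex(@"\s*[\r\n]+\s*", RegexOptions.Compiled);
    private static readonly Regex SpaceRuns =
        new Regex(@"[ \t]+", RegexOptions.Compiled);

    public static string Minify(string html)
    {
        string[] parts = ReportBlock.Split(html);
        StringBuilder result = new StringBuilder(html.Length);

        for (int i = 0; i < parts.Length; i++)
        {
            if (i % 2 == 1)
            {
                // Odd indexes hold the captured <report> blocks: keep as-is.
                result.Append(parts[i]);
            }
            else
            {
                // Everything else: drop line breaks, collapse spaces and tabs.
                string outside = LineBreaks.Replace(parts[i], string.Empty);
                result.Append(SpaceRuns.Replace(outside, " "));
            }
        }
        return result.ToString();
    }
}
```

The odd/even trick relies on the pattern having exactly one capturing group, so the split result alternates between text outside a report and a report block.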

Comment by Alex

The idea is good and worth trying. But check twice before going live. This method ruins your ajax and javascript if you use any.

Alex

Comment by Al

I'll second Nino's comment: has anyone gotten a whitespace removal solution to work with ASP.NET AJAX postbacks?

Yes, there are ways to make whitespace removal work for the standard requests and be disabled for AJAX requests, but I've been unable to find a solution that will successfully trim the whitespace in AJAX postbacks.

Al

Comment by Barry Jones

Thanks for the great article, it has been really useful in the sites that I have developed.

I struggled on the asp.net AJAX post-back for a while until I looked at the source code for the page and noticed that the function [quote]__doPostBack[/quote] was surrounded by JavaScript comments //.

[quote]//<![CDATA[
var theForm = document.forms['aspnetForm'];
if (!theForm) {
theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
[/quote]

After the whitespace compression has removed tabs, newlines etc., the JavaScript above ends up on one line. The leading // then turns the whole thing into a comment, which is why you get problems with postbacks in .NET when using AJAX.


The solution I wrote for my sites is:

[quote] html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
html = html.Replace("\r", "");
html = html.Replace("//<![CDATA[", "");
html = html.Replace("//]]>", "");
html = html.Replace("\n", "");[/quote]