Remove whitespace from your pages

by Mads Kristensen 22. October 2007 03:13

OK, this is not new; I've written about it a few times in the past. The thing is that removing whitespace is a very tricky discipline that differs from site to site. At least that was what I thought until very recently.

For some unexplained reason I started working on a simple little method to remove whitespace in a way that works on all websites without breaking any HTML. Maybe not so unexplained, since I've written about it so many times that it would seem I have a secret obsession.

Obsession or not, here is the code I ended up with after a few hours of hacking. Just copy the code onto your base page or master page and watch the magic.

private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);
 
/// <summary>
/// Initializes the <see cref="T:System.Web.UI.HtmlTextWriter"></see> object and calls on the child
/// controls of the <see cref="T:System.Web.UI.Page"></see> to render.
/// </summary>
/// <param name="writer">The <see cref="T:System.Web.UI.HtmlTextWriter"></see> that receives the page content.</param>
protected override void Render(HtmlTextWriter writer)
{
  using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
  {
    base.Render(htmlwriter);
    string html = htmlwriter.InnerWriter.ToString();
 
    html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
    html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
 
    writer.Write(html.Trim());
  }
}

Remember that whitespace removal reduces the overall weight of your page and speeds up rendering, especially in IE.

ASP.NET

Comments

10/22/2007 8:20:02 AM #

Miron Abramson

Hi Mads,
Good for us that you have such obsessions ;)
Thanks for sharing.
Why don't you put this code in one of the modules that are already in the site?

Miron

Miron Abramson Israel |

10/22/2007 9:50:27 AM #

Mark kemper

I concur with Miron.

Mark kemper Australia |

10/22/2007 3:13:05 PM #

Fredrik

I like the idea, but would love to see some stats on this - how does the increased rendering time server-side compare to the time saved client-side?

Fredrik Norway |
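
Fredrik's question is best answered by measuring on your own pages. Here is a minimal sketch of how the server-side cost could be timed with a Stopwatch. The class name, the Minify helper and the generated sample markup are assumptions made up for illustration; only the two regular expressions come from the post. It prints whatever your machine measures, so no numbers are claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class WhitespaceTiming
{
    // Same two expressions as in the post.
    private static readonly Regex BetweenTags = new Regex(@">\s+<", RegexOptions.Compiled);
    private static readonly Regex LineBreaks = new Regex(@"\n\s+", RegexOptions.Compiled);

    // The post's two replacement passes, pulled out of Render so they can be timed.
    public static string Minify(string html)
    {
        html = BetweenTags.Replace(html, "> <");
        html = LineBreaks.Replace(html, string.Empty);
        return html.Trim();
    }

    // Hypothetical sample markup; swap in the real output of one of your pages.
    public static string BuildSampleHtml(int rows)
    {
        StringBuilder sb = new StringBuilder("<table>\r\n");
        for (int i = 0; i < rows; i++)
        {
            sb.Append("  <tr>\r\n    <td>Row ").Append(i).Append("</td>\r\n  </tr>\r\n");
        }
        return sb.Append("</table>").ToString();
    }

    public static void Main()
    {
        string html = BuildSampleHtml(2000);
        Minify(html); // warm-up run so one-time regex setup is not measured

        Stopwatch watch = Stopwatch.StartNew();
        string minified = Minify(html);
        watch.Stop();

        Console.WriteLine("Before: {0} chars, after: {1} chars", html.Length, minified.Length);
        Console.WriteLine("Server-side cost: {0} ms", watch.Elapsed.TotalMilliseconds);
    }
}
```

The client-side half of the comparison has to be measured in the browser, so this only covers the server-side part of the question.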

10/22/2007 4:07:43 PM #

NinjaCross

Thanks for sharing, Mads. I was waiting for a 360° solution like this :)

NinjaCross Italy |

10/22/2007 4:25:39 PM #

Michel

I use this sometimes, but I think you need to be careful about spaces inside TEXTAREA.

Michel France |

10/22/2007 4:36:56 PM #

Mads Kristensen

@Michel, this technique does not change anything inside a TEXTAREA. I've also had that problem before so this version will not break things like that.

Mads Kristensen Denmark |

10/22/2007 6:58:59 PM #

spybot

I'm using:

System.IO.StringWriter stringWriter = new System.IO.StringWriter();
System.Web.UI.HtmlTextWriter htmlWriter = new System.Web.UI.HtmlTextWriter(stringWriter);
base.Render(htmlWriter);
System.Text.StringBuilder htmlData = new System.Text.StringBuilder(stringWriter.ToString());

// .. move postback controls down the page

// Remove whitespace; StringBuilder.Replace modifies htmlData in place
htmlData.Replace("  ", String.Empty);
htmlData.Replace("\t", String.Empty);
htmlData.Replace("\r\n", String.Empty);

writer.Write(htmlData.ToString());

..

Do you think Regex is faster than a plain string replace?

spybot Czech Republic |

10/23/2007 2:09:17 AM #

Brian

@spybot

I think Regex is faster.

Brian United States |

10/23/2007 4:06:14 AM #

michael

Having issues with AJAX postback. For now I put in a condition to not do anything for Request["HTTP_X_MICROSOFTAJAX"] == null and seems to work. Might be nice to do the same for AJAX returns though - prolly some regex magic needed.

michael United States |
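
A variation on michael's workaround is to ask the ScriptManager whether the current request is an asynchronous postback instead of reading the request header directly. As far as I understand it, UpdatePanel responses are sent in a length-prefixed format, so rewriting them changes lengths the client script relies on; treat that explanation as an assumption rather than a confirmed diagnosis. The sketch below assumes ASP.NET AJAX (System.Web.Extensions) is referenced, and "BasePage" is a hypothetical class name. It only runs inside an ASP.NET site.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using System.Web.UI;

// "BasePage" is a hypothetical name; use whatever base page your site already has.
public class BasePage : Page
{
    private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
    private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);

    protected override void Render(HtmlTextWriter writer)
    {
        // GetCurrent returns null when the page has no ScriptManager control.
        ScriptManager scriptManager = ScriptManager.GetCurrent(this);
        if (scriptManager != null && scriptManager.IsInAsyncPostBack)
        {
            // Partial update: pass the output through untouched.
            base.Render(writer);
            return;
        }

        // Normal request: same whitespace removal as in the post.
        using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new StringWriter()))
        {
            base.Render(htmlwriter);
            string html = htmlwriter.InnerWriter.ToString();

            html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
            html = REGEX_LINE_BREAKS.Replace(html, string.Empty);

            writer.Write(html.Trim());
        }
    }
}
```

Note that this only switches the whitespace removal off for partial updates. It does not trim the AJAX responses themselves.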

10/23/2007 4:07:31 PM #

Dactivo

This piece of code isn't working for me in textareas where you include more than one line break:

private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);

Right now I am using only the first regex, but it would be great to have a piece of code that solves this without affecting the content in textareas.

Dactivo Spain |
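
One possible way around this is to match the blocks whose whitespace matters and hand them back unchanged through a MatchEvaluator, while everything else gets the usual treatment. This is a rough sketch and not the code from the post: the class name, the single combined pattern and the choice of protected tags (textarea, pre and script) are my own assumptions, and it will not cope with every kind of markup (a ">" inside an attribute value, for instance).

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeWhitespaceRemover
{
    // One pass with three alternatives, tried in this order:
    //   keep: a whole <textarea>, <pre> or <script> element
    //   gap:  whitespace that sits directly between two tags
    //   else: a line break followed by indentation
    private static readonly Regex Pattern = new Regex(
        @"(?<keep><(?<tag>textarea|pre|script)\b[^>]*>.*?</\k<tag>\s*>)" +
        @"|(?<gap>(?<=>)\s+(?=<))" +
        @"|\n\s+",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Minify(string html)
    {
        return Pattern.Replace(html, new MatchEvaluator(Evaluate)).Trim();
    }

    private static string Evaluate(Match match)
    {
        if (match.Groups["keep"].Success)
        {
            return match.Value;   // protected element: leave exactly as rendered
        }
        if (match.Groups["gap"].Success)
        {
            return " ";           // keep one space between tags, like "> <" in the post
        }
        return string.Empty;      // line break plus indentation: drop it
    }

    public static void Main()
    {
        string html = "<div>\n  <textarea>a\n\n  b</textarea>\n</div>";
        Console.WriteLine(Minify(html));
    }
}
```

The lookbehind and lookahead in the gap alternative are there so the "<" that opens a protected element is not swallowed by the previous match, which would stop the element from being recognised.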

11/16/2007 9:02:37 AM #

huobazi

But what if my page contains a JavaScript line comment, such as

<script type="javascript">
// here is  a line comment.
var myComment = "a line comment";
alert(myComment);
</script>

When the "\n" is removed, it is changed to

<script type="javascript">// here is  a line comment. var myComment = "a line comment"; alert(myComment);</script>

So... JavaScript error.

How can this be fixed?

huobazi People's Republic of China |

3/24/2008 8:14:05 PM #

sharona

Converted to VB.net and works perfect! Thank you!!

sharona Dominican Republic |

4/4/2008 4:34:34 AM #

alex

Hi,

I would really like to use this, but I have one problem, and since you are the guru I will explain it to you:

I need to remove tabs, whitespace and line breaks, but only from the text outside the <report></report> tags (these are non-standard, I think, but we use them on a specific platform to generate reports).

So the content:


< p align = "center" > Hello              World!!< /p >
Follows is the report :
<report>
Hello everybody the balance is         : 250.00 usd

You can use it until tomorrow

        Regards        
</report>


This should end up like this:

<p align="center">Hello World!!</p>Follows is the report:<report>
Hello everybody the balance is         : 250.00 usd

You can use it until tomorrow

        Regards        </report>


I hope you, Mads, or someone else with expertise can answer me.

Thank you very much!

alex Mexico |
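
For a requirement like alex's, one option is to split the markup on the <report> elements and only clean the pieces in between. The sketch below relies on Regex.Split including captured text in its result, so the report blocks land at the odd positions of the array. The class name and the cleaning rules (collapse whitespace runs to one space, then drop whitespace next to tags) are assumptions chosen to approximate the output alex describes. It leaves each report block exactly as it is and does not tidy spacing inside the tags themselves.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ReportAwareMinifier
{
    // The capturing group makes Split return the report blocks as well.
    private static readonly Regex ReportBlock = new Regex(
        @"(<report\b[^>]*>.*?</report\s*>)",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline);

    private static readonly Regex WhitespaceRun = new Regex(@"\s+", RegexOptions.Compiled);
    private static readonly Regex AroundTags = new Regex(@"\s*(<[^>]+>)\s*", RegexOptions.Compiled);

    public static string Minify(string html)
    {
        // Even indexes: text outside reports. Odd indexes: whole report blocks.
        string[] parts = ReportBlock.Split(html);
        StringBuilder result = new StringBuilder(html.Length);

        for (int i = 0; i < parts.Length; i++)
        {
            if (i % 2 == 1)
            {
                result.Append(parts[i]);   // report block: copy verbatim
                continue;
            }

            string outside = WhitespaceRun.Replace(parts[i], " ");
            outside = AroundTags.Replace(outside, "$1");
            result.Append(outside.Trim());
        }

        return result.ToString();
    }

    public static void Main()
    {
        string html = "<p>Hello      World!!</p>\nFollows:\n<report>\nA   :  1\n\n   B   \n</report>\n";
        Console.WriteLine(Minify(html));
    }
}
```

Be aware that dropping all whitespace next to tags can change how inline elements render, so this is more aggressive than the code in the post.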

7/17/2008 7:12:34 PM #

Alex

The idea is good and worth trying. But check twice before going live. This method ruins your ajax and javascript if you use any.

Alex Belarus |

9/24/2008 9:44:06 PM #

Nino

Have you been able to get this to work with AJAX? It doesn't work for postbacks.

Nino Canada |

10/27/2008 11:21:02 PM #

Al

I'll second Nino's comment: has anyone gotten a whitespace removal solution to work with ASP.NET AJAX postbacks?

Yes, there are ways to make whitespace removal work for the standard requests and be disabled for AJAX requests, but I've been unable to find a solution that will successfully trim the whitespace in AJAX postbacks.

Al United States |

2/9/2009 7:50:20 AM #

Barry Jones

Thanks for the great article, it has been really useful in the sites that I have developed.

I struggled with the ASP.NET AJAX postback for a while until I looked at the source code for the page and noticed that the function __doPostBack was surrounded by JavaScript comments (//).

//<![CDATA[
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
//]]>


After the whitespace compression has removed tabs, newlines, etc., the JavaScript above appears on one line. This turns the whole thing into a comment, which is why you get problems with postbacks in .NET when using AJAX.


The solution I wrote for my sites is:

    html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
    html = REGEX_LINE_BREAKS.Replace(html, string.Empty);
    html = html.Replace("\r", "");
    html = html.Replace("//<![CDATA[", "");
    html = html.Replace("//]]>", "");
    html = html.Replace("\n", "");


Barry Jones United States |

