The XHTML specification requires all element and attribute names to be lower-case. A page with upper-case tags will not validate and is therefore not valid XHTML. If you write all your XHTML yourself, this isn’t an issue: you simply write every tag in lower-case.

Now, imagine situations where you’re not in control of the code being written. One such situation is when you let visitors or users of the website write HTML in a text box or, even better, a rich text editor like FCKeditor or FreeTextBox. For some reason, no rich text editor I know of writes flawless XHTML in all situations; correct me if I’m wrong.

So, I wrote a little static helper method in C# that converts HTML tags to lower-case.

/// <summary>
/// Convert HTML tags from upper case to lower case. This is important in order
/// to make it XHTML compliant. It also includes some tags that are not
/// XHTML compliant, you can remove them if you want.
/// </summary>
private static string LowerCaseHtml(string html)
{
    string[] tags = new string[] {
    "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
    "h3", "h4", "h5", "h6", "ul", "ol", "li", "img",
    "tr", "table", "th", "td", "tbody", "thead", "tfoot",
    "input", "select", "option", "textarea", "em", "strong"
    };

    foreach (string s in tags)
    {
        // Match "<TAG" or "</TAG" in any casing. The \b keeps a short name such
        // as "i" from matching the start of a longer one such as "IMG".
        html = System.Text.RegularExpressions.Regex.Replace(
            html, "(</?)" + s + @"\b", "${1}" + s,
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    }

    return html;
}
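
If keeping the tag list up to date gets tedious, a regular expression can lower-case every tag name it finds instead. The following is only a minimal sketch, not a full HTML parser: it assumes that a `<` followed by a letter always starts a tag, so it would also touch such text inside scripts or attribute values. The names TagCase and LowerCaseAllTags are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagCase
{
    // Matches "<" or "</" followed by a tag name (a letter, then letters or digits).
    // Comments and the doctype start with "<!" and are therefore left alone.
    private static readonly Regex TagName = new Regex(@"</?[A-Za-z][A-Za-z0-9]*");

    public static string LowerCaseAllTags(string html)
    {
        // Lower-case only the matched "<name" or "</name" part; attributes
        // and text content are not changed.
        return TagName.Replace(html, m => m.Value.ToLowerInvariant());
    }
}
```

For example, TagCase.LowerCaseAllTags("<P>Hello <IMG SRC=\"a.png\" /></P>") gives <p>Hello <img SRC="a.png" /></p>. The attribute name stays upper-case because only tag names are matched.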

If you also want to lower-case the HTML attributes, you can do it almost the same way as the HTML tags. I probably missed some attributes, but you can easily add them to the string array in the method below.

/// <summary>
/// Convert HTML attributes from upper case to lower case. This is important in order
/// to make it XHTML compliant.
/// </summary>
private static string LowerCaseAttributes(string html)
{
    string[] attributes = new string[] {
    "align", "cellspacing", "cellpadding", "valign", "border",
    "style", "alt", "title", "for", "col", "headers", "clear",
    "colspan", "rows", "cols", "type", "name", "id", "target", "method"
    };

    foreach (string s in attributes)
    {
        // The \b keeps "align" from matching the tail of "VALIGN" and
        // "id" from matching the tail of a longer name.
        html = System.Text.RegularExpressions.Regex.Replace(
            html, @"\b" + s + "=", s + "=",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    }

    return html;
}
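
Note that the method above rewrites a matching name wherever it appears in the markup, including the visible text between tags. If that matters, one option is to look only inside opening tags. Here is a rough sketch of that idea, assuming attribute values never contain a `>` character; the names AttributeCase and LowerCaseAllAttributes are made up for this example, and it lower-cases every attribute name rather than a fixed list.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeCase
{
    // Matches a whole opening tag such as <TD VALIGN="top">.
    // Assumption: no ">" appears inside an attribute value.
    private static readonly Regex OpeningTag = new Regex(@"<[A-Za-z][^<>]*>");

    // Group 1: leading whitespace. Group 2: the attribute name.
    // Group 3: "=" plus a double-quoted, single-quoted, or unquoted value.
    // Consuming the value keeps text inside quotes from being treated as a name.
    private static readonly Regex Attribute = new Regex(
        @"(\s)([A-Za-z][A-Za-z0-9:_-]*)(\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))");

    public static string LowerCaseAllAttributes(string html)
    {
        // For each opening tag, lower-case only the attribute names in it.
        return OpeningTag.Replace(html, tag =>
            Attribute.Replace(tag.Value, a =>
                a.Groups[1].Value
                + a.Groups[2].Value.ToLowerInvariant()
                + a.Groups[3].Value));
    }
}
```

The tag name itself is left as it is, so this would be combined with a tag lower-casing step such as LowerCaseHtml.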

You can call these methods when you save the input from a text box, or you can call them when you render the page. Here's how to change the output of an ASP.NET page by overriding the Render method. To improve performance, you can remove the tags you don't need from the array in the method.

protected override void Render(HtmlTextWriter writer)
{
    using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
    {
        base.Render(htmlwriter);
        writer.Write(LowerCaseHtml(htmlwriter.InnerWriter.ToString()));
    }
}

You can use this approach in conjunction with my whitespace removal method. It also uses the page's Render method.