0 Comments

A few days ago I needed to write some functionality to fetch an XML document from a URL and load it into an XmlDocument. As always I use the WebClient to retrieve simple documents over HTTP and it looked like this:

using (WebClient client = new WebClient())

{

  string xml = client.DownloadString("http://example.com/doc.xml");

  XmlDocument doc = new XmlDocument();

  doc.LoadXml(xml);

}

I ran the function and got this very informative XmlException message: Data at the root level is invalid. Line 1, position 1. I’ve seen this error before so I knew immediately what the problem was. The XML document that was retrieved from the web had three strange characters in the very beginning of the document. It looks like this:

<?xml version="1.0" encoding="utf-8"?>

Of course that result in an invalid XML document and that’s why it threw the exception. The three characters are actually a hex value (0xEFBBBF) of the preample of the encoding used by the document.

As said, I knew this error and also an easy way around still using the WebClient. Instead of retrieving the document string from the URL and load it into the XmlDocument using its LoadXml method, the easiest way is to retrieve the response stream and use the Load method of the XmlDocument instead. It could look like this:

using (WebClient client = new WebClient())

using (Stream stream = client.OpenRead("http://example.com/doc.xml"))

{     

  XmlDocument doc = new XmlDocument();

  doc.Load(stream);

}

Often there are situations where the WebClient isn’t well suited for this or one might simply prefer to use the WebRequest and WebResponse classes. Still, the solution is very simple. Here is what it could look like:

WebRequest request = HttpWebRequest.Create("http://example.com/doc.xml");

using (WebResponse response = request.GetResponse())

using (Stream stream = response.GetResponseStream())

{

  XmlDocument doc = new XmlDocument();

  doc.Load(stream);

}

This is something that can give gray hairs if you haven’t run into it before, so I thought I’d share.  

If you have any issues with the three preample characters when serving - not consuming - XML documents, then check out Rick Strahl's very informative post about it.