Data at the root level is invalid

by Mads Kristensen 17. April 2008 03:27

A few days ago I needed to write some functionality to fetch an XML document from a URL and load it into an XmlDocument. As always I use the WebClient to retrieve simple documents over HTTP and it looked like this:

using (WebClient client = new WebClient())
{
  string xml = client.DownloadString("http://example.com/doc.xml");
  XmlDocument doc = new XmlDocument();
  doc.LoadXml(xml);
}

I ran the function and got this very informative XmlException message: Data at the root level is invalid. Line 1, position 1. I’ve seen this error before so I knew immediately what the problem was. The XML document that was retrieved from the web had three strange characters at the very beginning. It looked like this:

ï»¿<?xml version="1.0" encoding="utf-8"?>

Of course, that results in an invalid XML document, and that’s why the exception was thrown. The three characters are the bytes 0xEF, 0xBB and 0xBF: the preamble of the UTF-8 encoding used by the document, decoded as if they were ordinary text.
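To make the three characters concrete, here is a minimal sketch that asks the framework for the UTF-8 preamble and then decodes those bytes with a single-byte encoding. The class and method names are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever encoding a download might wrongly apply.

```csharp
using System;
using System.Text;

static class PreambleDemo
{
    // Returns the UTF-8 preamble bytes formatted as hex.
    public static string PreambleHex()
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        return BitConverter.ToString(preamble);
    }

    // Decodes the preamble bytes one character per byte (ISO-8859-1 is an
    // assumption here), which is how three stray characters can appear.
    public static string PreambleAsLatin1()
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        return Encoding.GetEncoding("iso-8859-1").GetString(preamble);
    }

    static void Main()
    {
        Console.WriteLine(PreambleHex());             // EF-BB-BF
        Console.WriteLine(PreambleAsLatin1().Length); // 3
    }
}
```

The three resulting characters are U+00EF, U+00BB and U+00BF, which is what shows up in front of the XML declaration.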

As I said, I knew this error and also an easy way around it while still using the WebClient. Instead of retrieving the document as a string from the URL and loading it into the XmlDocument with its LoadXml method, the easiest way is to retrieve the response stream and use the Load method of the XmlDocument instead. Load reads the raw bytes, so it can recognize the preamble as an encoding marker rather than as content. It could look like this:

using (WebClient client = new WebClient())
using (Stream stream = client.OpenRead("http://example.com/doc.xml"))
{
  XmlDocument doc = new XmlDocument();
  doc.Load(stream);
}
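The difference between the two approaches can be reproduced without a network connection. The following is a sketch under stated assumptions, not the code from the original function: the payload is built in memory, the names LoadDemo, BuildPayload, LoadsFromString and LoadsFromStream are hypothetical, and decoding with ISO-8859-1 merely stands in for a download that turns the preamble bytes into text.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class LoadDemo
{
    // Builds the bytes a server might send: the UTF-8 preamble followed by the XML.
    public static byte[] BuildPayload()
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?><root />");
        byte[] payload = new byte[preamble.Length + body.Length];
        preamble.CopyTo(payload, 0);
        body.CopyTo(payload, preamble.Length);
        return payload;
    }

    // Decodes to a string first (assumed single-byte decoding keeps the
    // preamble as three characters), then parses with LoadXml.
    public static bool LoadsFromString(byte[] payload)
    {
        string xml = Encoding.GetEncoding("iso-8859-1").GetString(payload);
        try { new XmlDocument().LoadXml(xml); return true; }
        catch (XmlException) { return false; }
    }

    // Hands the raw bytes to Load, which detects the encoding itself.
    public static bool LoadsFromStream(byte[] payload)
    {
        using (Stream stream = new MemoryStream(payload))
        {
            try { new XmlDocument().Load(stream); return true; }
            catch (XmlException) { return false; }
        }
    }

    static void Main()
    {
        byte[] payload = BuildPayload();
        Console.WriteLine(LoadsFromString(payload)); // False
        Console.WriteLine(LoadsFromStream(payload)); // True
    }
}
```

The string route fails because the parser sees text before the XML declaration, while the stream route treats the same bytes as an encoding hint.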

Often there are situations where the WebClient isn’t well suited for this or one might simply prefer to use the WebRequest and WebResponse classes. Still, the solution is very simple. Here is what it could look like:

WebRequest request = WebRequest.Create("http://example.com/doc.xml");
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
{
  XmlDocument doc = new XmlDocument();
  doc.Load(stream);
}

This is something that can give you gray hairs if you haven’t run into it before, so I thought I’d share.

If you have any issues with the three preamble characters when serving, not consuming, XML documents, then check out Rick Strahl's very informative post about it.
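For the serving side, one commonly used approach, which is not necessarily the one the linked post describes, is to hand the XmlWriter a UTF8Encoding that is constructed not to emit a preamble. The sketch below writes to a MemoryStream; WriteDemo and WriteXml are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class WriteDemo
{
    // Serializes a tiny document to UTF-8 bytes. The emitPreamble flag is
    // passed to UTF8Encoding and controls whether the three bytes are written.
    public static byte[] WriteXml(bool emitPreamble)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(emitPreamble);

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteStartElement("root");
                writer.WriteEndElement();
            } // disposing the writer flushes it to the stream
            return stream.ToArray();
        }
    }

    static void Main()
    {
        Console.WriteLine(WriteXml(true)[0] == 0xEF);       // True
        Console.WriteLine(WriteXml(false)[0] == (byte)'<'); // True
    }
}
```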

Tags:

Server-side

Comments

4/17/2008 7:10:31 PM #

Taras

This error reminds me of similar trouble I had when working with the HtmlAgilityPack library (which has a basic wrapper over the WebClient class plus some rudimentary caching infrastructure).

Taras Ukraine |

4/17/2008 9:32:30 PM #

Dan

Those three little characters preceding XML data remind me of another three little characters I see: on the first entry in the CSS of my blog's theme... So, where do they come from?

Dan United States |

4/18/2008 12:19:04 AM #

wwfDev

Mads, why don't you mention the Byte Order Mark (BOM) in your post, instead of calling it "the three characters" throughout? I mean, you obviously know what it is, since you link to Rick's pages. This post would not come up in a Google search for BOM (or maybe it will, now that my comment mentions it :-)), and that's a pity, since someone stumbling upon a problem with BOMs could have put your solution to use.

wwfDev |

4/18/2008 12:21:56 AM #

Mads Kristensen

@wwfDev,

That is actually on purpose, exactly for SEO reasons. There are many articles and posts about BOM, but people will only search for BOM if they know what it is. People who don't know what it is are more likely to search for "three strange characters".

Mads Kristensen Denmark |

4/22/2008 12:31:06 AM #

Wayne

Yup, no clue what BOM is... I would have typed those 'three strange characters' as well. At least now I know. :)

Wayne United States |

4/22/2008 12:49:33 AM #

Mike Hamilton

I ran into this when posting an XML file to a partner's web service, and this article could have helped me then. Now that we have an internal blog (running some dotnetBlogEngine or something LOL), this is getting posted to it for future reference.

thanks

Mike Hamilton United States |

4/22/2008 3:45:37 AM #

Wayne

Sorry Mads, just testing my Gravatar setup... loving BlogEngine!

Wayne United States |

4/22/2008 3:46:30 AM #

Wayne

Would be great if we could zap our comments...now I just feel foolish...

Wayne United States |

4/22/2008 3:48:18 AM #

Mads Kristensen

@Wayne,

Don't worry. With that clown nose it's too late anyway :)

Mads Kristensen Denmark |

4/22/2008 4:14:11 PM #

Paulo Morgado

Why not just use an XmlReader?

Paulo Morgado Portugal |

5/9/2008 3:17:02 AM #

Michael

Hey Mads. I love your blog, but I have a pretty simple question for you about this post. Why don't you just use the "Load" method of the XmlDocument class to load the XML file directly into the XmlDocument instance? The "Load" method has an overload that takes a string as the parameter, and if you pass it the full URL of the XML file, it will load the document even though it's not a local file. I'm just curious, 'cause I don't know if there are any advantages to using the "WebClient" or "WebRequest" classes like you do.

Michael Denmark |

About the slave

Mads Kristensen
Web developer at ZYB and founder of BlogEngine.NET.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2008