Status 500 errors when doing HTTP requests

Aug 10, 2008

For a while I couldn’t figure out why some websites returned a status 500 (Internal Server Error) when I retrieved them using a WebClient in C#. The same pages rendered fine in a browser, and the error only happened once in a while.

I thought it might have something to do with the WebClient class, so I tried using an HttpWebRequest and HttpWebResponse instead, but the result was the same. Then I started Fiddler to construct requests by hand and tried out different HTTP headers. This led me to the problem and the solution.

The problem was that some websites read certain request headers without checking whether they exist. In this case, the Accept-Encoding and Accept-Language headers were missing from my request. The solution is the method below.

// Requires: using System; using System.IO; using System.Net;

/// <summary>
/// Downloads a web page from the Internet and returns the HTML as a string.
/// </summary>
/// <param name="url">The URL to download from.</param>
/// <returns>The HTML, or null if the request fails for any reason.</returns>
public static string DownloadWebPage(Uri url)
{
  try
  {
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

    // Some servers return a 500 when these headers are missing.
    request.Headers["Accept-Encoding"] = "gzip";
    request.Headers["Accept-Language"] = "en-us";
    request.Credentials = CredentialCache.DefaultNetworkCredentials;

    // Transparently unzip the response, since gzip is advertised above.
    request.AutomaticDecompression = DecompressionMethods.GZip;

    using (WebResponse response = request.GetResponse())
    {
      using (StreamReader reader = new StreamReader(response.GetResponseStream()))
      {
        return reader.ReadToEnd();
      }
    }
  }
  catch (Exception)
  {
    // Any failure (unsupported URL, network error, HTTP error status) yields null.
    return null;
  }
}
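
To check whether a particular site has this problem, a small probe along the following lines may help. This is only a sketch: StatusProbe, CreateRequest and GetStatus are hypothetical names that are not part of the original method, and it assumes the same HttpWebRequest API used above. It sends the request with or without the two headers and returns the status code the server answered with, instead of swallowing the error.

```csharp
using System;
using System.Net;

public static class StatusProbe
{
    // Hypothetical helper: builds the request, optionally adding the two
    // headers discussed in the post. Nothing is sent over the network here.
    public static HttpWebRequest CreateRequest(Uri url, bool sendAcceptHeaders)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        if (sendAcceptHeaders)
        {
            request.Headers["Accept-Encoding"] = "gzip";
            request.Headers["Accept-Language"] = "en-us";
            request.AutomaticDecompression = DecompressionMethods.GZip;
        }
        return request;
    }

    // Hypothetical helper: returns the HTTP status code the server sent, or
    // null if no HTTP response was received at all (DNS failure, timeout...).
    public static HttpStatusCode? GetStatus(Uri url, bool sendAcceptHeaders)
    {
        HttpWebRequest request = CreateRequest(url, sendAcceptHeaders);
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // Error statuses such as 500 surface as a WebException; the
            // response, when there is one, is attached to the exception.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error == null)
            {
                return null;
            }
            using (error)
            {
                return error.StatusCode;
            }
        }
    }
}
```

Calling GetStatus(url, false) and GetStatus(url, true) against the same URL and comparing the two results should show whether the missing headers are what triggers the 500.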

This is one of those things that seem obvious once you know the fix. That still didn't stop me from spending an hour tracking it down. Doh!


Comments (5) -

Ruchit Surati, India
8/10/2008 10:21:44 AM


Hi Mads,

That was very useful. On the server side, to restrict pages from being accessed by HttpWebClient, we put authentication tickets in the HTTP GET/POST headers.


Thanks.


Ruchit S.

*********************************

Domenic Denicola, United States
8/10/2008 8:34:18 PM

Doesn't setting the AutomaticDecompression property fill out your Accept-Encoding headers for you?

Mads Kristensen, Denmark
8/11/2008 10:17:31 AM

@Domenic,

I thought so as well, but it actually won't work without the Accept-Encoding headers. It's probably an issue on the remote website and not in the HTTP request.

Chairs
10/31/2008 12:03:55 PM

Thanks... Going to use this on our new site for checking affiliate links :)

offyourfeet, United Kingdom
1/13/2009 4:35:08 AM

Hey thanks for that wonderful information....


About the author

Mads Kristensen
Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.


Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.