Get the query string right

Jan 17, 2007

Today, I ran into a strange little issue regarding the System.Uri class that took me too long to figure out. The Uri class has a property called Query that, according to the documentation, returns everything from the "?" to the end of the URI.

Consider this unescaped URI:

string url = "http://www.google.com/search?q=httpcompression c# asp.net";
string query = new Uri(url).Query;

The Uri.Query on that URI returns q=httpcompression c. As you can see, it stops when it encounters a "#" character, so it doesn’t return everything from the "?" to the end. If the URI had been escaped or UrlEncoded, there would have been no problem, because the "#" character would have been encoded as "%23". The problem is that in this particular case I had to deal with unescaped URIs.
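For comparison, here is a minimal sketch of the escaped case, assuming you control how the URL is built. BuildSearchUrl and GetQuery are hypothetical helper names made up for this illustration; Uri.EscapeDataString and Uri.Query are the framework members doing the work.

```csharp
using System;

public static class QueryEscapingExample
{
    // Hypothetical helper: percent-encodes the search term before it is
    // appended, so "#" becomes "%23" and a space becomes "%20".
    public static string BuildSearchUrl(string searchTerm)
    {
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(searchTerm);
    }

    // Hypothetical helper: returns what Uri.Query reports for the given URL.
    public static string GetQuery(string url)
    {
        return new Uri(url).Query;
    }
}
```

With the search term "httpcompression c# asp.net", BuildSearchUrl produces http://www.google.com/search?q=httpcompression%20c%23%20asp.net, and GetQuery on that URL returns the whole query, ?q=httpcompression%20c%23%20asp.net. Note that Uri.Query includes the leading "?".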

So to make the above code work properly, the simplest way was to do exactly what the documentation for Uri.Query says: return everything from the question mark to the end of the URI. That led to the following helper method.

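string url = "http://www.google.com/search?q=httpcompression c# asp.net";
string query = ExtractQuery(url);

public static string ExtractQuery(string url)
{
  // Null or empty input has no query string.
  if (string.IsNullOrEmpty(url))
    return string.Empty;

  // Position just after the first "?"; IndexOf returns -1 when there is none.
  int index = url.IndexOf('?') + 1;
  if (index == 0)
    return string.Empty;

  // Everything after the "?", including any "#" and what follows it.
  return url.Substring(index);
}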
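As a quick sanity check, the sketch below runs the helper against a few edge cases: a null string, a URL with no "?", a URL that ends in "?", and the original example. The ExtractQueryCheck class and its EdgeCaseResults method are hypothetical names used only for this illustration, and the helper is repeated so the block compiles on its own.

```csharp
using System;

public static class ExtractQueryCheck
{
    // Same logic as the ExtractQuery helper above.
    public static string ExtractQuery(string url)
    {
        if (string.IsNullOrEmpty(url))
            return string.Empty;

        int index = url.IndexOf('?') + 1;
        if (index == 0)
            return string.Empty;

        return url.Substring(index);
    }

    // Hypothetical driver: returns the helper's output for each case, in order.
    public static string[] EdgeCaseResults()
    {
        return new[]
        {
            ExtractQuery(null),
            ExtractQuery("http://www.google.com/search"),
            ExtractQuery("http://www.google.com/search?"),
            ExtractQuery("http://www.google.com/search?q=httpcompression c# asp.net"),
        };
    }
}
```

The first three results are empty strings, and the last is q=httpcompression c# asp.net. Substring does not throw for the URL ending in "?", because a start index equal to the string's length is allowed and yields an empty string.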

The helper isn’t bad or ugly in any way, but it is annoying to need it, and frustrating that the documentation isn’t correct.

Comments (4)

Eber Irigoyen
1/17/2007 4:19:06 PM

I thought I would get an exception if I passed

http://www.google.com/search?

but it doesn't throw, so the only validation you need there is for the case where there is no ? symbol

http://www.google.com/search

Mads Kristensen
1/17/2007 4:35:58 PM

It's fixed now. Thanks for the heads up. I did the check for the ? character in a different place in the code, so I hadn't written the method to do the check. But now it does. I've also made it public, so it needed to check for nulls as well.

Paul Wilson
1/17/2007 11:02:32 PM

The existing Query property functions properly in my opinion.  Why, since it stops at the hash (#) symbol?  Because the hash symbol is a special character that indicates a fragment identifier in a URI.  In other words, according to the specs for URIs, the query string in your example is just "q=httpcompression c", and the fragment identifier is " asp.net".  What is a fragment identifier?  They typically refer to an "anchor" on a web page, which basically scrolls the page to the anchored location.  By the way, if you actually try your example, you will see that Google interprets it as I've indicated, meaning that it searches only on "httpcompression c".

Paul Wilson
1/17/2007 11:06:24 PM

Oh yeah, one more thing: I believe the fragment identifier is not considered part of the URI itself. That would mean that the definition you found and disliked is also correct, since the URI would not technically include "# asp.net".

About the author

Mads Kristensen
Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.