Validate a URL using regular expressions

Oct 24, 2006

Today, I had to build a web form that took user input from standard ASP.NET input controls. In one of the text boxes the user must enter a valid URL, so I had to write some validation logic. But first of all, I had to find out what kind of URLs we would accept as being valid. These are the rules we decided upon:

  • The protocol must be http or https
  • Sub domains are allowed
  • Query strings are allowed

Based on those rules, I wrote this regular expression:

(http|https)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?

It is used in a RegularExpressionValidator control on the web form and on a business object in C#.

<asp:RegularExpressionValidator runat="Server"
  ControlToValidate="txtUrl"
  ValidationExpression="(http|https)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?"
  ErrorMessage="Please enter a valid URL"
  Display="Dynamic" />
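To make the mapping from the three rules to the expression easier to follow, here is a minimal sketch that splits the pattern into named parts. The class name `UrlPatternParts`, the constant names and the `PartMatches` helper are made up for illustration and are not part of the original code.

```csharp
using System.Text.RegularExpressions;

public static class UrlPatternParts
{
    // Rule 1: the protocol must be http or https.
    public const string Protocol = "(http|https)://";

    // Rule 2: one or more dot-terminated labels followed by a final label,
    // which is what lets sub domains such as www.example.com through.
    public const string Host = @"([\w-]+\.)+[\w-]+";

    // Rule 3: an optional path; the character class includes ?, &, = and %,
    // so a query string is accepted as part of it.
    public const string PathAndQuery = @"(/[\w- ./?%&=]*)?";

    // Concatenating the parts gives back the expression from the post.
    public const string Pattern = Protocol + Host + PathAndQuery;

    // Hypothetical helper: true if the whole text matches one named part.
    public static bool PartMatches(string part, string text)
    {
        return Regex.IsMatch(text, "^(?:" + part + ")$");
    }
}
```

One consequence worth noticing is that the host part requires at least one dot, so a bare host name such as localhost is rejected by this expression.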

Here is the server-side validator method used by the business object:

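using System.Text.RegularExpressions;

// Regex.IsMatch succeeds on a partial match, so the pattern is anchored with ^ and $
// to require that the whole string is a URL, as the validator control does.
private bool IsUrlValid(string url)
{
  return Regex.IsMatch(url, @"^(http|https)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?$");
}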
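One detail is easy to miss on the server side: Regex.IsMatch returns true if the pattern is found anywhere in the input, while the RegularExpressionValidator control only accepts a value that matches in full. The following sketch contrasts the two behaviours. The class and method names (`UrlValidation`, `ContainsUrl`, `IsWholeUrl`) are hypothetical and chosen only for this example.

```csharp
using System.Text.RegularExpressions;

public static class UrlValidation
{
    // The expression from the post, unchanged.
    private const string UrlPattern =
        @"(http|https)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";

    // True if a URL occurs anywhere inside the input.
    public static bool ContainsUrl(string input)
    {
        return Regex.IsMatch(input, UrlPattern);
    }

    // True only if the input as a whole is a URL. The (?:...) wrapper keeps
    // the anchors outside the pattern's own groups. Note that in .NET, $ also
    // matches just before a trailing newline; \z can be used to rule that out.
    public static bool IsWholeUrl(string input)
    {
        return Regex.IsMatch(input, "^(?:" + UrlPattern + ")$");
    }
}
```

With these two methods, a string such as "see http://example.com now" passes `ContainsUrl` but fails `IsWholeUrl`, which is usually the behaviour you want when validating a single form field.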

You can add more protocols to the expression easily. Just add them to the group of alternatives at the beginning of the regular expression:

(http|https|ftp|gopher)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?

You can also allow any conceivable protocol whose name consists of at least three letters by doing this:

([a-zA-Z]{3,})://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
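If the list of accepted protocols changes often, the expression can also be assembled from a list instead of being edited by hand. This is only a sketch under that assumption; `SchemeAwareUrlPattern` and `Build` are hypothetical names, and the host and path part is copied unchanged from the expression above.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class SchemeAwareUrlPattern
{
    // Host and path part of the expression from the post, unchanged.
    private const string HostAndPath =
        @"://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";

    // Builds an anchored regex that accepts only the given protocols.
    public static Regex Build(params string[] schemes)
    {
        // Regex.Escape guards against names containing regex metacharacters.
        string[] escaped = schemes.Select(s => Regex.Escape(s)).ToArray();
        string alternation = string.Join("|", escaped);
        return new Regex("^(" + alternation + ")" + HostAndPath + "$");
    }
}
```

For example, `SchemeAwareUrlPattern.Build("http", "https", "ftp")` produces a regex that accepts ftp://files.example.com/pub/ but rejects gopher://example.com.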


Comments (3) -

uday garikapati
10/25/2006 2:04:37 PM #

Do you have a Java version of this RegEx?

Damien Guard
10/25/2006 2:11:40 PM #

There are two limitations to the RegEx here that should be noted:
1. It won't allow non-standard ports, e.g. http://www.mysite.com:8080
2. It won't allow embedded username:password pairs, e.g. ftp://user:pass@mysite.com/

www.damieng.com/.../...nd_manipulation_in_NET.aspx has a URL decoding class with a more complex regex for handling such scenarios if required.

You won't be able to use mailto: with your RegEx (or mine) as it doesn't have the // bit and the rules are rather different.  I went with a secondary RegEx in my parsing class as it got too messy trying to merge them.

[)amien

Paul Hayman
10/26/2006 9:20:45 AM #

It won't allow me to post the expression in this comment box, so I posted URLs to my pages:

URL Reg Ex :
www.geekzilla.co.uk/...-4B4E-BFFD-E8088CBC85FD.htm

Also, Mail and News  Reg Ex:
www.geekzilla.co.uk/...-4A4C-A627-EB6EB5A3519C.htm


About the author

Mads Kristensen

Program Manager at the Microsoft Web Platform team and founder of BlogEngine.NET.


Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.