Trying to Validate URL Using JavaScript

By | December 28, 2017
Questions:

I want to validate a URL and display message. Below is my code:

$("#pageUrl").keydown(function(){
        $(".status").show();
        var url = $("#pageUrl").val();

        if(isValidURL(url)){

        $.ajax({
            type: "POST",
            url: "demo.php",
            data: "pageUrl="+ url,
            success: function(msg){
                if(msg == 1 ){
                    $(".status").html('<img src="images/success.gif"/><span><strong>SiteID:</strong>12345678901234456</span>');
                }else{
                    $(".status").html('<img src="images/failure.gif"/>');
                }
            }
            });

            }else{

                    $(".status").html('<img src="images/failure.gif"/>');
            }

    });


function isValidURL(url){
    var RegExp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;

    if(RegExp.test(url)){
        return true;
    }else{
        return false;
    }
} 

My problem is now it will show an error message even when entering a proper URL until it matches regular expression, and it return true even if the URL is something like "http://wwww".

I appreciate your suggestions.

Answers:

Someone mentioned the Jquery Validation plugin, seems overkill if you just want to validate the url, here is the line of regex from the plugin:

return this.optional(element) || /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i.test(value);

Here is where they got it from: http://projects.scottsplayground.com/iri/

Pointed out by @nhahtdh This has been updated to:

        // Copyright (c) 2010-2013 Diego Perini, MIT licensed
        // https://gist.github.com/dperini/729294
        // see also https://mathiasbynens.be/demo/url-regex
        // modified to allow protocol-relative URLs
        return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );

source: https://github.com/jzaefferer/jquery-validation/blob/c1db10a34c0847c28a5bd30e3ee1117e137ca834/src/core.js#L1349

Questions:
Answers:

It’s not practical to parse URLs using regex. A full implementation of the RFC1738 rules would result in an enormously long regex (assuming it’s even possible). Certainly your current expression fails many valid URLs, and passes invalid ones.

Instead:

a. use a proper URL parser that actually follows the real rules. (I don’t know of one for JavaScript; it would probably be overkill. You could do it on the server side though). Or,

b. just trim away any leading or trailing spaces, then check it has one of your preferred schemes on the front (typically ‘http://’ or ‘https://’), and leave it at that. Or,

c. attempt to use the URL and see what lies at the end, for example by sending it am HTTP HEAD request from the server-side. If you get a 404 or connection error, it’s probably wrong.

it return true even if url is something like “http://wwww“.

Well, that is indeed a perfectly valid URL.

If you want to check whether a hostname such as ‘wwww’ actually exists, you have no choice but to look it up in the DNS. Again, this would be server-side code.

Questions:
Answers:
function validateURL(textval) {
    var urlregex = /^(https?|ftp):\/\/([a-zA-Z0-9.-]+(:[a-zA-Z0-9.&%$-]+)*@)*((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|([a-zA-Z0-9-]+\.)*[a-zA-Z0-9-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(:[0-9]+)*(\/($|[a-zA-Z0-9.,?'\\+&%$#=~_-]+))*$/;
    return urlregex.test(textval);
}

This can return true for URLs like:

http://stackoverflow.com/questions/1303872/url-validation-using-javascript

or:

http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2

Questions:
Answers:

I written also a URL validation function base on rfc1738 and rfc3986 to check http and https urls. I try to hold this modular, so it can be better maintained and adapted to own requirements.

The RegExp in one line is show at end of this post.

The RegExp accept HTTP and HTTPS URLs with some international domain or IPv4 number. IPv6 is not supported yet.

window.isValidURL = (function() {// wrapped in self calling function to prevent global pollution

     //URL pattern based on rfc1738 and rfc3986
    var rg_pctEncoded = "%[0-9a-fA-F]{2}";
    var rg_protocol = "(http|https):\\/\\/";

    var rg_userinfo = "([a-zA-Z0-9$\\-_.+!*'(),;:&=]|" + rg_pctEncoded + ")+" + "@";

    var rg_decOctet = "(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])"; // 0-255
    var rg_ipv4address = "(" + rg_decOctet + "(\\." + rg_decOctet + "){3}" + ")";
    var rg_hostname = "([a-zA-Z0-9\\-\\u00C0-\\u017F]+\\.)+([a-zA-Z]{2,})";
    var rg_port = "[0-9]+";

    var rg_hostport = "(" + rg_ipv4address + "|localhost|" + rg_hostname + ")(:" + rg_port + ")?";

    // chars sets
    // safe           = "$" | "-" | "_" | "." | "+"
    // extra          = "!" | "*" | "'" | "(" | ")" | ","
    // hsegment       = *[ alpha | digit | safe | extra | ";" | ":" | "@" | "&" | "=" | escape ]
    var rg_pchar = "a-zA-Z0-9$\\-_.+!*'(),;:@&=";
    var rg_segment = "([" + rg_pchar + "]|" + rg_pctEncoded + ")*";

    var rg_path = rg_segment + "(\\/" + rg_segment + ")*";
    var rg_query = "\\?" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";
    var rg_fragment = "\\#" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";

    var rgHttpUrl = new RegExp( 
        "^"
        + rg_protocol
        + "(" + rg_userinfo + ")?"
        + rg_hostport
        + "(\\/"
        + "(" + rg_path + ")?"
        + "(" + rg_query + ")?"
        + "(" + rg_fragment + ")?"
        + ")?"
        + "$"
    );

    // export public function
    return function (url) {
        if (rgHttpUrl.test(url)) {
            return true;
        } else {
            return false;
        }
    };
})();

RegExp in one line:

var rg = /^(http|https):\/\/(([a-zA-Z0-9$\-_.+!*'(),;:&=]|%[0-9a-fA-F]{2})+@)?(((25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])(\.(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])){3})|localhost|([a-zA-Z0-9\-\u00C0-\u017F]+\.)+([a-zA-Z]{2,}))(:[0-9]+)?(\/(([a-zA-Z0-9$\-_.+!*'(),;:@&=]|%[0-9a-fA-F]{2})*(\/([a-zA-Z0-9$\-_.+!*'(),;:@&=]|%[0-9a-fA-F]{2})*)*)?(\?([a-zA-Z0-9$\-_.+!*'(),;:@&=\/?]|%[0-9a-fA-F]{2})*)?(\#([a-zA-Z0-9$\-_.+!*'(),;:@&=\/?]|%[0-9a-fA-F]{2})*)?)?$/;

Questions:
Answers:

In a similar situation I got away with this:

someUtils.validateURL = function(url) {
    var parser = document.createElement('a');
    try {
        parser.href = url;
        return !!parser.hostname;
    } catch (e) {
        return false;
    }
};

i.e. why invent the wheel if browsers can do it for you? But, of course, this will only work in the browser.

there are various parts of parsed URL exactly how browser would interpret it:

parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port;     // => "8080"
parser.pathname; // => "/path/"
parser.search;   // => "?search=test"
parser.hash;     // => "#hash"
parser.host;     // => "example.com:3000"

Using these you can improve your validating function depending on the requirements. The only drawback is that it will accept relative URLs and use current page server’s host and port. But you can use it for your advantage, by re-assembling the URL from parts and always passing it in full to your AJAX service.

What validateURL won’t accept is invalid URL, e.g. http:\:8883 will return false, but :1234 is valid and is interpreted as http://pagehost.example.com/:1234 i.e. as a relative path.

UPDATE

This approach is no longer working with Chrome and other WebKit browsers. Even when URL is invalid, hostname is filled with some value, e.g. taken from base. It still helps to parse parts of URL, but will not allow to validate one.

Possible better no-own-parser approach is to use var parsedURL = new URL(url) and catch exceptions. See e.g. URL API. Supported by all major browsers and NodeJS, although still marked experimental.

Questions:
Answers:

best regex I found from http://angularjs.org/

var urlregex = /^(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?$/;

Questions:
Answers:

I know it’s quite an old question but since it does not have any accepted answer, I suggest you to use the URI.js framework: https://github.com/medialize/URI.js

You can use it to check for malformed URI using a try/catch block:

function isValidURL(url)
{
    try {
        (new URI(url));
        return true;
    }
    catch (e) {
        // Malformed URI
        return false;
    }
}

Of course it will consider something like “%@” as a well formed relative URI… So I suggest you read the URI.js API to perform more checks, for example if you want to make sure that the user entered a well formed absolute URL you may do like this:

function isValidURL(url)
{
    try {
        var uri = new URI(url);
        // URI has a scheme and a host
        return (!!uri.scheme() && !!uri.host());
    }
    catch (e) {
        // Malformed URI
        return false;
    }
}

Questions:
Answers:

This is what worked for me:

function validateURL(value) {
    return /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i.test(value);
    }

from there is is just a matter of calling the function to get a true or false back:

validateURL(urltovalidate);

Questions:
Answers:

Import in an npm package like

https://www.npmjs.com/package/valid-url

and use it to validate your url.

Questions:
Answers:

If you’re looking for a more reliable regex, check out RegexLib. Here’s the page you’d probably be interested in:

http://regexlib.com/Search.aspx?k=url

As for the error messages showing while the person is still typing, change the event from keydown to blur and then it will only check once the person moves to the next element.

Questions:
Answers:
var RegExp = (/^HTTP|HTTP|http(s)?:\/\/(www\.)?[A-Za-z0-9]+([\-\.]{1}[A-Za-z0-9]+)*\.[A-Za-z]{2,40}(:[0-9]{1,40})?(\/.*)?$/);

Questions:
Answers:

My solution:

function isValidUrl(t)
{
    return t.match(/^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+)(:(\d+))?\/?/i)
}

Questions:
Answers:

You can use the URL API that is recently standard. Browser support is sketchy at best, see the link. new URL(str) is guaranteed to throw TypeError for invalid URLs.

As stated above, http://wwww is a valid URL.

Questions:
Answers:

Demo : http://jsbin.com/uzimeb/1/edit

function checkURL(value) {
    var urlregex = new RegExp("^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&amp;%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&amp;%\$#\=~_\-]+))*$");
    if (urlregex.test(value)) {
        return (true);
    }
    return (false);
}

Questions:
Answers:

I have found a great resource for comparing different solutions:
https://mathiasbynens.be/demo/url-regex

According to that page, only solution from diegoperini passes all tests. Here is that regex:

_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS

Questions:
Answers:

I checked a lot of url validators in google and no one works for me. For example I’d like to see valid on links like ‘aa.com’. I like silly check for dot sign in string.

function isValidUri(str) {
  var dotIndex = str.indexOf('.');
  return (dotIndex > 0 && dotIndex < str.length - 2);
}

It should not stay on beginning and end of string (for now we don’t have top level domain names with one character).

Questions:
Answers:

Try edit your isValidURL function as follows:

function isValidURL(url) {
    var encodedURL = encodeURIComponent(url);
    var isValid = false;

    $.ajax({
      url: "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22" + encodedURL + "%22&format=json",
      type: "get",
      async: false,
      dataType: "json",
      success: function(data) {
        isValid = data.query.results != null;
      },
      error: function(){
        isValid = false;
      }
    });

    return isValid;
}

This should do the trick.

Questions:
Answers:

Here’s a regular expression which might fit the bill (it’s very long):

/^(?:\u0066\u0069\u006C\u0065\u003A\u002F{2}(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)|[\u0041-\u005A\u0061-\u007A][\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002B\u002D\u002E]*\u003A(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)(?:\u003F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?(?:\u0023(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?)$/

There are some caveats to its usage, namely it does not validate URIs which contain additional information after the user name (e.g. “username:password”). Also, only IPv6 addresses can be contained within the IP literal syntax and the “IPvFuture” syntax is currently ignored and will not validate against this regular expression. Port numbers are also constrained to be between 0 and 65,535. Also, only the file scheme can use triple slashes (e.g. “file:///etc/sysconfig”) and can ignore both the query and fragment parts of a URI. Finally, it is geared towards regular URIs and not IRIs, hence the extensive focus on the ASCII character set.

This regular expression could be expanded upon, but it’s already complex and long enough as it is. I also cannot guarantee it’s going to be “100% accurate” or “bug free”, but it should correctly validate URIs for all schemes.

You will need to do additional verification for any scheme-specific requirements or do URI normalization as this regular expression will validate a very broad range of URIs.

Leave a Reply

Your email address will not be published. Required fields are marked *