Regular Expression Matching End of String Position

Recently I needed to validate that a given string contained only lowercase or uppercase letters. In my first implementation I used the following regular expression for the validation: “^[a-zA-Z]+$”

I’m not an advanced regular expressions user, but in this case I was pretty much convinced that it would work as expected. However, when running the tests for that unit I got some unexpected results: the regular expression was allowing input such as “asd\n” to pass the validation routine.

For this case I could just strip the trailing white-space by calling String.Trim() on the input, but I got curious about whether I could handle this situation using only a regular expression. With a bit of research I found out that in .NET the $ anchor matches not only at the very end of the string but also just before a final newline. Replacing it with the \z anchor fixes this, because \z matches only at the absolute end of the string. For symmetry I also replaced ^ with \A, which matches only at the absolute start of the string.

The following code illustrates this behavior.

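using System;
using System.Text.RegularExpressions;

// First pattern uses ^ and $; second uses the strict \A and \z anchors.
var patterns = new string[] { "^[a-zA-Z]+$", @"\A[a-zA-Z]+\z" };

foreach (string pattern in patterns)
{
    Console.WriteLine("Pattern: {0}", pattern);
    Console.WriteLine();

    // Same letters, with and without a trailing newline.
    var inputs = new string[] { "abc", "abc\n" };

    foreach (string input in inputs)
    {
        // Regex.Escape makes the newline visible as \n in the output.
        Console.WriteLine("{0}: {1}", Regex.Escape(input), Regex.IsMatch(input, pattern));
    }

    Console.WriteLine();
    Console.WriteLine();
}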
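
To tie this back to the original validation problem, here is a minimal sketch of how the stricter pattern could be wrapped in a reusable check. The class name LetterValidator and the method name IsLettersOnly are made up for illustration and are not from my actual code, and treating null as invalid is an assumption on my part.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named only for this example.
public static class LetterValidator
{
    // \A and \z anchor to the absolute start and end of the input,
    // so a trailing "\n" is not tolerated the way it is with $.
    private static readonly Regex LettersOnly = new Regex(@"\A[a-zA-Z]+\z");

    // Returns true only when the whole input consists of one or more
    // ASCII letters. Null is treated as invalid (an assumption).
    public static bool IsLettersOnly(string input)
    {
        return input != null && LettersOnly.IsMatch(input);
    }
}
```

With this helper, LetterValidator.IsLettersOnly("abc") passes while LetterValidator.IsLettersOnly("abc\n") is rejected. Note that the upper case \Z anchor would not help here: like $, it also matches just before a final newline.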