Filter Any Way You Like

Today I’m going to take a quick look at filtering with the LIKE clause in SQL Server.  Unlike the OFFSET-FETCH clause that I talked about last week and is a new addition to SQL Server 2012, the LIKE clause has been available for quite awhile.  I use this clause when I need to filter the results of a SELECT statement but either do not know the exact expression value to search or because I want to include a subset of similar records based on the pattern of characters in the field value I am filtering the SELECT statement on.

One of the most common ways to use Like occurs when I know the field value begins with a specific letter or perhaps a set of letters, but I do not know the exact spelling.  In this case, if I try to enter the exact spelling and get it wrong, I would get either the wrong records in my result set or perhaps no records at all. In either case, I may want to filter on just the portion of the field’s value that I know.  For example, if I am looking for a customer whose last name begins with ‘Hel’ but I do not know the rest of the spelling, I could use the following SELECT statement to find all the customers matching that criteria.

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘Hel%’

Notice that the WHERE clause in this expression does not use a equal, greater than or less than sign.  Instead it uses the word LIKE followed by a string.  (Yes, LIKE only works with string data.)  The string itself includes the first three letters that I know the last name begins with, but then ends with a ‘%’ sign.  The percent sign tells SQL to match any records where the last name begins with Hel no matter what characters, even no characters that may follow.

Interestingly, I can also use the percent sign to match characters at the start of the string such as the following statement which looks for all customers that have a last name ending with ‘vich’:

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘%vich’

Finally, I can use the percent to match any characters before and after a set of known characters as in:

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘%ker%’

While the percent sign is a great placeholder for an arbitrary number of characters, what if I know the entire string except for a single character.  For example, suppose I don’t know if the last name I am looking for is ‘Bronson’ or ‘Bronsen’.  I could use a single underscore character for the character I don’t know in the following statement.

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘Brons_n’

I can use the underscore multiple times in the matching string.  I might even use an expression with three underscores to look for last names consisting of 3 characters.

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘___’

While these wild card character work great when I don’t know what characters the field value should have, I could be a little more specific.  For example, with the Bronson vs. Bronsen example I know the sixth character is either an ‘o’ or an ‘e’.  Rather than allow any character in this position, I could tell SQL to only return values where the sixth character was one of these two characters as in the following:

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘Brons[eo]n’

In this case the square brackets will match any values in which the sixth character of the field is either an ‘e’ or an ‘o’.  In fact, I could any number of characters within these square brackets.  However, if I have more than 3 or 4 characters that follow sequentially, I may want to use the shortcut of defining the range of characters as in the following statement which looks only for names that begin with one of the letters ‘A’ through ‘F’.

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘[A-F]%’

Notice that this expression tells SQL that the first letter must be one of the letters in the range ‘A’ through ‘F’, but there can be any combination of characters in the rest of the LastName.  So yes, it is possible to combine different wildcards in the same expression.

Sometimes it is easier to say what something is not rather than what something is.  In these case, you can use the ‘~’ character with either of the square bracket forms as in the following.

SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘~[S]%’
SELECT LastName, FirstName, PhoneNumber FROM Customer WHERE LastName LIKE ‘~[A-F]%’

The first select above looks for all customers whose last name does not begin with the letter ‘S’.  In the second case, the statement looks for last names that do not begin with one of the letters in the range ‘A’ through ‘F’

One last point to keep in mind, using Like can affect the performance of your queries even if as in the above case with Customer I have an index on the customer’s last name.  If the filter expression begins with a few known characters, SQL may still decide to use the index.  However, if the expression begins with either the ‘%’ or the ‘_’ wildcard values, the filter has no way to narrow the selection of the records using an index.

Hope this helps you like your WHERE clauses more in the future.

C’ya next time.