Binary Searches With VLOOKUP

I went for a job interview about a year ago and the interviewer made a passing comment that he didn’t understand why anyone would ever pass TRUE into VLOOKUP’s  range_lookup parameter. There are lots of cases when you might want to do this so I thought a few, nice examples would be a great way to get VLOOKUP week underway.

Before we dive into the examples, let’s quickly look at what the range_lookup parameter is and what is does. Here’s the VLOOKUP signature from the Excel help file:

VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])

Range_lookup is the fourth and final parameter. You can see that it is surrounded by square brackets which means that it is optional: when you write your VLOOKUP formulas you don’t have to include it.

The help file also tells us that range_lookup expects to be given a logical value which specifies whether VLOOKUP should find an exact or approximate match. Logical values (also known as boolean values) are TRUE and FALSE: you should use TRUE to indicate an approximate match and FALSE to indicate an exact match. Sometimes you’ll see people using 1 and 0; in Excel, 1 is the equivalent of TRUE and 0 is the equivalent of FALSE, so they’re both fine too. If you omit the fourth argument then VLOOKUP will use TRUE by default.

When you tell VLOOKUP to use an exact (FALSE) match, it performs a linear search. It will start at the top of the first column in table_array and work its way downwards, cell by cell, hunting for an exact match to the lookup_value. When it finds an exact match, it will look across that row to the nth column of the table_array as directed by col_index_num and return the value from that cell.

Since “the first column of table_array” is a bit of a mouthful, going forward I might refer to it as the lookup column.

When you tell VLOOKUP to use an approximate match, it works in a different way. The help file gives us a clue how it does the search by giving us this important piece of information:

 If range_lookup is either TRUE or is omitted, the values in the first column of table_array must be placed in ascending sort order; otherwise, VLOOKUP might not return the correct value.

When VLOOKUP is doing an approximate match, it relies on the fact that the lookup column is sorted in ascending order. This means that it doesn’t have to search in a top to bottom manner; rather it can quickly eliminate chunks of data by jumping up and down through it (binary search). When VLOOKUP is searching this way it will return an exact match if possible, otherwise it will return the next largest value that is less than lookup_value.

It is very important to note that the approximate match algorithm is much faster to calculate than the exact match algorithm, so you should use it whenever possible. For more information on this, check out Charles Williams’ blog.

Right, now it’s time to put the theory into practice!

Example 1

  • The lookup column is in ascending order and you want to find an approximate match

When I see ‘approximate match’ I always start thinking about data bucketing. A classic example is when you are trying to assign grades to students depending on their exam marks.

The table on the left contains the students’ names and the marks they’ve scored in their exams. On the right there’s a table which shows the mark they have to get to attain each grade. An 80% mark would give them an A, whilst anything under 30% will give them an E. It’s important that the lookup table has the lowest mark possible: if a student has a mark lower than anything in the table then the VLOOKUP formula we’re about to use will return an error. The VLOOKUP formula to calculate their grades is this:

=VLOOKUP(C2,$F$2:$G$6,2,TRUE)

James gets an E grade for his English because VLOOKUP couldn’t find an exact match to his 24% mark in the lookup column, so it returned an E which corresponds to 0% – the next largest value that is less than lookup_value.

Example 2

  • The lookup column is in ascending order, you are certain that the lookup_value will be found and you want to find an exact match.

In this example we have a list of dates and the name of the employee working on each date. We want to create a formula which will let the user quickly find out who the employee is on any given date.

We’re going to assume that we know from the business model and from the way the data is entered that the date we want to find will always be in the table. This is enforced by a data validation list.

Since the lookup column is sorted in ascending order and we know that the date we’re using will always be found, we can use the faster, approximate match algorithm to find the employee working on a particular day.

=VLOOKUP($E$2,$A$2:$B$11,2,TRUE)

Example 3

  • You want to use the VLOOKUP function in a clever way whereby the sort order of the lookup column does not matter.

Okay, this one sounds a bit weird but there are a few occasions when the sort order doesn’t actually matter. A nice example is when you want a formula which will return the last number in a column. Here’s an unruly column of data with numbers, errors, text and empty cells:

The last number in the column is 123 and this charming little VLOOKUP formula will find it for us:

=VLOOKUP(BIGNUM,E:E,1,TRUE)

BIGNUM is the key to this: it has to be a number larger than any of the numbers which may appear in the column. In the ‘Excel specifications and limits’ topic in the help file it says that the largest allowed positive number by a formula is 1.7976931348623158e+308. If you want to use that as your BIGNUM then, curtosy of Stephen Gersuk, go into name manager and define BIGNUM as follows:

=9.99999999999999E+307*1.79769313486231

About Colin Legg

RAD Developer Microsoft MVP - Excel 2009 - 2014
This entry was posted in Microsoft Excel and tagged , , , , , , , , , , , . Bookmark the permalink.

10 Responses to Binary Searches With VLOOKUP

  1. Pingback: Colin Legg and “Binary Searches With VLOOKUP” « VLOOKUP WEEK

  2. Pingback: Why Does VLOOKUP Return An #N/A Error? | RAD Excel

  3. Pingback: A look at Index & Match functions « The Data Specialist

  4. Debbi says:

    I have this formula =IFERROR(VLOOKUP($B2,Sheet2!$A:$AA,26,FALSE)&””,””) where if an error is found either in the lookup value or the value to be returned then return blank, otherwise find $b2 (which is text) in Sheet2 column A then return what is in the 26th column (which is a date). I’m getting an accurate return value, however even though the Sheet2 column is date formatted and the column in which the vlookup formula is in is also date formatted, the value is returned as a number. And though it looks like it is a number stored as text, it doesn’t get the error notification that excel would normally add to the cell which states the number stored as text.

    Please help if you can.

    Like

    • Colin Legg says:

      Hi Debbi,

      This question doesn’t really have anything to do with approximate matches using VLOOKUP() which is what the post was about, but I’ll try to help you.

      In Excel all dates are really numbers which have just been formatted to look like dates. Looking at your formula:

      =IFERROR(VLOOKUP($B2,Sheet2!$A:$AA,26,FALSE)&"","")

      The bit of the formula which strikes out a chord is the &"".
      Let’s suppose that the VLOOKUP() part of your formula returns the number 41276 which, on my date system, is the equivalent of 2nd January 2013.
      Next the &"" will coerce that number into a string type. In other words, the number 41276 becomes the string "41276".
      The date formatting in your cell will only be applied to number types, so the cell will just show "41276" (without the "" because the "" are just tokens which denote a string).

      The error notification about numbers being stored as text only applies when the cell contains a constant. In this case the cell contains a formula so the green triangle will not be displayed.

      It’s not apparent to me why you have the &"" in the formula, so the way forward would be simply to remove it:
      =IFERROR(VLOOKUP($B2,Sheet2!$A:$AA,26,FALSE),"")

      I hope that helps,
      Colin

      Like

      • Debbi says:

        Hi Colin,

        Thanks for taking the time to look at my problem, even though I posted in the wrong place.

        Removing the &” ” does help, as in it will display the number in date format, however the reason behind having it in the formula is to display a blank if cell value to return (what is in column 26) is blank instead of 0/01/1900. If there is a better way to do that, I would appreciate the help.

        Thanks again,
        Debbi

        Like

      • Colin Legg says:

        Hi Debbi,

        I think the best way to deal with blanks would be to make your cell formatting a little bit fancier. Stick with the amended formula I suggested and then give it a custom format like this:
        dd/mm/yyyy;;

        Change the date format above as you need.

        Regards,
        Colin

        Like

      • Debbi says:

        Hi Colin,

        Awesome, you are fantastic, thank you so much for your help. I had been on this problem for ages and no matter how many google searches I did I couldn’t find the answer.

        Thanks again,
        Debbi

        Like

  5. Pingback: VLOOKUP works in some cells but not others. | Chris Memo

  6. Pingback: daily 08/01/2015 | Cshonea's Blog

Leave a comment