Website Gone, Blog On

A few people have asked me what’s happened to my old website (colinlegg.co.uk) which is no longer available. I’ve cancelled its contract - but don’t worry: I’ve backed up the old material and I’ll publish it on here as and when I can.

Posted in General | Tagged | 4 Comments

Working With Range.FormulaArray In VBA

1 Introduction

The Range.FormulaArray property is used to set or return the array formula of a range and works in a very similar way to the more familiar Range.Formula property. However, because of the fact that it is working in relation to array formulas, there are more restrictions to consider and it is slightly more difficult to use. In the context of this article, an ‘array formula’ can be considered to mean a formula which has been entered into the formula bar using CTRL+SHIFT+ENTER rather than ENTER, so that it has been enclosed by parentheses { }.

2 Returning A Formula From A Range

If you want to return a formula from a single cell, the Range.Formula and Range.FormulaArray properties both return exactly the same result regardless of whether that cell contains an array formula or not.

However, they return different results when they are applied to a contiguous, multi-cell range. If the range is an array range (a block of cells that shares a single array formula) then the FormulaArray property returns that formula. If the range is not an array range but all of the cells contain identical formulas, then the FormulaArray property will also return that common formula. If the range is not an array range and the cells do not contain identical formulas, then the FormulaArray property returns Null. In all three scenarios, the Formula property will return an array of Variants, with each element of the array representing a formula from each cell within the range.

3 Setting An Array Formula In A Range

According to the Remarks section in ‘Range.FormulaArray Property’ topic in the VBA help file, R1C1 reference style rather than A1 reference style should be used when setting an array formula. This isn’t strictly true, although it may make problem-shooting runtime errors more straightforward. I find A1 notation much easier to use, so I will use it in all the following examples and I’ll discuss the R1C1 vs. A1 reference style question in the problem-shooting section later on.

3.1 Setting A Single-Cell Array Formula

The two points to note are that the = sign at the beginning of the string is optional and that you should not include the parentheses { } which will automatically surround the array formula once it has been assigned.

'put array formula {=FREQUENCY(A2:A10, B2:B4)} into cell E2
Sheet1.Range("E2").FormulaArray = "=FREQUENCY(A2:A10, B2:B4)"

 3.2 Setting A Multi-Cell Array Formula

If the intention is for the block of cells to be an array range (sharing a single formula) then it is as straightforward as the previous example:

'put array formula {=A2:A10="hello"} into cells C2:C10
Sheet1.Range("C2:C10").FormulaArray = "=A2:A10=""hello"""

If the intention is for each cell in the block to have its own array formula then a little bit more work has to be done. Let’s compare a few different options using this array formula as an example:

{=MAX(IF((($E$2:$E$10=A2)+($F$2:$F$10=B2))=1,$G$2:$G$10))}

This formula returns the maximum value in I2:I10 where either (but not both) of the corresponding cells in column G equals A2 or column H equals B2.

Because we want the A2 and B2 references to adjust as we ‘fill’ the array formula down the column, we cannot use the Range.FormulaArray property as we have previously.

The first option is to use a loop, for example:

Sub Option1()

    Dim r As Long

    For r = 2 To 5
        Sheet1.Cells(r, 3).FormulaArray = _
            "=MAX(IF((($E$2:$E$10=A" & r & ")+($F$2:$F$10=B" & r & "))=1,$G$2:$G$10))"
    Next r

End Sub

Every formula will be calculated irrespective of calculation settings with this option.

The second option is to populate the first cell and then copy/paste or fill it down:

Sub Option2()

    Sheet1.Range("C2").FormulaArray = _
        "=MAX(IF((($E$2:$E$10=A2)+($F$2:$F$10=B2))=1,$G$2:$G$10))"

    Sheet1.Range("C2:C5").FillDown

End Sub

This method will copy not only fill down the formulas but also the formats etc., which may be an advantage or a disadvantage depending on the situation. If calculations are set to manual then the ‘filled in’ cells will not be calculated. In my testing this method seems to be the fastest even with calculations set to automatic.

The third option is to populate the first cell and then copy it onto the clipboard and paste special formulas:

Sub Option3()

    Sheet1.Range("C2").FormulaArray = _
        "=MAX(IF((($E$2:$E$10=A2)+($F$2:$F$10=B2))=1,$G$2:$G$10))"

    Sheet1.Range("C2").Copy
    Sheet1.Range("C3:C5").PasteSpecial xlPasteFormulas
    Application.CutCopyMode = False

End Sub

This avoids copying down the formats etc. but in my testing this method seems to be much slower than all the others. If calculations are set to manual then the formulas that have been pasted will not be calculated.

A final option is as follows:

Sub Option4()

    With Sheet1.Range("C2:C5")
        'step 1
        .Formula = "=MAX(IF((($E$2:$E$10=A2)+($F$2:$F$10=B2))=1,$G$2:$G$10))"

        'step 2
        .FormulaArray = .FormulaR1C1
    End With

End Sub

The formula assignment in step 1 can be done with either the Formula or FormulaR1C1 properties. However, the FormulaR1C1 property must be used in step 2 because using the Formula property would cause the relative references to become distorted down the range. This approach performs fairly similarly to the Option 1 loop approach.

3.3 Problem-Shooting

When you’re trying to use the Range.FormulaArray property you may get the following error:

Run-time error ’1004′: ‘Unable to set the FormulaArray property of the Range class’

 The message doesn’t contain a lot of useful information so determining the cause of the problem can be quite tough. Here are some reasons why you may be getting this error message:

3.3.1 You Are Trying To Change Part Of An Array Range

For example, this code will fail:

'create an array range
Sheet1.Range("C2:C10").FormulaArray = "=A2:A10=""hello"""

'try to change part of an array range gives an ERROR
Sheet1.Range("C2").FormulaArray = "=A2:A10=""hello"""

You have to clear the array range first or change the entire array range at the same time. You can determine if a cell is part of an array range as follows:

With Sheet1
    'create an array range
    .Range("C2:C10").FormulaArray = "=A2:A10=""hello"""

    With .Range("C2")
        'check if C2 is part of an array range
        If .HasArray Then

            'what is the full array range?
            MsgBox .CurrentArray.Address
        End If
    End With
End With

3.3.2 You Are Trying To Put An Array Formula Into A Merged Cell

It is possible to put an array formula into a cell and then merge that cell with other cells, but you cannot put an array formula into a cell that has already been merged. For example, this code will fail:

'create some merged cells
Sheet1.Range("C2:C10").Merge

'try to set an array formula gives an ERROR
Sheet1.Range("C2").FormulaArray = "=A2:A10=""hello"""

You can check if a cell is part of a merged range as follows:

With Sheet1
    'create some merged cells
    .Range("C2:C10").Merge

    'check if C2 is part of a merged range
    With .Range("C2")
        If .MergeArea.Address = .Address Then
            MsgBox "Cell not merged"
        Else
            MsgBox "Cell is merged, merged range = " & .MergeArea.Address
        End If
    End With
End With

3.3.3 Your Array Formula Contains A Syntax Error Such As A Missing Or Invalid Argument

With Sheet1.Range("E2")
    'this will give an error because argument in SUM() function missing
    .FormulaArray = "=SUM()"

    'this will give an error because SUMIF() cannot accept an array data type
    'passed into its 1st or 3rd parameters: http://support.microsoft.com/kb/214286/
    .FormulaArray = "=SUMIF((A2:A19=1)*(B2:B19),B2,C2:C19)"
End With

3.3.4 Your Array Formula Exceeds 255 Characters

This issue is described on the following MS Support article:

http://support.microsoft.com/kb/213181

More specifically:

  • If you are using A1 notation then the R1C1 equivalent must be less than 255 characters. I picked this information up from MS MVP Rory Archibald.
  • If you are using R1C1 notation then the formula must be less than 256 characters.

A workaround is using the Range.Replace() method as demonstrated at DailyDoseOfExcel:
http://www.dailydoseofexcel.com/archives/2005/01/10/entering-long-array-formulas-in-vba/

I always use a variable to build up and hold the string representing the array formula I want to apply. I find it makes my code easier to read and debug. Another useful tip is that the Application.ConvertFormula() method can be used to easily convert strings between A1 and R1C1 notation (as well as toggling relative or absolute referencing).

Posted in Microsoft Excel, Microsoft Office | Tagged , , , , , , , , , , , , , , | 2 Comments

VLOOKUP Week Mug!

Towards the end of March I published three posts on here which were especially written for VLOOKUP week. If you missed VLOOKUP week then it’d be well worth your time to check out some of the fabulous articles written by Excel experts from all over the world. I managed to submit my first article in time to qualify for a VLOOKUP week mug which arrived in the post this morning!

VLOOKUP Week Mug

Posted in Off Topic | Tagged | Leave a comment

Get The Last Used Row In VBA Using Range.End

In the last post we saw a really flexible function which can pull out the last used row from a range of any dimensions. Sometimes I use a different approach though: if I know I want to look at only a single column then I might well use the Range.End property instead. Why? Well, quite simply, it’s faster.

In terms of looking for the last row of data there are two possible ways we can use Range.End:

  • Range.End(xlUp)
  • Range.End(xlDown)

These two mimic your manual actions on a worksheet when you press CTRL+[UP ARROW] and CTRL+[DOWN ARROW] respectively.

So, should we start at the top of the column and look downwards, or start at the bottom of the column and look upwards? Well, the problem with starting at the top is that if there are any gaps in the data then Range.End(xlDown) will find the gap and not the last used row. For that reason it’s best to start at the bottom and work up. To get the bottom of the column we can use Worksheet.Rows.Count, which will return 65,536 if we have an Excel 97-03 worksheet or 1,048,576 if we have an Excel 2007+ worksheet. If the entire column is empty then you’ll get 1 in the Immediate window (in the VBA IDE press CTRL+G to show the Immediate window) when you run this code:

Sub Example1()

    Dim rngLastCell As Range

    'what is the last cell (with or without data) in Sheet1!A:A ?
    Set rngLastCell = Sheet1.Cells(Sheet1.Rows.Count, "A")

    'what is the last used row in Sheet1!A:A ?
    Debug.Print rngLastCell.End(xlUp).Row

End Sub

One small detail is that if the very last cell in the column contains data then this piece of code will miss it. This is a very unusual situation so normally VBA’ers ignore it but, if it is a concern to you, then you’ll need to add an extra check:

Sub Example2()

    Dim rngLastCell As Range

    'what is the last cell (with or without data) in Sheet1!A:A ?
    Set rngLastCell = Sheet1.Cells(Sheet1.Rows.Count, "A")

    'what is the last used row in Sheet1!A:A ?
    If IsEmpty(rngLastCell.Value2) Then
        'the last cell is empty so we look upwards
        Debug.Print rngLastCell.End(xlUp).Row
    Else
        'the last cell has data so it is the one we want
        Debug.Print rngLastCell.Row
    End If

End Sub

Finally, be aware that Range.End will not correctly find the last used row in a column if that row has been hidden – for example, by an autofilter.

Posted in Microsoft Excel, Microsoft Office | Tagged , , , , , , | Leave a comment

Get The Last Used Row In VBA Using Range.Find

Here’s one of my favourite, custom functions, GetLastUsedRow.

Public Function GetLastUsedRow( _
                        ByVal rngToCheck As Range, _
                        Optional lngLookIn As XlFindLookIn = xlFormulas _
                                  ) As Long

    
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
    'This function finds the last row containing data in a given range.'
    '                                                                  '
    'If the range is empty then the function                           '
    'returns the first row of the range.                               '
    '                                                                  '
    'The function will not find the last row correctly                 '
    'if it has been hidden by filtering                                '
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    Const strTOFIND As String = "*"
    
    Dim rngLast As Range
    
    Set rngLast = rngToCheck.Find( _
                    What:=strTOFIND, _
                    LookIn:=lngLookIn, _
                    SearchOrder:=xlByRows, _
                    SearchDirection:=xlPrevious)

    If rngLast Is Nothing Then
        GetLastUsedRow = rngToCheck.Row
    Else
        GetLastUsedRow = rngLast.Row
    End If
    
End Function

It’s really flexible because it can be used on any type of range and, in particular, across multiple columns. Here are some examples:

Sub Examples()

    'from a block of cells
    MsgBox CStr(GetLastUsedRow(Sheet1.Range("D2:E8")))
    
    'from 3 entire columns
    MsgBox CStr(GetLastUsedRow(Sheet1.Range("D:F")))
    
    'from an entire sheet
    MsgBox CStr(GetLastUsedRow(Sheet1.UsedRange))
    
End Sub

The key points you need to be aware of are:

  • If the range is empty then the function will return the first row of the range.
  • If the last row has been hidden by filtering or only contains a cell prefix character then the function will not pick it up.
  • In Excel 2002 or later the function can also be used as a UDF.

Posted in Microsoft Excel, Microsoft Office | Tagged , , , | 1 Comment

SUMIF From VBA

There are three ways we can be involved with SUMIF() when we are using VBA.

  • The Range.Formula(), Range.FormulaR1C1() and Range.FormulaArray() Properties

This is a straightforward process. To enter a formula containing SUMIF() we use the Range.Formula or Range.FormulaR1C1 properties:

Sub Example1()

    '=SUM(SUMIF(B2:B7,{"Smith";"Williams"},D2:D7))
    Sheet1.Range("C1").Formula = "=SUM(SUMIF(B2:B7,{""Smith"";""Williams""},D2:D7))"

End Sub

If the formula needs to be a CSE formula, we use the Range.FormulaArray property:

Sub Example2()

    '{=SUM(SUMIF(B2:B7,F2:F3,D2:D7))}
    Sheet1.Range("C1").FormulaArray = "=SUM(SUMIF(B2:B7,F2:F3,D2:D7))"

End Sub

Just as when you manually type in an array formula, you should not include the { } at the beginning and end. Excel will automatically add these for you.

  • The WorksheetFunction Class

If we want to conditionally sum using the SUMIF() worksheet function in VBA, we can access it via the WorksheetFunction class.

Sub Example3()

    Dim varResult As Variant

    '=SUMIF(B2:B7,"Smith",D2:D7)
    varResult = Application.WorksheetFunction.SumIf( _
                    Arg1:=Sheet1.Range("B2:B7"), _
                    Arg2:="Smith", _
                    Arg3:=Sheet1.Range("D2:D7"))

End Sub

With this method, if SUMIF() returns an error then we have to add an error handler in VBA to handle it. I have included the named arguments in the example so you can see how uninformative they are and the little benefit intellisense gives us. Ideally we would like to see range, criteria and sum_range, but instead we have to work with arg1,arg2 and arg3.

Another way is to call it from the Application class:

Sub Example4()

    Dim varResult As Variant

    '=SUMIF(B2:B7,"Smith",D2:D7)
    varResult = Application.SumIf( _
                    Sheet1.Range("B2:B7"), _
                    "Smith", _
                    Sheet1.Range("D2:D7"))

End Sub

Using this approach, an error returned by SUMIF() will be held within the vResult variable which we can then handle with the CVErr() function. For further information and comparisons between these approaches have a read of this thread at dailydoseofexcel.

We come back to the earlier point (section 3.1) that the 1st and 3rd parameters expect range references. When you are calling SUMIF() via the WorksheetFunction class, do not be tempted to include any additional brackets around the range argument you are passing in.

Sub Example5()

    Dim varResult As Variant

    'type mismatch or object required error
    varResult = Application.WorksheetFunction.SumIf( _
                    (Sheet1.Range("B2:B7")), _
                    "Smith", _
                    Sheet1.Range("D2:D7"))

End Sub

The extra brackets around Sheet1.Range("B2:B7") cause that range to be de-referenced into an array which, as was mentioned the previous blog post, is unacceptable.

  •   Application.Evaluate() and Worksheet.Evaluate() Methods

For the more complex SUMIF() formulas, the easiest way to calculate them directly within VBA is to use the Worksheet.Evaluate method. This method (more often than not) gives the same result as when the formula has been CSE entered, so we can treat normal and CSE formulas in the same way.

Sub Example6()

    Dim varResult As Variant

    '=SUM(SUMIF(B2:B7,{"Smith";"Williams"},D2:D7))
    varResult = Sheet1.Evaluate("=SUM(SUMIF(B2:B7,{""Smith"";""Williams""},D2:D7))")

    '{=SUM(SUMIF(B2:B7,F2:F3,D2:D7))}
    varResult = Sheet1.Evaluate("=SUM(SUMIF(B2:B7,F2:F3,D2:D7))")

End Sub

Again, errors returned by the evaluated expression will be stored within the variant variable. The Evaluate method is qualified by the Sheet1 codename, so all of the unqualified range references within the string expression will be considered to belong to that sheet.

Posted in Microsoft Excel, Microsoft Office | Tagged , , , , , , , , | Leave a comment

SUMIF Formulas

1 Introduction

At some point in your Excel work you are almost certainly going to want to conditionally sum a range of cells. If you find yourself in that situation then SUMIF() is probably going to be what you need: it’s a very important Excel worksheet function which can be used to sum a range of cells based upon a given criterion. The great news is that, once you’ve invested some time in understanding SUMIF(), it will take you very little effort to get up to speed with the other “IF” worksheet functions: COUNTIF(), COUNTIFS(), AVERAGEIF(), AVERAGEIFS() and SUMIFS(). This post is an updated version of a knowledge base article I wrote a few years ago for an online forum.

2 Syntax

SUMIF() has three parameters which appear in this order:

SUMIF(range, criteria, [sum_range])

  • Range is the area of the worksheet where you want to check if a condition has been met.
  • Criteria is the condition that must be met within range.
  • If the criteria is met in  range then the corresponding cells in sum_range will be summed.

In the example below we have a simple worksheet to illustrate the syntax. Column B contains a list of names and column D has a list of corresponding amounts. We want to determine the sum of all amounts relating to Smith so we add a criterion into a spare cell and construct a SUMIF() formula in cell G2. Each cell in column B is checked to see if it contains the word Smith and, if it does, the corresponding cell from column D is included in the sum. The result is 1 + 4 = 5.

 

Before we look at some more examples, let’s go through some important rules and gotchas which you need to be aware of. In particular, let’s have a look at the values (arguments) which can be passed into each parameter.

3 Range and Sum_Range

3.1 Range References Only!

The argument you pass into the range parameter must be a range reference: if you try to pass in anything else then Excel’s expression evaluator will reject it and tell you that the formula contains an error. In this screenshot I have attempted to pass an array constant into it:

The sum_range parameter is optional. If you use it, you must also pass in a range reference. As noted in the help file’s remarks, if it is omitted then the cells in the range parameter are summed.

As we have discussed above, you cannot pass an array into the first and third parameters of the SUMIF() worksheet function. One implication of this is that we cannot use SUMIF() on closed workbooks. When the referenced workbook is closed, the ranges within that workbook which we have referred to in our formula become de-referenced into arrays and therefore SUMIF() returns an #VALUE! error. The workaround is to use an array or SUMPRODUCT() formula, which is shown in the link titled “SUMIF, COUNTIF, COUNTBLANK Return #VALUE! Error” provided in the links section at the end.

3.2 Dimensions of Sum_Range

There is an interesting point to be made regarding the dimensions of the range and sum_range parameters: if you include a sum_range range, it only ever defines the top left cell of the range to be summed. The remaining area of the sum_range is actually defined by the dimensions of the range argument. In the below example only D2 has been specified in the sum_range parameter but SUMIF() looks at the dimensions of range – 1 column x 6 rows – and therefore considers the sum_range to be D2:D7, which is 1 column x 6 rows starting from D2. The total is still 5.

  

However, in earlier versions of Excel having range and sum_range unequally sized can cause problems: there is a link in section 6.2 called “XL2000: Formula Using SUMIF() Worksheet Function Does Not Recalculate Sum” which describes the issue.

In more recent versions of Excel (post XL 2000) this problem has been fixed, but the trade-off is that SUMIF() formulas where sum_range has less rows or columns than range are volatile. Therefore I recommend that it is good practice to make the range and sum_range equal sizes.

4 Criteria

So far we have covered the range and sum_range parameters but we haven’t had a proper look at the criteria parameter. The help file only touches on some of the expressions we can use.

4.1 Criteria Data Types

 We can pass in many different data types or references including:

  • References to ranges or names
  • Strings
  • Numbers
  • Logical values
  • Error values
  • Arrays

Usage of these data types except for error values is covered in the examples in the remainder of this article. Section 4.4 specifically addresses arrays.

We can also build up expressions which include comparison operators and wildcards. Firstly, let’s have a look at the comparison operators.

4.2 Comparison Operators

  • =        Equal to
  • <>      Not Equal To
  • >        Greater Than
  • >=      Greater Than Or Equal To
  • <        Less Than
  • <=      Less Than Or Equal To
4.2.1 Equal To

In the examples we’ve seen so far we have implicitly used the equals operator. The equal to operator should be used when you want to conditionally sum where range equals a particular value. The following formulas are equivalent of each other; equals in the first example is implied.

=SUMIF(B2:B7,"Smith",D2:D7)

=SUMIF(B2:B7,"=Smith",D2:D7)

It’s important to note that the condition is not case sensitive. The following are equivalent:

=SUMIF(B2:B7,"Smith",D2:D7)

=SUMIF(B2:B7,"SMITH",D2:D7)

It’s often considered good practice to place the criteria argument in a cell and to reference it from there. To replicate the first form above we can reference the cell directly; to reference the second form above we can use the concatenation operator:

=SUMIF(B2:B7,F2,D2:D7)

=SUMIF(B2:B7,"="&F2,D2:D7)

The first of these two formulas is the one we used in the syntax section.

It is important to note that whilst these two formulas will often produce the same results, they are actually different due to the string data type coercion in the second formula. If F2 is empty then the first formula will sum cells in D2:D7 when corresponding cells in B2:B7 are 0; the second formula will sum cells in D2:D7 when corresponding cells in B2:B7 are empty.

4.2.2 Not Equal To

The not equal to operator should be used when you want to conditionally sum where range does not equal a particular value. Let’s sum the amount paid for names which are not equal to Smith:

=SUMIF(B2:B7,"<>Smith",D2:D7)

We can also place Smith in a cell and reference it from there:

  

The answer is -2 + -3 + 5 + 6 = 6.

A third way would be to place <>Smith in cell F2 and reference it from there:

=SUMIF(B2:B7,F2,D2:D7)

4.2.3 Greater Than / Greater Than Or Equal To / Less Than / Less Than Or Equal To

These operators are typically used for number comparisons. To sum only positive numbers from our amount column we can use formulas like this:

=SUMIF(D2:D7,">0")

=SUMIF(D2:D7,">"&F2)

Again, take care with your data types. For the second formula to work, F2 must contain a 0 (don’t leave it empty).

The answer is 1 + 4 + 5 + 6 = 16

You can also place >0 in F2 and then reference F2:

=SUMIF(D2:D7,F2)

These operators can also perform non-case sensitive text comparisons. Let’s get a total amount for all surnames beginning with “s” or later:

The answer is 1 + -2 + 4 + 5 + 6 = 14

Now let’s have a look at some wildcard tokens.

4.3 Wildcards

The wildcards we have available to us are:

  • *     represents any number of characters
  • ?      represents a single character
  • ~     means that the next character should be treated as a literal

The wildcards are only applicable when range contains text.

Perhaps we want to sum amounts where Smith appears anywhere (for example in double-barrel names) and to include names such as Smithies:

=SUMIF(B2:B7,"*Smith*",D2:D7)
=SUMIF(B2:B7,"*"&F2&"*",D2:D7)

The answer is 1 + 4 + 5 + 6 = 16.

Perhaps we want to sum where the name is exactly five letters long:

=SUMIF(B2:B7,"?????",D2:D7)

The REPT() worksheet function is very useful in this situation because it allows us to dynamically control the number of ? repeated wildcards in the criteria expression:

  

The answer is 1 + -3 + 4 = 2.

The ~ wildcard should be used in situations where you want either the *, ? or ~ wildcards to be treated as literals rather than as wildcards. So, to have * treated as a literal you would use "~*".

The next sub-section looks at the more advanced topic of passing array arguments into the criteria parameter.

4.4 Using Arrays In the Criteria Parameter

SUMIF() is generally considered to only be capable of summing by one criterion. This is true but, in certain circumstances, we can use an array in the criteria parameter to expand the capability.

If we want to sum the amounts where the name is Smith OR Williams then we could use a formula like this:

=SUMIF(B2:B7,"Smith",D2:D7)+SUMIF(B2:B7,"Williams",D2:D7)

However there is another way of doing this which is more concise when the number of OR conditions increases. Instead we can use this:

=SUM(SUMIF(B2:B7,{"Smith";"Williams"},D2:D7))

This formula, which uses an inline, vertical array constant, does not need to be CSE entered.

When an array is passed into criteria, SUMIF() has been registered to return an array as an answer. The returned array will have the same dimensions as the criteria array. In this case we passed in a 2 element vertical array so the result returned by SUMIF() will also be a 2 element vertical array. The first element in the result is the sum corresponding to Smith; the second element is the sum corresponding to Williams. We want the sum of the result array so we simply wrap the SUMIF() function with the SUM() function.

SUMIF() returns an answer of {5;-2}. The 5 is the sum for Smith; the -2 is the sum for Williams. SUM() adds these together: 5 + -2 = 3.

We require specific circumstances to be able to condense the formula down in this manner:

  • The range and sum_range arguments must be the same for all criteria.
  • The criteria must be mutually exclusive: this is very important because, if they are not, then any overlap will result in the same value being summed multiple times. It may be possible to add an adjustment to the formula to allow for values which meet multiple criteria, but the adjustment would have to be carefully designed to ensure accurate results.

Of course, we may not always want to use an inline array constant. If we want to use a named array constant, then either the formula has to be CSE entered or we have to use SUMPRODUCT() instead of SUM():

With a named array constant CRITERIA defined as ={"Smith";"Williams"}

=SUM(SUMIF(B2:B7,CRITERIA,D2:D7))                CSE Entered

or

=SUMPRODUCT(SUMIF(B2:B7,CRITERIA,D2:D7))

We also may wish to use a range of cells containing the criteria. Suppose F2 and F3 contain Smith and Williams respectively, we also have to use either of the CSE or the SUMPRODUCT() constructions:

=SUM(SUMIF(B2:B7,F2:F3,D2:D7))                       CSE Entered
=SUMPRODUCT(SUMIF(B2:B7,F2:F3,D2:D7))

The reason we have to use these constructions for named array constants and ranges is that we have to   explicitly de-reference them to arrays. In these examples I have passed vertical array/range arguments into the criteria parameter but please note that horizontal or 2-D arguments are equally viable.

Note that we can also use wildcards within the array.

There are other SUMPRODUCT() formula constructions which do not use SUMIF() which can calculate this result. However, testing shows that this approach is more efficient than other SUMPRODUCT() variations.

We have now covered all of the main variations. In the following section we will look at some SUMIF() frequently asked questions.

5 Frequently Asked Questions

  • How Can I Sum Cells Which Correspond To Empty Cells?

=SUMIF(A2:A10,"=",B2:B10)

  • How Can I Sum Cells Which Correspond To Non-Empty Cells?

=SUMIF(A2:A10,"<>",B2:B10)

  • How Can I Sum Cells Which Contain Error Values?

=SUMIF(A2:A10,"<="&9.9999999999999E307)

To maintain international compatibility (the . decimal separator may be an issue) the big number is concatenated with the <= operator. Under the Excel specifications and limits topic in the Excel help file you can see that 9.99999999999999E307 is the largest number allowed to be typed into a cell. A side effect of concatenating the number means that we have to drop the last digit in our formula. These two alternatives are also tidy:

=SUM(SUMIF(A2:A10,{"<0",">0"}))

=SUMIF(A2:A10,"<0")+SUMIF(A2:A10,">0")

  • How Can I Sum Cells Which Correspond To Text?

If you have a range containing a mixture of data types and you only want to sum cells containing text, then these two variations could be useful:

=SUMIF(A2:A10,"*",B2:B10)

=SUMIF(A2:A10,"?*",B2:B10)

The difference between them is that the first one will sum corresponding cells in column B where there is a formula which returns an empty string in column A, e.g. ="". Note that both of these ignore non-text types (ie. errors, booleans, numbers) in range.

  • How Can I Sum Between Two Dates?

With the start and end dates to be referenced in cells E2 and E3, we can use:

=SUMIF(A2:A10,">="&E2,B2:B10)-SUMIF(A2:A10,">"&E3,B2:B10)

This formula is a little tricky to come up with, but once we have it in front of us we can see that it is quite simple. The first SUMIF() adds up all the cells in B2:B10 where the corresponding cells in A2:A10 are greater than or equal to the begin date. Then we just have to subtract any numbers which fall after the end date, which we do with the second SUMIF().

In Excel 2007 and later we can use SUMIFS() which is a better option.

Another way of calculating this with a formula is to use SUMPRODUCT(), but this is not as good because SUMPRODUCT() is more resource hungry than two SUMIF() functions, so the formula will be slower to calculate.

  • How Can I Sum Values Based On Two Conditions?

It depends on what the two conditions are but usually when this question is asked the requirement is to sum only when two separate columns are both equal to certain values.

We can’t do this using SUMIF() directly on the source data because we can’t synchronise the criteria across the two columns. A workaround is to use a helper column which will indicate whether or not both criteria have been met on each row. Here’s an example where we only want the sum of Amount when both the First Name is Michael and the Surname is Smith:

There are various formulas we can use in the helper column. In this example we added this formula into cell F2 and then copied it down to row 7:

=AND(B2=H$2,C2=I$2)

When both conditions are met the formula returns TRUE, otherwise it returns FALSE. If there was an error value in B2:C7 then it would return an error value so in that situation we would have to use a slightly different helper column formula.

We then just sum the amounts where the helper column is TRUE using the SUMIF() formula in cell J2.

Again, there are numerous alternatives to using SUMIF() to do this. In Excel 2007 and later the SUMIFS() worksheet function is a better alternative which does not require a helper column. DSUM(), SUMPRODUCT(), SUM() CSE formulas and pivot tables are also frequently used.

  • How Can I Sum A Range Which Has Different Dimensions To The Criteria Range?

Section 3.2 explained that the dimensions of sum_range are determined by the dimensions of range. This means that with SUMIF() the most practical way to achieve this task is almost always by using a helper column for the sum_range which has the same dimensions as range.

In the example below each person has three amounts and we want to find the total amount for anyone with the name Smith. We use a helper column to create individual totals (which is probably useful information in itself) and then we reference that column from the SUMIF() formula.

The formula placed in G2 and filled down the G column is a simple SUM() formula:

=SUM(D2:F2)

Again, there are alternatives such as SUMPRODUCT() or a pivot table.

6 Links, Bugs and Further Reading

6.1 Microsoft Office Online

 6.2 Microsoft Help and Support

7 Acknowledgements

  • www.xtremevbtalk.com member kassyopeia for some great ideas and suggestions, particularly on array criteria (section 4.4).
  • Luke Wisbey for pointing out that SUMIF() can be volatile (section 3.2).

Posted in Microsoft Excel | Tagged , , , , , , , , , , , , , , , , , , | 2 Comments

Using WorksheetFunction.VLookUp To Look Up A Date In VBA

VLOOKUP is exposed in VBA via the WorksheetFunction class and works much the same as it does when you use it in a worksheet, but one thing you have to be careful of is when you have a date as the lookup_value. Let’s use the data from a previous blog post to demonstrate the problem:

The formula in cell I2 clearly works without issue, but when we replicate it in VBA we may get this error:

Run-time error ’1004′: Unable to get the VLookup property of the WorksheetFunction class.

This error means that it can’t find the lookup_value in the lookup column (equivalent of #N/A) and, in this case, we will get this error whether we take the lookup_value date directly from the H2 cell using Range.Value or if we use a Date variable. You can try running the two procedures below to see this for yourself.

Sub Error_Example1()

    Const blnRANGE_LOOKUP As Boolean = False
    Const lngCOL_INDEX_NUM As Long = 2
    
    Dim varResult As Variant
    Dim rngLookUpValue As Range
    Dim rngTableArray As Range
    
    With Sheet1
        Set rngLookUpValue = .Range("H2")
        Set rngTableArray = .Range("E2:F11")
    End With
    
    'Runtime Error 1004
    varResult = Application.WorksheetFunction.VLookup( _
                                                rngLookUpValue.Value, _
                                                rngTableArray, _
                                                lngCOL_INDEX_NUM, _
                                                blnRANGE_LOOKUP)

End Sub

Sub Error_Example2()

    Const blnRANGE_LOOKUP As Boolean = False
    Const lngCOL_INDEX_NUM As Long = 2
    Const dteLOOKUP_VALUE As Date = #1/3/2012# '3rd Jan 2012
    
    Dim varResult As Variant
    Dim rngTableArray As Range
    
    Set rngTableArray = Sheet1.Range("E2:F11")
    
    'Runtime Error 1004
    varResult = Application.WorksheetFunction.VLookup( _
                                                dteLOOKUP_VALUE, _
                                                rngTableArray, _
                                                lngCOL_INDEX_NUM, _
                                                blnRANGE_LOOKUP)

End Sub

To understand why there is a problem we have to put on both our Excel and VBA hats. In the previous post, Why Does VLOOKUP Return An #N/A Error?, we saw that VLOOKUP is data type sensitive: for example, VLOOKUP considers the string “3″ to be different to the number 3. The post also mentioned that Excel worksheets can only contain five data types: numbers (also known as doubles), text (also known as strings), logicals (also known as booleans), error values and arrays. You’ll notice that Excel does not have a Date data type: in Excel, dates and times are numbers which have just been formatted to look like dates. In VBA, however, there is a Date data type and, in this case, when we use the Range.Value property to retrieve the value from the cell H2, it returns a Date type. VLOOKUP is unable to match this Date type to the numbers (formatted as dates) the the lookup column, which is why we get the error.

There are a number of workarounds to the issue. Essentially we have to make sure that the lookup_value data type matches the data types in the lookup column. Since there’s only one lookup_value it makes sense that we concentrate on changing its type rather than anything else.

If you are reading the date lookup_value from a cell then you can either just pass the range reference into VLOOKUP (exclude the .Value), or you can use the Range.Value2 property instead of the Range.Value property. The Range.Value2 property doesn’t use the Date data type so, by using Range.Value2 instead, we get a Double data type holding the value 40911. VLOOKUP will be able to process that quite happily. Another option would be to change the formatting of the cell H2 to that which represents a number but it would not be a very robust solution. Here’s the first error example re-written so that it uses the Range.Value2 property:

Sub Correct_Example1()
    
    Const blnRANGE_LOOKUP As Boolean = False
    Const lngCOL_INDEX_NUM As Long = 2
    
    Dim varResult As Variant
    Dim rngLookUpValue As Range
    Dim rngTableArray As Range
    
    With Sheet1
        Set rngLookUpValue = .Range("H2")
        Set rngTableArray = .Range("E2:F11")
    End With
    
    'correctly returns 0.957583452240653
    varResult = Application.WorksheetFunction.VLookup( _
                                                rngLookUpValue.Value2, _
                                                rngTableArray, _
                                                lngCOL_INDEX_NUM, _
                                                blnRANGE_LOOKUP)
    
End Sub

Yet another option, which will work whether you are picking up the date from a cell or using a Date variable, is to explicitly convert the Date into a Double or Long data type using the CDbl or CLng functions. Note that Long data types would apply for whole dates only. Here’s the second error example re-written with the CDbl function included:

Sub Correct_Example2()
    
    Const blnRANGE_LOOKUP As Boolean = False
    Const lngCOL_INDEX_NUM As Long = 2
    Const dteLOOKUP_VALUE As Date = #1/3/2012#  '3rd Jan 2012
    
    Dim varResult As Variant
    Dim rngTableArray As Range
    
    Set rngTableArray = Sheet1.Range("E2:F11")
    
    'correctly returns 0.957583452240653
    varResult = Application.WorksheetFunction.VLookup( _
                                                CDbl(dteLOOKUP_VALUE), _
                                                rngTableArray, _
                                                lngCOL_INDEX_NUM, _
                                                blnRANGE_LOOKUP)
    
End Sub

Posted in Microsoft Excel, Microsoft Office | Tagged , , , , , , , | Leave a comment