Self Extending UDFs – Part 1

If you’re an advanced Excel user then I’m sure that at some point you’ve become frustrated with the clumsiness of array formulas. I’m talking about the situations where you have to fiddle around with your worksheet in an attempt to resize an array formula so the number of cells it belongs to (the array range) matches the size of the array it returns. If you make the array range too small then you risk not displaying an important piece of information; make it too large then you get unsightly #N/A errors for the elements which are out of range. To cap it off you have to put up with that galling ‘You cannot change part of an array’ error which is particularly prevalent you’re trying to shrink the formula. Ugh.

Of course, practice and various tools can make this a bit easier, but wouldn’t it be nice if the formulas just knew how big or small they should be? Darn right, it would. To that end, I’ve created a VBA add-in called RADExcelFormulaExtensions.xlam which has a couple of worksheet functions, rxlEXTEND() and rxlRESIZE(), which provide two different ways of letting the formula manage itself. Curious? Windows user? Read on.

First, a disclaimer. There are a huge number of limitations when you’re using these two functions, so I’m not for one minute suggesting that they should become the de facto method for handling multi-cell array formulas. They’re not ‘best practice’. If you have an industrial, complex workbook then you shouldn’t use them in it. The space for these functions is as a useful, day-to-day utility to make a quick bit of analysis easier to stomach. If you have UDFs which return arrays from databases/services then you’re going to love these puppies.

So How Do I Get My Hands On The Add-In?

The add-in can be downloaded from here but it will only work for Windows users with Excel 2007 or later. It should work in 64-bit Excel but I haven’t tested it so if you have any problems then please let me know.

I’m assuming you’re an advanced Excel user so you already know how to load an add-in into your Excel session. If you’ve done it correctly, you should see a small section titled RAD Excel Formula Extensions on the Formulas tab of the ribbon which looks something like this:

RibbonX

More on what those options do later. For now, all you need to know is that, once you’ve loaded the add-in, you’re ready to rock and roll.

There is so much I could write about on this add-in that this post would end up being the size of a book. Well, a small book. For now, I’m just going to cover how to use these functions and exactly what it is they do from an end-user standpoint. I’ll let your feedback drive what I write about in follow-up posts and, as and when time allows, I also plan to add more functions to the add-in.

rxlRESIZE()

The rxlRESIZE() function can be used in a formula so that it automatically resizes to fit the array it returns. It was initially based on the Resize() function written by Govert van Drimmelen for his fabulous Excel-DNA product, with a few customisations to help control its behaviour on the worksheet.  Enough chat. Let’s see it in action.

Watch what happens when I enter this simple formula:

=C1:C10

simpleformulanoextension

Not a lot. It just returns the first value in the referenced range. Now watch what happens when I wrap the same formula with the rxlRESIZE() function:

=rxlRESIZE(C1:C10,-1,1)

SimpleResize

The rxlRESIZE() function works out that the formula needs to be 10 rows x 1 column and resizes the array formula to fit those dimensions.

If I insert a cell in the referenced range, the formula resizes itself again:

ResizeNewRow

Pretty cool, eh? Let’s do a more complicated one.

In the next example, I’ve written a VBA UDF called LISTUNIQUE() which lists the unique values (values which occur only once) within a given range. The problem with listing unique values is you never know how many there could be. If the source range is ten cells then there could be no unique values or there could be ten unique values. It could change too if the values in the cells are changed. Because there could be up to ten unique values, I have to enter the array formula across ten cells to make sure that all the unique values are displayed. The problem is that if there aren’t ten unique values then I get #N/A errors:

ListUniqueNoResize2

Let’s rxlRESIZE() it instead:

ListUniqueWithResize

Since my LISTUNIQUE() UDF can check multiple columns but always returns values in a single column, instead of using -1 (see syntax section below) for the maximum number of rows, I could use:

=rxlRESIZE(LISTUNIQUE(C1:C10),ROWS(C1:C10)*COLUMNS(C1:C10),1)

If the size of C1:C10 is changed then the maximum number of columns will change accordingly.

I haven’t included the source code for the LISTUNIQUE() function because it’s not part of the add-in, but there are plenty of examples out there if you want to give this particular scenario a try yourself.

rxlRESIZE() Syntax

=rxlRESIZE(ResizeValues,[MaxRows],[MaxCols])

  • ResizeValues is a Variant and can be a literal, an array of literals or a Range. It can also be a set of functions which return any of these types.
  • [MaxRows] is optional. It is a Long which indicates the maximum number of rows the formula can be resized to. The default value is 0. If a value less than 0 is used then the formula will resize to as many rows as is required to show all the row values in ResizeValues. If a value of 0 or 1 is used then the maximum number of rows will be 1.
  • [MaxCols] is optional. It is a Long which indicates the maximum number of columns the formula can be resized to. The default value is 0. If a value less than 0 is used then the formula will resize to as many columns as is required to show all the column values in ResizeValues. If a value of 0 or 1 is used then the maximum number of columns will be 1.

If MaxRows and MaxCols are both either 0 or 1 then no resizing will occur. I have set this as the default to stop the user from accidentally clearing/overwriting data in the worksheet. The numbers provided in these arguments have a base value of 1 so if you specify 10 and 10 then it will resize to no larger than a 10×10 matrix.

rxlRESIZE() Limitations and Features

  • Excel 2007 or later
  • Windows operating system
  • Function must be outermost one in the formula
  • If worksheet contents are protected then the relevant cells must not be locked
  • Should be avoided with slow formulae because the act of resizing causes the formula to calculate twice
  • Treat it like an array formula: avoid using whole row or column references whenever possible
  • Should not be used with automation. The formulas will not recalculate while other VBA code is running. Setting Application.EnableEvents to False in your automation code will cause the functions to return an error if you have the short-circuit calculation loops option ticked (see Ribbon section)
  • Should be used with caution on worksheets with Change event handlers
  • No allowance has been made for merged cells
  • Cannot be used with volatile functions (including in precedent cells)
  • If a volatile formula sits on the same worksheet then that formula will calculate twice
  • If you alter the formula then you need to use CTRL+SHIFT+ENTER to complete your changes
  • Destructive: it will overwrite existing data in the worksheet except for pivot tables and tables

rxlEXTEND()

The rxlEXTEND() function takes an array result within a single-cell formula and automatically populates that array in the cells surrounding the formula. This is a very unorthodox approach which breaks Excel’s calculation tree.

Here’s the simple =C1:C10 formula used in the first rxlRESIZE() example, except wrapped with rxlEXTEND() instead:

=rxlEXTEND(C1:C10,-1,1)

SimpleExtend

Scary stuff. The formula in E1 is managing the data in the cells underneath it.

For the second example, instead of using my LISTUNIQUE() function, I’m going to have a go with the LISTDISTINCTS() function which comes with the latest version of FastExcel. In terms of output, it’s similar to my LISTUNIQUE() function except that it returns a list of the different values from the range rather than a list of unique values.

 ListDistinctsExtend

I believe that the LISTDISTINCTS() function only looks at a single column and returns values in a single column so, in this case, to avoid using -1 in MaxRows – which will wipe out every cell under C1 (see syntax section) – a better option would be to use the ROWS() function to work out the maximum number of rows rxlEXTEND() needs to touch:

=rxlEXTEND(LISTDISTINCTS(C1:C10),ROWS(C1:C10),1)

rxlEXTEND() Syntax

=rxlEXTEND(ExtendValues,[MaxRows],[MaxCols])

  • ExtendValues is a Variant and can be a literal, an array of literals or a Range. It can also be a set of functions which return any of these types.
  • [MaxRows] is optional. It is a Long which indicates the maximum number of rows below the formula which can be written to. The default value is 0. If a value less than 0 is used then as many rows as are required will be written to in order to show all the row values in ExtendValues. If a value of 0 or 1 is used then the maximum number of rows will be 1.
  • [MaxCols] is optional. It is a Long which indicates the maximum number of columns the formula can be resized to. The default value is 0. If a value less than 0 is used then the formula will resize to as many columns as is required to show all the column values in ExtendValues. If a value of 0 or 1 is used then the maximum number of columns will be 1.

If MaxRows and MaxCols are both either 0 or 1 then no resizing will occur. I have set this as the default to stop the user from accidentally clearing/overwriting data in the worksheet. The numbers provided in these arguments have a base value of 1 so if you specify 10 and 10 then it will resize to no larger than a 10×10 matrix.

It’s important to understand that there is a fundamental difference between rxlEXTEND() and rxlRESIZE(). rxlEXTEND() has no memory of which cells it has populated around the formula. When it recalculates it uses the MaxRows and MaxCols arguments to decide the size of the range it needs to clear.

WARNING: If you use =rxlEXTEND(array, -1 , -1) then it will clear EVERY cell below and to its right.

 rxlEXTEND() Limitations and Features

  • Excel 2007 or later
  • Windows operating system
  • Function must be outermost one in the formula
  • If worksheet contents are protected then the relevant cells must not be locked
  • Treat it like an array formula: avoid using whole row or column references whenever possible. The existing code can be optimised and is on my to-do list
  • You must manually clean up residue values when reducing the row/column limits. This is because it uses the MaxRows and MaxCols to determine which cells it needs to clear. If you reduce them in the formula then it won’t clear data in the wings of the old range. The same is true if you delete or clear the cell containing the formula.
  • Should not be used with automation. The formulas will not recalculate while other VBA code is running. Setting Application.EnableEvents to False in your automation code will cause the functions to return an error if you have the short-circuit calculation loops option ticked (see Ribbon section)
  • Should be used with caution on worksheets with Change event handlers
  • No allowance has been made for merged cells
  • Cannot be used with volatile functions (including in precedent cells)
  • If a volatile formula sits on the same worksheet then that formula will calculate twice
  • Should only be called from a single cell formula
  • Destructive: it will overwrite existing data in the worksheet except for pivot tables and tables

The Ribbon

The Formulas tab on the ribbon offers three formula extensions options:

  1. Short Circuit Calc Loops is a setting that I recommend you keep ticked. This option means that the rxlRESIZE() and rxlEXTEND() functions will detect if you’ve created an infinite calculation loop and will short-circuit. Calculation loops could be created through using volatile functions, circular references, and so on. This setting requires Application.EnableEvents to be set to True for the functions to work.
  2. Reset Used Range is a setting which means that the rxlRESIZE() and rxlEXTEND() functions will reset the used range on a sheet when they calculate. If you have a large sheet with lots of data then this could really slow the extension functions down so you should generally leave this option unchecked. However, if you have a sheet where you’ve recently cleared a lot of data then Excel might be confused and still think that the sheet has a large used range which will slow the resize functions down: ticking this option will fix that.
  3. Show Recent Errors will show a modeless userform with any recent errors which couldn’t be returned to the worksheet. The extension functions will return errors to the worksheet whenever possible but, because of their nature, sometimes they can’t. If they can’t return an error to the worksheet then they put it on the userform for you to see.

Finally…

Last, but not least, a thank you to my alpha testers:  Rory Archibald, Zack Barresse, Jon Von Der Heyden and Andy Pope. I contacted these guys a while ago and asked them for feedback on some pretty rough-looking, proof of concept code. They were a great help and spent quite a bit of time digging around for problems and giving me suggestions.

Advertisements

About Colin Legg

RAD Developer Microsoft MVP - Excel 2009 - 2014
This entry was posted in Microsoft Excel, Microsoft Office and tagged , , , , . Bookmark the permalink.

15 Responses to Self Extending UDFs – Part 1

  1. Is it a good Idea to let UDF’s do what macros are supposed to do?

    Like

    • Colin Legg says:

      I knew someone would ask about that! :-)

      Your concern about UDFs being able to modify the Excel environment is a valid one.

      The short answer is: generally no, it would be a bad idea but, in this case and with the notes/context I already put in the article, yes, I think it is a good idea.

      I’ll write up a full post on this question on the weekend.

      Like

      • I’d read that. I have a question though. Can you write a procedure to identify all the array formulas and resize them?

        Perhaps something that involves specialcells and hasarrayformula.

        Like

      • Colin Legg says:

        You could, but how would you determine how large the array range should be? It could get ugly.

        I mentioned in passing in my post that there are utilities out there such as technicana’s which will ease the the pain of resizing array formulas. But they don’t have the built-in intelligence or convenience of these functions: the user still has to define the size themselves.

        Like

      • When I use Array Formulas, I expect them to be a certain size, depending on the number of rows in my dataset. And I would just use the Current Region property to determine the size. If the user does not know the size in advance, then this UDF works best. Although, if all the user wants is to not see the error, they could always check for errors and return blanks, instead of letting VBA control EXCEL.

        Also, I am not very thrilled about having to change the range I am referring to within the function because I would have to change it when my dataset changes. You could argue that using a Dynamic Name (using offset and countif would solve the issue, but that is too cumbersome.

        I reckon, pertaining to how I set up my models, I would approach this differently. I would use a CurrentRegion based solution, that loops through all the columns, identifies if there is an array formula; “remembers” it, and extend/reduce the size accordingly.

        Like

      • Colin Legg says:

        When I use Array Formulas, I expect them to be a certain size, depending on the number of rows in my dataset. And I would just use the Current Region property to determine the size.

        If you have a trivial situation where an array formula needs to match the size of the source data, the source data is static and that source data happens to be in a worksheet, then that’s fine. But that isn’t what this blog post is about. Don’t be mislead by the simple examples I gave to demonstrate how these UDFs work: I had to keep them as straightforward as possible so readers would understand the basic functionality.

        If the user does not know the size in advance, then this UDF works best.

        Exactly. In my world I have to deal with database APIs a lot. Frequently these are exposed as compiled worksheet functions where I pass in some parameters and I get an array back. The array can dramatically vary in size depending on the parameters and can be very large and cumbersome. A lot of the time there’s no way of knowing what that size will be: it’s this sort of situation where these functions come into their own. I did express this in my blog post.

        I reckon, pertaining to how I set up my models, I would approach this differently. I would use a CurrentRegion based solution, that loops through all the columns, identifies if there is an array formula; “remembers” it, and extend/reduce the size accordingly.

        That isn’t a viable option in the contexts I’ve described.

        Liked by 1 person

  2. I guess we cant check for errors and return blanks like i mentioned earlier, we can however, do a conditional format to set the font colour to cell fill colour if there is an error.

    Like

    • While that’s certainly a viable workaround, it’s just that – a workaround, and is another added layer of complexity and maintenance. These functions are somewhat specific in their context, but can be extremely valuable if used properly. OTOH, if used improperly, they could be quite confusing as well.

      Overall I think Colin has done a wonderful job of writing some really robust UDF’s here. And the great thing about Excel is there’s always 101 ways to do the same thing, some better, some worse.

      Liked by 1 person

  3. Colin Legg says:

    Hi Zack, there were definitely some potential showstoppers on this one which took quite a bit of work to fix, so it’s great to finally have it on the blog. Thanks again for your input on the alpha testing.

    Like

  4. Pingback: Self Extending UDFs – Part 2 | RAD Excel

  5. Jamie Beard says:

    Colin this is outstanding. I didn’t know that this is possbile from VBA.

    I don’t understand much of ur code though. Please would you explain what IExtension does?

    Like

  6. Pingback: Extracting unique data from Large Datasets - Excel Magic Trick 698

  7. Pingback: Daily Dose of Excel » Blog Archive » Why not let users trigger macros from a UDF?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s