Trapping Errors in DAX

Several weeks ago, I talked about finding errors in expressions caused by the way DAX uses the operators in an expression to automatically convert values to either strings or numbers. I also looked at using the BLANK() function to find and account for missing values in data. But then I got sidetracked preparing for the SQL PASS Summit. So now I want to finish this group of entries by examining how to trap errors and then either correct them or, at the very least, provide a value that will not cause additional calculation problems.

The first function I want to look at is the IFERROR() function. This function has only two parameters. The first parameter is the expression to test. If the expression evaluates without an error, IFERROR() returns that calculated value. However, if the expression results in an error, the function returns the value of the second parameter instead. How does this work? Suppose you have a simple expression such as the following for a calculated column named DiscountAmount representing the discount amount of the item:

= PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent]

In most cases, this expression evaluates correctly. Even if the DiscountPercent or ListPrice values were left blank, we learned when discussing blank data that DAX assumes a value of '0', which in this case could still be considered correct. However, what if one of the values in the DiscountPercent column was not numeric and not a blank? What if the value was a string, such as a DiscountPercent of 'NA'? Then DAX would register an error for the calculated column and not display any of the calculated values. Resolving this problem by visually examining the dataset may work when you have a small dataset with only a few records, because you can scroll through the data and quickly find the string value that caused the problem. However, if you have millions of data rows, finding the cause of the error becomes considerably more difficult and time consuming. With DAX, we can easily solve this problem by using the IFERROR() function in the expression as shown below:
= IFERROR( PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent], 0)

In this case, the IFERROR() function tests the value of the expression, and if an error occurs because one or both values in the expression cannot be converted to numeric values, it traps the error and returns the value of the second parameter rather than the result of the expression in the first parameter. On the other hand, returning a value of ‘0’ may not be what you want. After all, a value of ‘0’ implies that no discount is applied. However, in this case, we really don’t know why the discount amount cannot be calculated, only that it results in an error. Therefore, you may want to return a blank value instead, as in the following expression.

= IFERROR( PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent], BLANK() )
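
Which fallback value you choose can matter more than it first appears, because DAX aggregate functions treat blanks and zeros differently. As a quick sketch (the measure name AverageDiscount is my own invention, not something from earlier in this series), AVERAGE() skips blank rows entirely but includes zeros in its count, so error rows returned as ‘0’ would drag the average down while rows returned as BLANK() leave it untouched:

// Hypothetical measure; AVERAGE() ignores BLANK() rows but counts 0 values
AverageDiscount := AVERAGE( PURCHASEORDER[DiscountAmount] )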

Using the IFERROR() function works best when the expression being tested is also the expression you want to return if no error occurs. But what if the expression you want to test is different from the expression you want to return when the test passes? In that case, you want to use the ISERROR() function to test for an error. The following expression tests for an error when multiplying ListPrice by DiscountPercent in the PURCHASEORDER table.

= ISERROR( PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent] )

By itself, this expression just returns a value of TRUE or FALSE. To use this in an expression that returns something other than just the product of these two values, I will use the IF() function to define what value to return based on the Boolean result of the ISERROR() function. If ISERROR() returns TRUE, indicating an error, the IF() function returns the value of the second parameter. Otherwise, it returns the value of the third parameter.

= IF(ISERROR( PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent] ), 0, PURCHASEORDER[ListPrice] * (1 - PURCHASEORDER[DiscountPercent]) * PURCHASEORDER[Quantity] )
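
You can also put ISERROR() to work as a diagnostic rather than just a guard. Going back to the problem of finding a bad value among millions of rows, a measure along the following lines (a sketch; the name BadDiscountRows is my own) counts how many rows fail the calculation, and the same ISERROR() test in a calculated column would let you filter the table down to just the offending rows:

// Hypothetical measure; FILTER() keeps only the rows where the
// multiplication raises an error, and COUNTROWS() counts them
BadDiscountRows := COUNTROWS(
    FILTER( PURCHASEORDER,
        ISERROR( PURCHASEORDER[ListPrice] * PURCHASEORDER[DiscountPercent] ) ) )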

Testing your expressions may seem much more complex than just entering the expressions you want to calculate. You may also think that performing these tests against every row of a large table must degrade performance, and you would be correct. In practical use, I do not include a lot of test functions such as those described here unless I encounter a problem with a calculated column or suspect that the data in the column may not be clean. If you cannot trust your data sources, then you really don’t have much choice but to test. An alternate approach to adding IFERROR(), ISERROR(), or IF() functions to your calculations is to perform data cleansing first using the DQS tool that is now part of SQL Server. Of course, setting up a DQS knowledge base and then using it to find and correct data issues can also be time consuming. You may even want to tighten the validation rules in the application that captures the data, trapping errors as they are entered rather than trying to identify and fix them later.
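
If you would rather not pay the cost of error testing in every downstream calculation, another option is to cleanse the suspect column once inside the model. The sketch below assumes that DiscountPercent arrives as text (which is what allows a value like 'NA' in the first place) and invents a new calculated column, which you might call CleanDiscountPercent; other calculations could then reference that column without their own error tests:

// Hypothetical cleansing column; VALUE() converts numeric text to a number
// and raises an error for a string like 'NA', which IFERROR() turns into a blank
= IFERROR( VALUE( PURCHASEORDER[DiscountPercent] ), BLANK() )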

C’ya next time.

What Have I Been Up To?

Last week I was at the PASS Summit in Charlotte, North Carolina. It was my second time attending the summit and also my second time speaking at it. I think they said that this was something like the 15th year for the summit. There is nothing like getting together with a couple thousand other professionals who have professional and/or personal interests in the same things you do. It is an opportunity to talk with colleagues that you may only communicate with through email or Twitter during the rest of the year. It is also a chance to reconnect with existing friends and to make new friends who share the same interests as you.

This year, the number of sessions dedicated to business intelligence and the Microsoft BI stack has grown. Part of that is because the BI stack itself has expanded, with everything from the tabular model in SSAS to augment the multidimensional model, to Data Quality Services (DQS), Master Data Services (MDS), Power Pivot, Power View, Power Map, Power Query, and Power Q&A. (If you would like to try some of the latest members of the BI stack, go to: http://office.microsoft.com/en-us/excel/power-bi-FX104080667.aspx ). Of course, PASS also supports a separate BI conference that is scheduled for the spring. The growth in business intelligence content was especially gratifying to me because of my personal interest in the entire stack of tools since the introduction of PowerPivot several years ago.

Personally, I’m looking forward to one more SQL Saturday this year, the Tampa SQL Saturday, BI Edition (http://www.sqlsaturday.com/248/eventhome.aspx ), which happens on November 9th. I will also be speaking there, so if any of you have an interest in BI topics, come out to Tampa in a little over two weeks and say, ‘Hi!’. The BI edition of SQL Saturday was founded in Tampa several years ago by Jose Chinchilla, and I’m glad to have been a part of each subsequent annual event.

Other upcoming events include the monthly meetings of both Orlando SQL groups. The south group, MagicPASS (http://magicpass.sqlpass.org/ although I don’t think the web site has been updated), will be meeting in Celebration, FL on Wednesday, October 23rd, and the north group, OrlandoPASS (http://orlando.sqlpass.org/ ), will be meeting in Sanford, FL on Thursday, October 24th. If you are more into SharePoint, there is a meeting of the Tampa SharePoint group on Tuesday, October 22nd from 6:00 PM to 8:30 PM at the Microsoft Office, Tampa (https://www.eventbrite.com/event/8853808981/). As you can see, it is a busy week.

The other news that you may have already heard is that CTP2 of SQL Server 2014 was released last week during the PASS Summit. I have not looked into 2014 yet, so I don’t have a recommendation one way or the other. However, if you have processes that would benefit from improved performance, the new Hekaton in-memory engine in SQL 2014 may be able to provide some performance improvement with very little effort (read that as no program changes). During the Summit, the Thursday keynote was presented by David DeWitt, who tried to explain the technology behind how Hekaton achieves its performance gains. Most people left the keynote with their heads hurting from trying to understand the magic behind the technology. However, it seems the magic can be summarized as a combination of eliminating latches and locks, which reduces contention when accessing data, along with efficiencies in the new column store method that allow more data to be read into memory, reducing disk access for many operations. Of course, that may be too simple and is probably incomplete. After all, my head started to hurt as well.

Well, that’s it for this week.  I guess it was more of a summary of what I’ve been up to and why I did not have much time to post any blogs for the last two weeks.  I’m sure my text here does not even begin to do justice to the excitement of the summit,  so perhaps it would be better if I just left you to go out to the summit site and watch some of the interviews and keynote talks here: http://www.sqlpass.org/summit/2013/PASStv.aspx.

C’ya next time.