No matter whether you are working with PowerPivot or the Tabular model of SQL Server Analysis Services, you should be familiar by now with creating both calculated columns and calculated measures. At least that is true if you have reading my blog for the past several months. You may be confused as to when to use a calculated column versus a calculated measure. In some cases, they can yield the same results. However, in other cases, the difference between the two can create very different results. In fact, in some cases, no obvious error occurs, but the resulting values are very, very wrong.
For this case, I am going to use a very simple set of data rather than the full Contoso data set in order to make the calculations easier to understand and why some choices between calculated columns and calculated measures can critically affect your results. Notice in the table below that there are only 8 records, 4 records for each of two years. Each of the four records in each year represent the sales of a fictitious company through 4 channels (Catalog, Online, Reseller, and Stores).
For the first calculation, I will calculate the profit of each channel in each of the years (row context) using the formula:
=[TotalSales] – [TotalCost]
(Notice in this case I omitted the table reference since there is only a single table in my model.)
As you can easily verify, the total profit for stores in 2011 is $400,000 which is the result of subtracting $1,600,000 from a total sales of $2,000,000. Similarly, you can validate the other calculated rows. The point is that as a calculated column, Power Pivot performs the calculation once for each row of the table in which I define the calculation. The resulting values become a part of the model and are only calculated once when the column is created and then stored in the model. However, as you can imagine, a table with millions of rows of data could take a while to calculate. The following figure shows the calculated TotalProfits column for my 8 row data table. Note that in 2012, the total profit for store sales was also $400,000 but this profit was on a larger total sales amount meaning that the profit margin was less.
Now suppose I created a calculated measure instead of a calculated column as shown in the following figure. In this case, the formula for the calculated measure MyTotalProfit is:
=SUM(Sales[TotalSales]) – SUM(Sales[TotalCost])
The main difference between a calculated measure and a calculated column is that the actual calculation takes place when the calculated measure is added to the pivot table not before. This is because the context of the calculation is based on the filters in the current table. Remember that each row, column, slicer, or report filter is a filter for each cell displaying the calculated measure.
While our data is relatively small, this means that our calculation of MyTotal Profit occurs 5 times. One time for each of the sales channels in the final pivot table, and one time for the grand total. However, you can quickly see that no matter whether we sum the calculated column for TotalProfit over the years for each sales channel or use the calculated measure, we get the same result. The reason is because of the following mathematical fact:
(A-D) + (B-E) + (C-F) = (A+B+C) – (D+E+F)
In this expression, the left side of the equation represents the sum of the calculation column for TotalProfit while the right side of the equation represents the calculated measure which subtracts the sum of costs from the sum of sales.
So in this case, the two methods, calculated column and calculated measure, yield the same result. Which method results in the better performance would depend on the total number of rows in the table and the filters applied to the final pivot table.
Now suppose instead of looking at total profit as a number, we want to look at percent margin. Afterall, a large dollar profit on an item that sells extremely well may not return as much profit per dollar sold as an item that sells less, but at a high profit per sale. Let’s first see what happens when we calculate the calculated column PercentMargin using the formula:
= (Sales[TotalSales] – Sales[TotalCost]) / Sales[TotalSales]
As you can see in the following figure, the percent margin (displayed as a decimal value here) can readily be validated using the TotalSales and TotalCost columns. For example, in 2011, sales at stores resulted in a 20% margin while sales at stores in 2012 resulted in only a 18.1818% margin on each dollar sold.
We could also calculate a calculated measure called MyPercentMargin using a similar formula, but summing total sales and total costs first before calculating the margin. The expression in this case is:
(SUM(Sales[TotalSales]) – SUM(Sales[TotalCost])) / SUM(Sales[TotalSales])
As you can see in the following figure, this formula results in a total profit margin of about 22.2589% which looks reasonable considering that the individual channel profit margins for each of the two years ranges from 18.1818% to 28.5714%.
Adding the calculated column profit margin and the calculated measure profit margin to a simple pivot table shows the results in the following table.
The first thing to note is that in this example the values in the last two columns are different. Simply looking at the percent values may not in this case tell you which column is correct. However, your first hint is that the Grand Total for the Sum of PercentMargin column is much larger than any of the individual columns. This is not mathematically possible. The overall percent margin for all sales must be between the lowest and the highest percent margin of any of the detail rows. This makes the Sum of PercentMargin column suspect. On the other hand, the grand total for the MyPercenMargin column does fall between the range created by the lowest and highest percent margin of the individual rows. With the help of a calculator, you can quickly see that the correct value for the percent margin of store sales is:
((2,000,000 + 2,200,000) – (1,600,000 + 1,800,000)) / (2,000,000 + 2,200,000)
This verifies that the calculation in the MyPercentMargin column is probably correct.
Why the difference? The key is that I introduced division into the calculation. As long as I was just adding and subtracting values for the calculated column, I could use the distributed property to create a measure that performed the same task. However, by introducing division to calculate margins, I cannot simply sum the individual profit margin percentages of each channel for each year. Rather I must calculate the total sales and total costs first and perform the margin calculation on these totals.
So what is the general rule? If your calculated column formula merely adds or subtracts values, you can typically use either a calculated column or calculated measure. However, as soon as you introduce multiplication or division, you might think that you must use a calculated measure in order to get the correct result as shown in this example. Well, that is not quite true either. Next time, I’ll show an example in which the calculated column provides the correct result and the calculated measure does not.
C’ya next time.