Statistics: PowerPivot Style, Part 4: The RANKX Function


Last time I showed you how to sort the students within a grade by their total scores.  I showed that you can sort by any column displayed in the pivot table using the Sort dialog of the Student Name row labels.  This method requires that I include the Sum of StudentScore measure in the pivot table.  I cannot sort by a column not included in the pivot table.  I also explained that you could use the new Sort by Column feature to define the sort order.  The advantage of this method is that you do not have to include the sort by column in the final pivot table.  The problem with this method is the inability to define whether the sort is ascending or descending (or at least I have not yet found it.  If you have please add it to the comments section.)  There is one other way that I’ll cover today to sort the students by their scores.  That is by using the RANKX function.  But first, here again as a reference is the list of students and their total student score calculated last time.

 

The first parameter must either be a table or be a DAX function that returns a table.  In this example, since I want to calculate the student’s rank across all classes, I choose to use the expression ALL(Students).  This expression returns a table that represents all of the students in the school, not the student represented by the specific row where I want to do the calculation.

You might question why I use a function that returns all students in the school and not a function that would return all students in a class.  First, you have to think about the fact that the order of students in any grade when ranked by all students in the school will be the same order of students if the table only included students within a specific grade as long as all of the classes in the grade have the same total possible point score.  The advantage of using the ALL(Students) expression means that I can use this expression in a pivot table that uses slicers to show different grade levels without affecting the calculation expression.  Otherwise, I would need separate student rank expressions for each grade level.

Suppose you have the following grades for only 6 students and then rank the student scores across all students.

Student

Total Score

Grade

Rank

Student 01 632 Grade 1 2
Student 02 583 Grade 1 5
Student 03 599 Grade 2 4
Student 04 654 Grade 2 1
Student 05 610 Grade 1 3
Student 06 512 Grade 2 6

 

Now if you only wanted to look at the second grade students, you could filter on this table to get:

Student

Total Score

Grade

Rank

Student 03 599 Grade 2 4
Student 04 654 Grade 2 1
Student 06 512 Grade 2 6

 

As you can see, the order of students defined from the highest score to the lowest for second graders is still Student 04, Student 03, and finally Student 06.  This is true even though there is no second grade student with a rank of 2, 3, or 5.  Therefore, ranking all of the students and then filtering by grade gives us the same student order as filtering by grade and then ranking the selected students.

The RANKX function lets you define a DAX expression that it can evaluate to a scalar value for each row in the designated table.  The thing that is special about this is that the expression can reference data in a related table.  In fact, I will use this capability in a moment, but first, let me explore the other features of the RANKX function.

The third parameter lets you supply a value and determine the rank of that value based on the entire list of values.    I’ll come back to this later after I calculate the student rank to show how this works.

The fourth parameter let you define the direction of the sort order.  By default, if this value is not entered or if a value of zero (0) is provided as the fourth parameter, the scores will be ranked in descending order.  In other words, the highest scores will be at the top which is what I saw in the Sort by Column feature.  While this may be exactly what I want here, I could also provide a value of one (1) for this parameter to sort the scores in ascending order with the lowest score at the top.

The fifth parameter lets me decide how to deal with ties.  By default, all rows with the same value will have the same rank.  Let’s suppose I had 1 student with the total score of 635 giving them a rank of 1 and 4 students with the total score of 632.  Each of these four students would have the rank 2.  However, the next student with a score of 631 would have the rank 6.  The values 3, 4, and 5 are skipped because these three students are tied with the student in the rank 2 and there is no way to position any one of these students ahead or any other.  You can also supply a value of ‘Skip’ for this parameter.

On the other hand, if we specify a value of ‘Dense’, the student with a score of 635 is still ranked ‘1’, each of the next 4 students with a score of 632 are ranked ‘2’, but the next student with a score of 631 is ranked ‘3’.  There is no skipping of ranked values.

Finally, parameters 2 through 5 can be repeated to defined a hierarchy of ranking students that can be used to break ties by using other fields (expressions).

So returning to my example, want to use the following expression to rank my students in descending order by their total score.

=RANKX(ALL(Students),SUMX(RELATEDTABLE(Assignments3),Assignments3[AssignmentScore]))

This gives me the result in the following figure.

To rank my students, I can now select the StudentName column and click on the Sort by Column button in the Sort and Filter group of the Home tab to define my StudentRank column to determine the sort order of the student names.  Moving to the Excel pivot table, I can display just the student names without any other numeric data and the student names will appear in descending sorted order by their total scores.  Furthermore, by adding a grade slicer, I can see the students ranked by grade as shown in this figure.

 

As a final word on this topic, suppose I wanted to find out what rank a student would be if they had a score of 584.  I could created a calculated value in the calculation area with this expression:

TestRank:=RANKX(ALL(Students), SUMX(RELATEDTABLE(Assignments3), 
             Assignments3[AssignmentScore]),584)

In the following figure, you can see that I have a StudentScore value of 585 with a rank of 53 in a descending sort.  Since a score of 584 is lower than 585, I would expect the rank to be after 53 and as you can see in the figure, the calculated value of 54 shows this to be true.

Well, I suppose that is enough for today.  C’ya next time as I start to take a look at some of the date functions in DAX.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s