The power of the Query Editor in Power BI is more than that it allows you to load data from a variety of sources, transform that data to meet your analysis needs, create new columns of calculated results and define special views of the data. One of the things that I am impressed with is that it will remember all of the steps you put into getting your data ready for use and it can rerun all those steps the next time you reload raw data from your data source. So before we roll into a new year and begin looking at Power BI Desktop’s big brother, PowerBI.com, let’s take a look at the Power BI Desktop Query Editor.
In the past, when I first opened Power BI, I would immediately select the Get Data button from the Home ribbon. What if instead I immediately clicked on the Edit Queries button in the External Data group.
The Query Editor opens its own window with 4 tabs: Home, Transform, Add Column, and View. Looking at the Home ribbon, you should note that many of the most commonly used actions appear on this menu as well as on the other ribbons. At this time, there does not appear to be a way to customize the actions that appear on this and the other ribbon, but as with many things Power BI, that may come in time.
The first think I need to do is to open a data set to work with. When I click on Get Data, I get the same drop down of data source options that I would see from the Home ribbon of the Power BI Desktop screen.
In this case, I am going to load my standard Contoso dataset including the FactSales table with related product and date data. I will reference my local host SQL Server. Of course, you may have to reference a network SQL server. I generally do not enter the database name on the first dialog, but select the database from the second screen. Note that on the second screen, I can just click the down arrow to the left of the database name to open the database and display a list of the tables within it. This is in my mind a lot easier than spelling the name of the database correctly before selecting the tables I want to work with.
Because I am using a SQL Server database, I have the option of either importing the data or using DirectQuery which does not download a copy of the data to my local data model, but rather reads the data directly from the SQL Server instance.
Because I selected 5 tables from the database, the Query Editor list 5 queries along the left side of the screen, a preview of the data in the center and the query properties and any steps applied to the data on the right side of the screen.
I can click on any of the five queries to switch the one that I want to work on. In the above image, I elected the DimDate query. Now suppose that I want to make some changes to this data. Perhaps the first thing I want to do is to eliminate all the fiscal calendar fields from the table. One way to do this is the select the columns I want to remove. I can click on the first column and then while pressing the CTRL key click on each of the other columns I want to remove. I can also click on the first column I want to move and if the columns I want to remove are sequential, I can move to the last column I want to remove while press the SHIFT key and click on the column header of that last column.
Still on the home ribbon of the Query Editor, I can click on the Remove Columns button to get rid of the columns I do not want to keep in my final model. I notice that as I do this, another step appears in the Applied Steps section in the right column.
Next I want to add a new column to the date table that I will use to sort the months chronologically rather than alphabetically. That column will simply consist of the month number (January =1, February = 2, etc.). I can do this by clicking on the Add Column tab and selecting the Add Custom Column from the General group. Note the option Add Index Column. This option allows you to add a sequential number in a new column for each row that either begins with a ‘0’ or a ‘1’. You can also add a custom index that begins with any number and has a custom increment which may be different from ‘1’. This could be an interesting way to add a surrogate key to a table when you have to merge data from two difference sources (maybe domestic and international) into a single table for reporting purposes and you still need create a ‘new’ index because the domestic index and the international index may have overlapping values.
When I choose to add a custom column, the Query Editor displays a dialog to let me define the name for column as well as the formula. All of that is well and good, but the Query Editor provides an easier solution. Instead of clicking the Add Custom column, note that the Add Column ribbon has sections for working with text, numbers and dates. In this case, all I need do is to select the column [DateKey] and then click on the lower portion of the Date button to display the menu of options. If I select Month, I get a second set of options of different calculations related to month. The first option actually creates the new column and calculates the month number while the other options return the first day of the month and the last day of the month as date or the number of days in the month as an integer. That is pretty cool too.
Now the default name for the column is just Month. However, I can easily change that by right clicking on the column header and selecting Rename from the dropdown menu and entering the column name I want.
As I said before, each of these steps have been recorded in the Applied Steps section to the right of the screen. If I make a mistake or want to try something different, all I need do is click the ‘X‘ to the left of the step name to remove that step and then try something else. Remember that I am only working with a preview of the data at this point. I have not downloaded the data yet.
I could proceed with other data changes for this query as well as the other four queries, but I hope you get the idea that you can perform any transformation, create new columns, etc. that you may want. I will come back in the future to discuss some of the other features of the Query Editor that could save you time.
Let’s assume that I am done making changes to my data. It is now time to load the data from SQL Server into my model. From the Home ribbon of the Query Editor, I click the button: Close & Apply.
This will close the Query Editor window and begin loading the data from SQL Server into my local in-memory data model using the transformations in the applied steps of each query. This is the secret to how to minimize the data you bring into your model from a data source. Any excess columns will be removed. You can even get rid of specific rows (like we did in PowerPivot to eliminate some of the product categories when we loaded data many weeks ago). You can also create new columns, rename new or existing columns, and basic structure the data they way you want to use it. All of these transformations will happen as the data is loaded into your local data model. Then each time you refresh your data, these same query steps are performed consistently with each refresh.
Well, it is the beginning of a new year, at least a new calendar year. I think it is time that we move on to the big brother of PowerBI, PowerBI.com next time so that we can show how and when each tool should be used and how they can work together.
C’ya next time.