Press "Enter" to skip to content

Category: Power BI

Using Process Monitor With Power Query

Chris Webb has a couple interesting posts on using Process Monitor, a venerable sysinternals tool, to troubleshoot issues with Power Query performance.  First, Chris looks at how often Power Query opens and closes a file for data processing:

Troubleshooting Power Query performance issues in Power BI and Excel can be difficult because it’s a bit of a black box: there’s nothing in the UI to tell you what’s going on inside the Power Query engine and the diagnostic logs are very difficult to interpret. With relational data source like SQL Server you can use tools like SQL Server Profiler to see the queries that are being run by Power Query, and I blogged recently about using Fiddler to troubleshoot OData performance issues; but what about file-based data sources, which often present the most challenges regarding performance?

Process Monitor, a free tool from Microsoft, allows you to monitor file system activity in real-time and even having spent a limited amount of time using it I can already tell that it can provide a lot of information to help identify performance issues with file-based data sources. Take, for example, the scenario I described in my recent post on improving the performance of merge operations. In that post (which I suggest you read before you carry on) I mentioned that it looked as though the Power Query engine was reading data from one of the source files multiple times and Process Monitor confirms that this indeed the case.

Then he shows us just how much data Power Query is reading each time it opens the file:

This post is really just a quick follow-on from my post earlier this week on using Process Monitor to troubleshoot Power Query performance issues with file-based data sources, which I suggest you read before carrying on. I realised, after playing around with Process Monitor some more, that the ReadFile operation actually tells you how much data is being read from a file when a Power Query query is running. For example, here’s a sample of some of the ReadFile operations captured while running the unoptimised version of the query I talked about in my last post:

I get the feeling that really learning a few sysinternal tools can be extremely useful, like how learning Wireshark can make you a better DBA.

Comments closed

Tableau And Power BI: Complementarities

Jen Stirrup argues that there is a good reason to use both Tableau and Power BI in the same environment:

A question I often here is this: Which tool should I use, Tableau or Power BI? The truth is: They are not mutually exclusive.

Tableau is great at business mysteries: ill-defined questions where you have to surf the data for results. Power BI is particularly great at modelling and cleaning the data, with clean, crisp data visualisation and the ability to use custom and open-source data visualizations. This blog isn’t aimed at the technical user, but at the analyst who needs to get information out quickly. I will do another post, aimed at the geeks, another time.

I am all about embracing the power of And.  The downside is, my corporate budget went bankrupt just thinking about the cost of both of them together…

Comments closed

Power BI And Many-To-Many Joins

Teo Lachev notes a big improvement to Power BI’s data modeling story:

The July 2018 preview of Power BI Desktop delivers two killer preview features that solidify the Power BI position as the best data modeling tool on the market. First, Microsoft relaxes the Power BI relationship limitations by letting you create M:M joins between two tables. Second, you can now create a composite (hybrid) data model with different storage modes, such as from an SQL Server database configured for DirectQuery and from an imported text file. Without reiterating the documentation, here are some important notes to keep in mind based on my pre-release testing.

But read the whole thing, which includes some limitations around hybrid data models.

Comments closed

Parsing Power BI Logs With Power BI

Kellyn Pot’vin-Gorman uses Power BI to analyze Power BI logs for issues:

Do I really want to go through all this data by hand?  BI is a reporting tool, so what if I bring them into Power BI?  Let’s start with the first MS Mashup Container log-

In Power BI, click on Get Data –> Text and change the file type to “All Files” in the explorer and go to the directory that contains the trace files:

C:\Users\<user>\AppData\Local\Microsoft\Power BI Desktop\Traces\Performance

Remember that you will need to have “hidden items” set to be displayed to browse down to this folder.  Choose the files you wish to load in the directory and Power BI and choose a Customer delimiter of a quotes, (“) to separate the file.  This will load a file that will have a few columns you’ll need to remove that contain data like colons, nulls and other syntax from the file

I like it.

Comments closed

Power BI Log Files

Kellyn Pot’vin-Gorman shows us how to access Power BI log and trace files:

We’ve now identified the process, the amount of memory allocated to perform a task captured in the log, start time and the duration.  The information in these log files can assist when diagnosing if Power BI desktop crashes, but the data collected is quite rudimentary.

If you shut down Power BI Desktop, the PBIDesktop* log file writes to the startup file, which was once empty and it then empties and saves off the timestamp of the exit of the program.

The Microsoft Mashup file has much of the same information, but includes deeper level processing work by Power BI, such as work done in the Query Editor or when we create a measure or new column/table.

In the three examples from the file below, you can see a compile, a save and then an evaluate task.

There’s some useful information here for debugging.

Comments closed

Power BI Helper 3.0

Reza Rad has a new version of Power BI Helper out:

It is a pleasure to announce the newest version of Power BI helper, version 3.0 July 2018 with the great feature of exporting model documentation. The documentation part of the insight from Power BI Helper has been always in our backlog, but haven’t had a chance to work on it. Gladly now you can export the document to an HTML file. The exported documentation at the moment, has information about all tables in the model, all measures, all columns and calculated columns in each table with possible expressions and descriptions. If you like to learn more about Power BI Helper, read this page.

Read on to see what Power BI Helper has in terms of documentation.

Comments closed

Relationships In Power BI

Teo Lachev shows us the importance of defining relationships in Power BI:

However, If there isn’t a direct relationship between ResellerSales and Employee, the moment you add an unsummarized field from the second table on the many side, such as Employee[FullName] after adding SalesTerritoryCountry and ResellerSales[SalesOrderNumber), you’ll get the error “Error: Can’t determine relationships between the fields”.

Solution: Interestingly, the report works fine if a summarized field, such as COUNT(Employee[EmployeeKey]) is used. In this case, the SalesTerritory dimension acts as a conformed dimension joined to two fact tables. The reason why it doesn’t work when Employee[FullName] is added is because there is no aggregation on the Employee table and the relationship between ResellerSales[SalesOrderNumber] and Employee[FullName] becomes Many:Many over SalesTerritory which is now a bridge table. One employee may be associated with multiple sales and a sale can be associated with multiple employees. How do we solve this horrible problem?

Good data modeling is important, and Power BI dashboards are no exception to the rule.

Comments closed

Calculating Future Revenue Streams With Phasing

Matt Allington walks us through a somewhat complicated scenario with income stream projections but inconsistent monthly payout rates:

The first thing I decided to do was work out which was the first month of the policy.  I used LOOKUPVALUE to do this because the Calendar table was not connected to the policy table.  Instead of returning the actual month name, I returned the unique MonthID from my calendar table.  I ALWAYS load a unique MonthID in my calendar tables as they are REALLY useful. In short this column is an integer, starting at 1 and incrementing by 1 for each month in the table.  So starting in Jan, after 3 full years the MonthID will be 36.

Pol First Month = LOOKUPVALUE (
     'Calendar'[Month Index],
     'Calendar'[Date],
     MAX ( Policies[Pol Date] )
)

After writing this measure (and all subsequent measures), I placed the measure in the pivot table shown below.  This pivot table has the policy date from the policy table on rows and the YYMM from the Calendar table on columns.  I did this for a very important reason. I knew this would be the way I needed to see the final results and therefore I needed to make sure all measures returned the expected outcome once this layout was used.  And don’t think I wrote the all the formulas correctly to achieve this outcome the first time.  In fact I made plenty of mistakes.  But the point is I wouldn’t have known they were mistakes if I didn’t place the result in a pivot table like the one below.

It’s a good example of breaking apart a fairly complicated concept and rethinking equations as set operations in DAX.

Comments closed

Sorting When Your Measure Is Not In The Visual

Kasper de Jonge shows us different ways of sorting a visual by some unrelated measure:

So lets start with the simple one, I want to sort a chart on a measure not part of the visual. Let’s take this visual:

Now instead of sorting by OrderQuantity I want to sort by the ListPrice. The trick here is to make the measure part of the query, and one way you can do that is by adding it to the tooltip

Read on for examples for charts as well as matrices.

Comments closed

Using DAX With Reporting Services

David Stelfox gives us an example of using DAX when connecting SQL Server Reporting Services to a SQL Server Analysis Services Tabular model:

Tools like Power BI have changed reporting allowing power users to leverage tabular cubes to present information quicker and without the (perceived) need for developers. However, experience tells us many users still want data in tables with a myriad of formatting and display rules. Power BI is not quite there yet in terms of providing all this functionality in the same way that SSRS is. For me, SSRS’s great value and, at the same time its curse, is the sheer amount of customisation a developer can do. I have found that almost anything a business user demands in terms of formatting and display is possible.

But you have invested your time and money in a tabular SSAS model which plays nicely with Power BI but your users want SSRS reports so how to get to your data – using DAX, of course. Using EVALUATE, SUMMARIZECOLUMNS and SELECTCOLUMNS you can return data from a tabular model in a tabular format ready to be read as a dataset in SSRS.

It’s a good post and a good example.  The only quibble I have is in the motivating paragraph; Power BI and SQL Server Reporting Services have different end goals—Power BI isn’t (and I think never will be) a pixel-perfect report building product; it’s meant to be a dashboarding technology.  That quibble aside, the example is well worth checking out.

Comments closed