Using Process Monitor With Power Query

Chris Webb has a couple interesting posts on using Process Monitor, a venerable sysinternals tool, to troubleshoot issues with Power Query performance.  First, Chris looks at how often Power Query opens and closes a file for data processing:

Troubleshooting Power Query performance issues in Power BI and Excel can be difficult because it’s a bit of a black box: there’s nothing in the UI to tell you what’s going on inside the Power Query engine and the diagnostic logs are very difficult to interpret. With relational data source like SQL Server you can use tools like SQL Server Profiler to see the queries that are being run by Power Query, and I blogged recently about using Fiddler to troubleshoot OData performance issues; but what about file-based data sources, which often present the most challenges regarding performance?

Process Monitor, a free tool from Microsoft, allows you to monitor file system activity in real-time and even having spent a limited amount of time using it I can already tell that it can provide a lot of information to help identify performance issues with file-based data sources. Take, for example, the scenario I described in my recent post on improving the performance of merge operations. In that post (which I suggest you read before you carry on) I mentioned that it looked as though the Power Query engine was reading data from one of the source files multiple times and Process Monitor confirms that this indeed the case.

Then he shows us just how much data Power Query is reading each time it opens the file:

This post is really just a quick follow-on from my post earlier this week on using Process Monitor to troubleshoot Power Query performance issues with file-based data sources, which I suggest you read before carrying on. I realised, after playing around with Process Monitor some more, that the ReadFile operation actually tells you how much data is being read from a file when a Power Query query is running. For example, here’s a sample of some of the ReadFile operations captured while running the unoptimised version of the query I talked about in my last post:

I get the feeling that really learning a few sysinternal tools can be extremely useful, like how learning Wireshark can make you a better DBA.

Related Posts

Linear Regression With Python In Power BI

Emanuele Meazzo builds a linear regression in Power BI using a Python visual: As a prerequisite, of course, you’ll need to have python installed in your machine, I recommend having an external IDE like Visual Studio Code to write your Python code as the PowerBI window offers zero assistance to coding. You can follow this article in […]

Read More

Replacing Bidirectional Filters with Visual Filters

Alberto Ferrari shows how you can replace bidirectional filters with visual filters in Power BI and improve visual quality: The noticeable thing about the behavior of the slicer is that the two matrices are showing only the brands and colors purchased by Amanda. Yet, the Color slicer is still showing all the colors, even though […]

Read More

Categories

July 2018
MTWTFSS
« Jun Aug »
 1
2345678
9101112131415
16171819202122
23242526272829
3031