Press "Enter" to skip to content

Category: Power BI

Choosing an Algorithm for Table.Join in Power Query

Chris Webb continues a series on optimizing merge performance in Power Query:

The first thing to say is that if you don’t specify a join algorithm in the sixth parameter of Table.Join (it’s an optional parameter), Power Query will try to decide which algorithm to use based on some undocumented heuristics. The same thing also happens if you use JoinAlgorithm.Dynamic in the sixth parameter of Table.Join, or if you use the Table.NestedJoin function instead, which doesn’t allow you to explicitly specify an algorithm.

There are going to be some cases where you can get better performance by explicitly specifying a join algorithm instead of relying on JoinAlgorithm.Dynamic but you’ll have to do some thorough testing to prove it. From what I’ve seen there are lots of cases where explicitly setting the algorithm will result in worse performance, although there are enough cases where doing so results in better performance to make all that testing worthwhile.

That behavior is the same as if you decided to start writing INNER LOOP JOIN or INNER HASH JOIN for your queries. In the right spot, you may have knowledge the optimizer doesn’t have and can make performance faster, but a lot of the time the approach will be a bit too heavy-handed and end up a net degradation of performance.

Comments closed

Analysis versus Reporting and Power BI

Rob Collie thinks about industry movement between analysis and reporting. Part one gives us some backstory:

Excel was about to make a large investment in BI-related capabilities, and the powers that be had selected me to lead our part in it. I was excited, but now I needed a crash course in “what the hell is BI?” I was given multiple tutors, and they all were quick to introduce the concept of Analysis versus Reporting. The “versus” seemed to be pretty important. It wasn’t an “and” – no, the “versus” was chosen deliberately in these sermons. You see, these were Two Very Different Things.

I struggled mightily to grasp this difference. I was told that interactive things like PivotTables were Analysis tools – NOT Reporting tools! Reports were something completely different. “But,” I pointed out, “they’re called ‘Insert PivotTable Report’ on the Excel menu today!” (This was Excel 2003). “Yeah,” said the mentors, “…we might want to fix that.”

Part two explains why analysis and reporting are both important:

Another “meta characteristic” of paginated reports is that they TEND to display details rather than aggregations. EX: specific transactions rather than emergent trends. In paginated reports, you’re MORE likely (but not guaranteed!) to be looking at “raw” rows of data from the original database, whereas in a Power BI report, you’re more likely (but again, not guaranteed!) to NOT be seeing raw individual rows, but rather intelligent aggregations of MANY rows. But either way, more detail means you’re more likely to need multiple pages.

Rob’s right on the money. And I’m looking forward to part three of the series.

Comments closed

Incremental Refresh of Any Power BI Data Source

Gilbert Quevauvilliers wants incrementally to refresh all the sources:

The pattern that I am talking about is the following below, which will be used as my example below.

– Connect to a data source which can query fold.
> In my example I have installed and configured an Azure SQL Serverless DB
> In this database I have a date table.
– Configure date table to use Incremental refreshing as per the blog post
> Incremental refresh in Power BI
– Create a function which will then use the Date value as part of the parameter
> In my example I have got CSV files which have Exchange rate information from Azure Blob Storage.
> The file names of the CSV files is the date.
– Invoke the Function within the date table to extract the required information.

I know what you might be thinking, that as soon as I add in the column with the function it breaks the Query Folding. That is what I thought too.

The great news is that Incremental refreshing DOES STILL WORK!

Read on for the demonstration.

Comments closed

Creating Data-Driven Power BI Report Subscriptions

John White shows how to create a data-driven subscription for a Power BI report:

One of the features that has never made the leap from SQL Server Reporting Services (SSRS) on-premises to the cloud is data-driven subscriptions. Users can subscribe to reports, but a data-driven subscription allows individual subscriptions to be stored in a central location and parameterized, while delivering the reports to multiple locations. This article will describe a pattern for accomplishing this using SharePoint lists as the subscription store, and Power Automate as the automation tool, for a no-code solution to this requirement.

The other alternative would be to use Power BI Report Server, but if you’re not using that, this is an interesting approach and solution.

Comments closed

A Power BI FAQ

James Serra answers questions about Power BI:

Should we have dev, test, and prod workspaces?

Yes! You should use change management to move reports through the dev/test/prod workspace tiers via the new deployment pipelines in Power BI. Use the workspaces to collaborate on Power BI content with your colleagues, while distributing the report to a larger audience by publishing an app. You should also promote and certify your datasets. The reports and datasets should have repeatable test criteria.

Read on for the full set of questions and answers.

Comments closed

Improving Power BI Performance

Dan Szepesi continues a series on Power BI performance tuning:

As an example, I am going to go through in detail how to use the results from the Performance Analyzer to understand the performance of your visuals.  I downloaded the sample PBIX from the Power BI Documentation at Microsoft.com – https://docs.microsoft.com/en-us/power-bi/create-reports/sample-datasets and I will use the visuals from the Net Sales report in the screenshots that follow.

I am going to walk through how I would approach looking at the performance of this visuals on this report and show what we can learn from the data that the Performance Analyzer gives me.

Click through for that example as well as several helpful tips.

Comments closed

Returning Multiple Values in Power BI with ConcatenateX

Nick Edwards shows how you can use the ConcatenateX DAX function to combine values:

In this blog post we’ll take a quick look at using ConcatenateX function to view a concatenated string of dates where the max daily sales occurred for a given month.

I came across this function whilst going through the excellent “Mastering DAX 2nd Edition Video Course” by the guys from SQLBI.com. So credit to Marco and Alberto for sharing this.

So how does it work? If we had a list of dates ranging from 01/01/2020 to 31/12/2020 and we wanted to see which days we achieved maximum sales for each given month in a year we could use the ConcatenateX function to return these dates in a single row per month.

Click through for the demo.

Comments closed

Power BI Best Practice Tips

Lazaros Viastikopoulos continues a series on Power BI tips, switching from performance to best practices:

Tip 2) Organise Measures by Grouping

Tip number two goes hand in hand with the tip explained above, as after we generate explicit measures, what should we do with all those leftover implicit measures? Surely they will confuse the report authors if they are left visible. Furthermore, if we structure our data model as a Star Schema, every fact table will contain some foreign keys to establish a relationship with the primary key in the dimension (lookup) table. Should these columns remain visible for everyone to use?

Read on to learn how, as well as details for the other four tips.

Comments closed

Organizing and Optimizing Power BI Dataflows

Marc Lelijveld wants your Power BI dataflows in tip-top shape:

In this blog I describe a few of the challenges you might face when you have a lot of dataflows. I will describe a few tips and tricks I am applying to sanitize your dataflow approach, organize dataflows and easy to browse through.

Dataflows are increasingly used as shared resource or staging layer inside the Power BI platform. While having dataflows, you can push down logic and reuse across different datasets. This lowers the impact on the source by extracting the data once from source to Power BI, helps in centralizing logic, having one version of the truth and lots of other advantages.

Read on for some tips and practices.

Comments closed

Optimizing Power BI Merge Performance with Table.Join

Chris Webb shows us another way to optimize Power BI merge performance:

The SortMerge algorithm, last in the list above, is the focus of this blog post. I mentioned in my earlier posts that the reason that merge operations on non-foldable data sources are often slow is that both of the tables used in the merge need to be held in memory. There is an exception though: if you know that the data in the columns used to join the two tables is sorted in ascending order, you can use the Table.Join function and the SortMerge algorithm and the data from both sources can be streamed rather than held in memory, which in turn results in the merge being much faster.

That’s the same in the relational world: merge joins are the fastest, assuming that your data is pre-sorted in the proper manner.

Comments closed