Chris Webb shows us another way to optimize Power BI merge performance:
The SortMerge algorithm, last in the list above, is the focus of this blog post. I mentioned in my earlier posts that the reason that merge operations on non-foldable data sources are often slow is that both of the tables used in the merge need to be held in memory. There is an exception though: if you know that the data in the columns used to join the two tables is sorted in ascending order, you can use the Table.Join function and the SortMerge algorithm and the data from both sources can be streamed rather than held in memory, which in turn results in the merge being much faster.
The same holds in the relational world: a merge join is typically the cheapest join operator when both inputs are already sorted on the join keys, because neither side has to be sorted or hashed first.
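To make the streaming idea concrete, here is a minimal sketch of a sort-merge inner join, written in Python rather than M. It is only an illustration of the general technique, not Power Query's actual implementation: the function name `sort_merge_join`, the tuple row layout, and the sample data are all hypothetical. It assumes both inputs are already sorted ascending on the join key, which is the same precondition the quote describes.

```python
from itertools import groupby
from operator import itemgetter


def sort_merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables of tuple rows, both pre-sorted ascending by key.

    Rows are pulled lazily from each input, so only the rows that share the
    current key on the right side are buffered at any one time.
    """
    left_groups = groupby(left, key=key)
    right_groups = groupby(right, key=key)
    l = next(left_groups, None)
    r = next(right_groups, None)
    while l is not None and r is not None:
        left_key, left_rows = l
        right_key, right_rows = r
        if left_key < right_key:
            # No match possible for this left key: advance the left side.
            l = next(left_groups, None)
        elif left_key > right_key:
            # No match possible for this right key: advance the right side.
            r = next(right_groups, None)
        else:
            # Keys match: buffer the right-side rows for this key only,
            # then pair each left row with each buffered right row.
            right_buffer = list(right_rows)
            for left_row in left_rows:
                for right_row in right_buffer:
                    yield left_row + right_row
            l = next(left_groups, None)
            r = next(right_groups, None)


# Hypothetical sample data, sorted ascending on the first column.
orders = [(1, "apple"), (2, "pear"), (2, "plum"), (4, "fig")]
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
joined = list(sort_merge_join(orders, customers))
```

Because each input is consumed exactly once, in order, neither table needs to be fully loaded before output starts. If either input is not actually sorted, the sketch silently drops matches, which is why the sort order has to be known in advance.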