I was testing some calculations in my tabular model when I noticed that my Lost Customers calculation wasn’t working as expected. This was rather interesting to me since the calculation I was using was from DAXPatterns.com. After some experimentation, I determined that the difference between my data situation, and the situation that fit the DAX Pattern was that my customer dimension was a Type 2 Slowly Changing Dimension. That meant I couldn’t use the customer key to identify a unique customer, since each customer could have multiple customer keys (surrogate keys) with the same customer number (business key). Specifically, if a customer made a purchase in December 2015, then changed their name, then made a purchase in January and February, my calculation was counting them as lost because it was not recognizing that the customer with the new name and same customer number was actually the same customer.
She ends up with two solutions, each with different trade-offs. Knowing as little DAX as I do, looking at two different ways of solving this problem is great because it gives you more insight into language semantics.
With both of these concepts combined, the file size was reduced from the original 264 MB to 238 MB, a reduction of almost 10%. You can see where the space savings have come from by comparing the before and after column sizes in the 2 tables below. The SalesValueExTax column (65MB) was replaced with the Margin column (44MB) and the CostValue column (63MB) was replaced with the CostPerCase column (50MB).
Check it out, as well as the memory tool.
The difference here is that the Month Number and Month Name fields are both present – they have to be since the query has to sort by Month Number. In MDX the order of members on a hierarchy can be set inside the model; in a DAX query you can only sort using an ORDER BY clause and for that to work, the field you’re ordering by must be present in the query.
This kind of feels like an edge case, but if you run into it, it’s good to know that it’s not a bug.
Average of best 8 scores from last 20 rounds
So the requirement is to find the average of the best 8 scores from the last 20 rounds of golf for each player. So if you think about that problem, there are quite a few layers to it – perfect for a blog on how to break a problem into pieces so you can solve it in DAX.
Even if you do nothing with DAX, read over the post because the problem-solving technique Allington uses is generally applicable.
As you can see, using DAX variables is a much better solution than using the aliases…the performance improvement is about the same, however, variables we can wrapped up in the calculated measures inside the model allowing us to take advantage of the performance gain with all tools (not just those allowing us to hand-craft the DAX queries).
The query used in this post is too simple to highlight the performance benefit (small data dataset, simple calculation)…but it did make it easier to cruise the query plans and SE requests. In reality, a better use case for highlighting the performance benefits of these optimizations is with a query that hammers the Formula Engine (FE).
Interesting stuff, even for someone with no knowledge of DAX.