Press "Enter" to skip to content

Category: Power BI

Finding Disproportionate Outliers With DAX

Rob Collie uses DAX to find disproportionate sentiment on a per-group basis:

My wife loves to travel AND she loves data, so it’s no surprise that she showed me the infographic in question.

“Oh come on, California residents don’t visit the Philippines more often than any other country.”  That was my first reaction.  (Mexico is the overwhelming #1 destination, basically for every state… because Cancun.  And because there are so many Mexican-Americans).

“Ah…  but California residents DO visit the Philippines disproportionately more often than they ‘should,’ according to national averages!”  That was my dawning second realization (and confirmed by the fine print of the Orbitz article, even though the article’s title suggests otherwise.)

So, how do we do that in DAX?  Pretty simply, actually.

I’m mentally working out whether this could be useful in anomaly detection.

Comments closed

Dual KPI Custom Visual

Adam Saxton has a quick video demonstrating a dual KPI custom visual:

The Dual KPI efficiently visualizes two measures over time. It shows their trend based on a joint timeline, while absolute values may use different scales, for example Profit and Market share or Sales and Profit.

Each KPI can be visualized as line chart or area chart. The visual has dynamic behavior and can show historical value and the change from the latest value when you hover over it. It also has small icons and labels to convey KPI definitions and alerts about data freshness.

I looks cool, but I dunno; my philosophy is that man cannot serve two KPIs.

Comments closed

Gap Analysis Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Gap Analysis Power BI Custom Visual.  The Gap Analysis visual is used to analyze the difference between two different groups of data you have.  For example, you might use it to analyze the gap between two answers people gave in survey response data.

I like the gap analysis visual; it works well as a cross-category comparison visual, giving you an idea of the relative importance of each category as well as the change from one time period to the next.  It’s a good way of fitting two useful pieces of information into the same visual.

Comments closed

Custom Power BI Shapes Using R

Koen Verbeeck uses R to create dynamically changing images in Power BI:

You can insert images into Power BI Desktop, but these are static images. If you want them to dynamically change, you need the Image Viewer custom visual. Unfortunately, it doesn’t support measures, only columns. Since we want dynamic changes, fixed column values are not going to work. Someone proposed a work around on the Power BI forums, but this only works if you have a fixed set of attributes you want to slice on (for example, 4 categories). I want a totally flexible solution (e.g. each month we have a couple of new weeks to slice on), so again, not possible.

The only solution I could think of that would still work: using R visuals.

Read on for the solution.

Comments closed

Excel Data Cleansing

Cedric Charlier continues his series on fixing up an Excel file.  First up is turning results into an enumeration:

We’ve previously decided that a DON’T KNOW, shouldn’t influence our average of the answers. To apply this decision, we just need to filter the table Result and remove all the values equal to 0 (Enum value of DON’T KNOW). Then we calculate the average and subtract 1 to get a value between 0 and 4. Coooool, except that if we’ve no value non equal to 0, it will return -1 … not what we’re expecting. We’ll need to validate that the average is not null before subtracting 1.

The next post involves converting respondent information into a dimension:

In this table, only the results with a QuestionId equal to 111 really interest me for a merge with the existing table Respondent. If you’re familiar with the UI of Power BI Desktop then you’ll probably think to create a new table referencing this one then filter on QuestionId equals 111and finally merge. It’ll work but applying this strategy could result in many “temporary” tables. A lot of these small tables used only for a few steps before merging with other tables tend to be a nightmare for maintenance. You can use the “Advanced formula editor” to not display this kind of temporary tables and embed them in your main table.

Read on for more.

Comments closed

Anti-Slicers In Power BI

Nicolo Grando shows how to create anti-slicers in Power BI:

A normal slicer can be tedious when you want to show everything apart from just one or two entries in your filtered tiles – don’t take your finger off the Ctrl key! You could always turn on Select All, then unselect the items. But you may not want Select All enabled, and it’s not available for chiclets. Or you could use Visual/Page/Report level filters, but these are not available in dashboards or publish-to-web. So you may be interested in an anti-slicer? There are many ways to do this, this is one approach.

This is an interesting problem to solve, and I think this is a nice solution.

Comments closed

Appending Data In Power BI

Ginger Grant shows that data sets don’t need to be exactly the same for Power BI to combine their contents:

Recently I worked on a Power BI project where I needed to merge data provided in spreadsheets. The spreadsheets came from different vendors and while they contained mostly the same data, the columns were not in the same order. I wanted all of the data to reside in one table. In Query, that means that I wanted to Append the data. The files which I were merging were very wide, and I missed the fact until after I was done that some of the columns were in different order. Power BI is smart enough to figure out the order on its own. I didn’t need to change the order of the columns at all, as long as they have the same column names. Here’s an example using three different files.

That’s a sign of a smart tool.

Comments closed

Hierarchical Data Cleansing With Power BI

Cedric Charlier has started a series on dealing with hierarchical data in a not-so-hierarchical format:

To load this Excel file in Power BI, I’ll just use standard functions and define a first table “Source” that won’t be enabled to load in report.

My next task will be to create a table (or dimension) with the different questions. I also want to include a hierarchy in this dimension: I should be able to browse the questions by categories and sub-categories.

Let’s create a new table named “Question” by referencing the “Source” table. Then remove the other columns than A and B.

The curse of Excel is that it’s so easy to build a data set in strange ways that make it hard to integrate later.

Comments closed

SSRS Log File Location Change

Wolfgang Strasser points out that SSRS log files are in a new directory structure for vNext:

The log files can be found in the Logfiles directory (it was the same directory also for the older versions). In SSRS vNext there more different log files..

The logging information seems to be splitted into multiple log files – if you want for example dig into the Power BI on-premises logging I propose to have a look at the RSPower*.log files.

This happens every once in a while, so it’s good to know when the log files move somewhere else.

Comments closed

Sampling Data Lake Data

Alex Whittles shows how to use U-SQL to sample data to read in Power BI:

The answer is sampling, we don’t bring in 100% of the data, but maybe 10%, or 1%, or even 0.01%, it depends how much you need to reduce your dataset. It is however critical to know how to sample data correctly in order to maintain a level of accuracy of data in your reports.

Option 1: Take the top x rows of data
Don’t do it. Ever. Just no.
What if the source data you’ve been given is pre-sorted by product or region, you’d end up with only data from products starting with ‘a’, which would give you some wildly unpredictable results.

Option 2: Take a random % sample
Now we’re talking. This option will take, for example 1 in every 100 rows of data, so it’s picking up an even distribution of data throughout the dataset. This seems a much better option, so how do we do it?

Read on for a couple of sampling methods.

Comments closed