Qualitative Analysis with Dendrograms

Stephanie Evergreen explains what dendrograms are and when they can be useful visuals:

Dendrograms are not THE most common qualitative visual because they require data generated through a hierarchical cluster analysis. Cluster analysis can be a useful tool in analyzing qualitative data. By clustering groups of participants with similar qualitative codes, you can better understand your findings. According to Henry & team, this analysis can help “reveal things like participant motive and the reasons behind counterintuitive findings.”

Check out Henry’s article to learn more about the analysis. Here, let’s just focus on describing a dendrogram that could display those hierarchical cluster analysis findings. They can be a little confusing at first, especially since the x-axis has 100% closest to the y-axis when we aren’t used to seeing it that way. Walk through this example with us.

Click through for an example. If it’s confusing at first, read to the end, as I think the concrete example helps everything click.
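
To make the idea of clustering participants by shared codes concrete, here is a minimal sketch of the kind of merge history a dendrogram draws. It is not the method from Henry's article: it assumes Jaccard similarity between code sets and single-linkage merging, and the participants and codes are made up for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of codes two participants have in common (1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_similarity(c1, c2, codes):
    """Single linkage: similarity of the closest pair across two clusters."""
    return max(jaccard(codes[p], codes[q]) for p in c1 for q in c2)


def agglomerate(codes):
    """Merge the two most similar clusters until only one remains.

    Returns the merge history as (members_a, members_b, similarity) tuples,
    which is the information a dendrogram displays.
    """
    clusters = [(name,) for name in sorted(codes)]
    merges = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: cluster_similarity(pair[0], pair[1], codes),
        )
        merges.append((a, b, cluster_similarity(a, b, codes)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges


# Hypothetical participants and qualitative codes, for illustration only.
codes = {
    "P1": {"cost", "trust"},
    "P2": {"cost", "trust", "time"},
    "P3": {"access"},
    "P4": {"access", "time"},
}

merges = agglomerate(codes)
for a, b, sim in merges:
    print(f"{a} + {b} at {sim:.0%} similarity")
```

Each merge corresponds to one join in the dendrogram, and the similarity at which it happens is the position of that join along the percentage axis.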

Telling a Story When Data is Always Changing

Meagan Longoria explains how you can tell a story with data when the data is not static:

If you are a BI/report developer, you know this challenge well. You may follow all the guidelines: choose a good color palette, make visuals that highlight the important data points, get rid of clutter. But what happens when your data refreshes tomorrow or next month or next year? It’s much easier to make a chart with static data. You can format it so it communicates exactly the right message. But out here in Automated Reporting Land, that is not the end of our duties. We have to make some effort to accommodate future data values.

Meagan uses the Phone-a-Friend option and gets an interesting inversion of the normal solution.

Finding Dependency Clusters

Michael J. Swart performs cluster analysis with tables:

That ball of mush in the middle is hard to look at, but the smaller disconnected bits aren’t! Just like Ben, I want to work on those smaller pieces too! And just like the lonely tables we looked at last week, these small isolated components are also good candidates for extracting from SQL Server.

The script looks at joins in execution plans, which is a rather clever way of doing this when you don’t have a comprehensive set of foreign key constraints.
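
As a rough illustration of the clustering step only (not Michael's script, which mines execution plans in SQL Server), here is a minimal sketch that groups tables into connected components once you already have a list of joined table pairs. The table names and the `find_clusters` helper are hypothetical.

```python
from collections import defaultdict


def find_clusters(joins):
    """Group tables into connected components of the join graph."""
    graph = defaultdict(set)
    for left, right in joins:
        graph[left].add(right)
        graph[right].add(left)

    seen = set()
    clusters = []
    for table in sorted(graph):
        if table in seen:
            continue
        # Depth-first walk collects every table reachable from this one.
        stack, component = [table], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        clusters.append(sorted(component))
    # Smallest clusters first: those are the likeliest extraction candidates.
    return sorted(clusters, key=len)


# Hypothetical table pairs observed joined together, for illustration only.
joins = [
    ("Orders", "OrderLines"),
    ("Orders", "Customers"),
    ("Customers", "Addresses"),
    ("AuditLog", "AuditUsers"),
]

clusters = find_clusters(joins)
print(clusters)
```

Tables that never appear in a join would not show up here at all; those are the "lonely tables" case rather than a small cluster.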

Shrinking Dot Sizes in Power BI

David Eldersveld shows how we can reduce the point size of dots in Power BI as of March 2019:

One of the Power BI improvements in the March 2019 Desktop release was reduced bubble size for the Map visual. I previously wrote about the benefit of the reduction in point/bubble size. I was unaware until recently that this change made it into more than the Map visual.

The ability to reduce the point size also appears in the Format options for the Power BI Scatter chart. Previously, you could change the size option from 0 to 100 under the Shapes area. As with the Map, the Scatter now allows you to reduce the size as low as -30. I did not see this mentioned in the March Desktop blog post. I must have missed it if it was part of a previous month’s release. In any case, if you were not aware that you could set the point size from -30 to 100 with the Scatter chart, now you do.

For most scenarios, I think the dot size is probably a little too big. -30 is generally too small, but I’m happy that they offer us options to get it right.

Databricks Dashboards

Megan Quinn takes us through building dashboards with Apache Zeppelin on Databricks:

The first step in any type of analysis is to understand the dataset itself. A Databricks dashboard can provide a concise format in which to present relevant information about the data to clients, as well as a quick reference for analysts when returning to a project.

To create this dashboard, a user can simply switch to Dashboard view instead of Code view under the View tab. The user can either click on an existing dashboard or create a new one. Creating a new dashboard will automatically display any of the visualizations present in the notebook. Customization of the dashboard is easily achieved by clicking on the chart icon in the top right corner of the desired command cells to add new elements.

This isn’t quite a step-by-step guide but does spur on ideas.

Conditional Formatting on Text Fields in Power BI

Matt Allington shows how you can apply conditional formatting to non-numeric fields in Power BI:

The high level process is to:
1. Create a measure that returns a colour as the result
   1. It can be a word, such as blue, red, green
   2. It can be a hex code for a colour, like “#40E0D0”, “#FFA07A”
2. Use conditional formatting and use the measure to apply the formatting on the text as a rule.

Read on for a demo.

Running Totals in Tableau and Power BI

David Eldersveld shows how to create running totals in both Tableau and Power BI:

What about a separate Power BI Date table?
This setup is built for consistency of comparison. As people go deeper into Power BI, they typically add a separate Date table as part of a more robust data model and add relationships between tables. At the same time, they disable the default Auto Date/Time built-in hierarchies. This more advanced setup with a separate Date table allows several conveniences as well as performance and storage benefits. It’s especially true with larger models that include many fact tables that each join to Date and other possible dimension tables. Tableau doesn’t currently have a comparable data model. We’ll stay conveniently away from that setup in Power BI because we only have one simple sample table.

I think both of them make this an easy operation, though Tableau is probably easier here.
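
For readers who want the calculation itself spelled out, here is a minimal sketch of a running total outside either tool. It is not the Tableau table calculation or the DAX measure from David's post, and the monthly figures are made up for illustration.

```python
from itertools import accumulate

# Hypothetical monthly sales figures, for illustration only.
sales = [("Jan", 120), ("Feb", 95), ("Mar", 140), ("Apr", 80)]

months = [month for month, _ in sales]
# accumulate() yields the sum of everything up to and including each row.
running = list(accumulate(amount for _, amount in sales))

for month, total in zip(months, running):
    print(f"{month}: {total}")
```

The rows must already be in date order, which is the same ordering concern both tools handle through their date fields.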

Interactive ggplot Plots with plotly

Laura Ellis takes us through ggplotly:

As someone very interested in storytelling, ggplot2 is easily my data visualization tool of choice. It is like the Swiss army knife for data visualization. One of my favorite features is the ability to pack a graph chock-full of dimensions. This ability is incredibly handy during the data exploration phases. However, sometimes I find myself wanting to look at trends without all the noise. Specifically, I often want to look at very dense scatterplots for outliers. Ggplot2 is great at this, but when we’ve isolated the points we want to understand, we can’t easily examine all possible dimensions right in the static charts.

Enter plotly. The plotly package and ggplotly function do an excellent job at taking our high quality ggplot2 graphs and making them interactive.

Read on for several quality, interactive visuals.

Custom ggplot2 Fonts

Daniel Oehm shares two techniques for using custom fonts in your ggplot2 visuals:

ggplot – You can spot one from a mile away, which is great! And when you do it’s a silent fist bump. But sometimes you want more than the standard theme.

Fonts can breathe new life into your plots, helping to match the theme of your presentation, poster or report. This is always a second thought for me and I need to work out how to do it again, hence the post.

There are two main packages for managing fonts – extrafont, and showtext.

Read on to see how to use each of these packages. H/T R-bloggers

Showing Totals on Power BI Stacked Column Charts

Reza Rad shows us how to add a totals figure to Power BI stacked column charts:

As you can see, there are data labels for each subcategory (means gender and education), but no data label showing the total of each education category. For example, we want to know how much was the total sales in the High School category. Now that you know the problem, let’s see a way to fix it.

Read on for Reza’s solution to the problem. In general, if people might care about the total, do them a favor and show the total.
