Dendrograms are not the most common qualitative visual because they require data generated through a hierarchical cluster analysis. Cluster analysis can be a useful tool in analyzing qualitative data. By clustering groups of participants with similar qualitative codes, you can better understand your findings. According to Henry & team, this analysis can help “reveal things like participant motive and the reasons behind counterintuitive findings.”
Check out Henry’s article to learn more about the analysis. Here, let’s just focus on describing a dendrogram that could display those hierarchical cluster analysis findings. They can be a little confusing at first, especially since the x-axis has 100% closest to the y-axis when we aren’t used to seeing it that way. Walk through this example with us.
Click through for an example. If it’s confusing at first, read to the end, as I think the concrete example helps everything click.
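To make the idea of clustering participants by shared codes a little more concrete, here is a minimal sketch of agglomerative clustering in plain Python. It is not the method from Henry's article: the participants, the codes, the Jaccard similarity measure, and the average-linkage rule are all assumptions chosen for illustration. Each merge it records corresponds to one junction in a dendrogram, and merges with high similarity would sit near the 100% end of the axis.

```python
from itertools import combinations

# Hypothetical participants and the qualitative codes applied to each one.
participants = {
    "P1": {"cost", "trust"},
    "P2": {"cost", "trust", "time"},
    "P3": {"access", "stigma"},
    "P4": {"access", "stigma", "time"},
}


def similarity(a, b):
    """Jaccard similarity: shared codes divided by all codes used by either."""
    return len(a & b) / len(a | b)


def cluster_similarity(c1, c2):
    """Average-linkage similarity between two clusters of participant names."""
    pairs = [(x, y) for x in c1 for y in c2]
    total = sum(similarity(participants[x], participants[y]) for x, y in pairs)
    return total / len(pairs)


def agglomerate(names):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(n,) for n in sorted(names)]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda p: cluster_similarity(*p))
        merges.append((a, b, round(cluster_similarity(a, b), 2)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = agglomerate(participants)
for left, right, sim in merges:
    # Each line is one junction of the dendrogram and the similarity at which it forms.
    print(f"{sim:.0%}  {left} + {right}")
```

For real data, a library routine such as SciPy's hierarchical clustering would be the more usual choice. The loop above is only meant to show what the dendrogram is summarizing.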
If you are a BI/report developer, you know this challenge well. You may follow all the guidelines: choose a good color palette, make visuals that highlight the important data points, get rid of clutter. But what happens when your data refreshes tomorrow or next month or next year? It’s much easier to make a chart with static data. You can format it so it communicates exactly the right message. But out here in Automated Reporting Land, that is not the end of our duties. We have to make some effort to accommodate future data values.
Meagan uses the Phone-a-Friend option and gets an interesting inversion of the normal solution.
That ball of mush in the middle is hard to look at, but the smaller disconnected bits aren’t! Just like Ben, I want to work on those smaller pieces too! And just like the lonely tables we looked at last week, these small isolated components are also good candidates for extracting from SQL Server.
The script looks at joins in execution plans, which is a rather clever way of doing this when you don’t have a comprehensive set of foreign key constraints.
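As a rough illustration of the "small isolated components" idea, here is a sketch that groups tables into connected components based on observed joins. It is not the script from the linked post: the table names and join pairs are made up, and it assumes the joins have already been extracted from the execution plans into simple pairs.

```python
from collections import defaultdict

# Hypothetical (table, table) pairs, as if harvested from joins seen in execution plans.
observed_joins = [
    ("Orders", "OrderLines"),
    ("Orders", "Customers"),
    ("AuditLog", "AuditType"),
]
# Hypothetical full table list; LegacyImport never appears in a join.
all_tables = {"Orders", "OrderLines", "Customers", "AuditLog", "AuditType", "LegacyImport"}


def connected_components(tables, joins):
    """Group tables that are linked, directly or indirectly, by at least one join."""
    neighbours = defaultdict(set)
    for a, b in joins:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, components = set(), []
    for start in sorted(tables):
        if start in seen:
            continue
        # Depth-first walk from this table to collect everything reachable from it.
        stack, component = [start], set()
        while stack:
            table = stack.pop()
            if table in component:
                continue
            component.add(table)
            stack.extend(neighbours[table] - component)
        seen |= component
        components.append(component)
    # Smallest components first: these are the extraction candidates.
    return sorted(components, key=len)


components = connected_components(all_tables, observed_joins)
for component in components:
    print(sorted(component))
```

A single-table component is a "lonely table", and the small multi-table components are the disconnected bits that can be worked on separately.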
One of the Power BI improvements in the March 2019 Desktop release was reduced bubble size for the Map visual. I previously wrote about the benefit of the reduction in point/bubble size. I was unaware until recently that this change made it into more than the Map visual.
The ability to reduce the point size also appears in the Format options for the Power BI Scatter chart. Previously, you could change the size option from 0 to 100 under the Shapes area. As with the Map, the Scatter now allows you to reduce the size as low as -30. I did not see this mentioned in the March Desktop blog post. I must have missed it if it was part of a previous month’s release. In any case, if you were not aware that you could set the point size from -30 to 100 with the Scatter chart, now you know.
For most scenarios, I think the dot size is probably a little too big. -30 is generally too small, but I’m happy that they offer us options to get it right.
The first step in any type of analysis is to understand the dataset itself. A Databricks dashboard can provide a concise format in which to present relevant information about the data to clients, as well as a quick reference for analysts when returning to a project.
To create this dashboard, a user can simply switch to Dashboard view instead of Code view under the View tab. The user can either click on an existing dashboard or create a new one. Creating a new dashboard will automatically display any of the visualizations present in the notebook. Customization of the dashboard is easily achieved by clicking on the chart icon in the top right corner of the desired command cells to add new elements.
This isn’t quite a step-by-step guide but does spur on ideas.
The high-level process is to:
1. Create a measure that returns a colour as the result
   1. It can be a word, such as blue, red, green
   2. It can be a hex code for a colour, like “#40E0D0”, “#FFA07A”
2. Use conditional formatting and use the measure to apply the formatting on the text as a rule.
Read on for a demo.
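In Power BI the measure itself would be written in DAX, so the following is only a Python sketch of the logic such a measure might encode. The function name, the comparison against a target, and the 80% threshold are all hypothetical. The only part taken from the post is that the result is either a colour word or a hex code.

```python
def colour_for(value, target):
    """Return a colour as a word or a hex code, the way the measure would."""
    if value >= target:
        return "green"      # colour given as a word
    if value >= 0.8 * target:
        return "#FFA07A"    # colour given as a hex code
    return "red"


# Hypothetical values compared against a target of 100.
for value in (120, 85, 10):
    print(value, colour_for(value, 100))
```

The conditional formatting rule then reads whatever the measure returns and applies it to the text.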
What about a separate Power BI Date table?
This setup is built for consistency of comparison. As people go deeper into Power BI, they typically add a separate Date table as part of a more robust data model and add relationships between tables. At the same time, they disable the default Auto Date/Time built-in hierarchies. This more advanced setup with a separate Date table allows several conveniences as well as performance and storage benefits. It’s especially true with larger models that include many fact tables that each join to Date and other possible dimension tables. Tableau doesn’t currently have a comparable data model. We’ll stay conveniently away from that setup in Power BI because we only have one simple sample table.
I think both of them make this an easy operation, though Tableau is probably easier here.
As someone very interested in storytelling, ggplot2 is easily my data visualization tool of choice. It is like the Swiss army knife for data visualization. One of my favorite features is the ability to pack a graph chock-full of dimensions. This ability is incredibly handy during the data exploration phases. However, sometimes I find myself wanting to look at trends without all the noise. Specifically, I often want to look at very dense scatterplots for outliers. Ggplot2 is great at this, but when we’ve isolated the points we want to understand, we can’t easily examine all possible dimensions right in the static charts.
Enter plotly. The plotly package and its ggplotly function do an excellent job of taking our high-quality ggplot2 graphs and making them interactive.
Read on for several quality, interactive visuals.
ggplot– You can spot one from a mile away, which is great! And when you do it’s a silent fist bump. But sometimes you want more than the standard theme.
Fonts can breathe new life into your plots, helping to match the theme of your presentation, poster or report. This is always a second thought for me and I need to work out how to do it again, hence the post.
Read on to see how to use each of these packages. H/T R-bloggers
As you can see, there are data labels for each subcategory (that is, gender and education), but no data label showing the total of each education category. For example, we want to know the total sales in the High School category. Now that you know the problem, let’s see a way to fix it.
Read on for Reza’s solution to the problem. In general, if people might care about the total, do them a favor and show the total.