
Category: Visualization

All about Boxplots

Amy Esselman explains what a boxplot is:

The “box” part of a boxplot outlines the lower and upper quartiles. Inside the box is a line that indicates the median value. There are lines that extend outside the box—known as the whiskers—to depict the range of values in a given dataset. If there are outliers, then individual dots in line with the whiskers are plotted to denote the extreme values. 

Click through for a depiction of the plot as well as several alternative depictions which can include more information at the cost of added complexity.
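To make the description above concrete, here is a minimal Python sketch of the numbers a boxplot draws. It is not taken from the linked article: the function name `boxplot_stats`, the sample data, and the 1.5 × IQR cutoff for outliers are all assumptions. That cutoff is a common convention, and plotting tools may use a different rule.

```python
from statistics import quantiles


def boxplot_stats(values, whisker_factor=1.5):
    """Return the numbers a boxplot draws for a list of numeric values."""
    data = sorted(values)
    # Lower quartile, median, upper quartile: the box and the line inside it.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: whiskers reach the most extreme points within
    # whisker_factor * IQR of the box; anything beyond is drawn as a dot.
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample with one extreme value.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 40])
print(stats)
```

With this sample the value 40 lands outside the upper fence, so it would be plotted as an individual dot while the upper whisker stops at 9.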


Plotting Multiple Columns on a Legend in Power BI

Jason Cockington has a workaround:

At a recent training course, one of the students asked if it was possible to add two different columns on the legend of a line chart, so that when a selection is made on a second slicer the chart splits to reveal multiple lines.

Given others in the class showed interest in the subsequent conversation, I decided to create a short blog so that everyone could benefit.

The short answer is “no” but the longer answer is more interesting.


Visualizing High-Density Regions with R

The rOpenSci team covers the history of the gghdr package:

This was how being a newcomer to rOpenSci OzUnconf 2019 felt. It was incredible to be a part of such a diverse, welcoming and inclusive environment. I thought it would be fun to blog about how it all began, and the twists and turns we experienced along the way as we developed the gghdr package. The package provides tools for plotting highest density regions with ggplot2 and was inspired by the package hdrcde developed by Rob J Hyndman. The highest density region approach of summarizing a distribution is useful for analyzing multimodal distributions and can be composed of numerous disjoint subsets. For example, the histogram of the highway mileage (hwy) data from the mpg dataset (a) shows that cars with 6 cylinders (cyl) are bimodally distributed, which is reflected in the highest density region (HDR) boxplot (c) but not in the standard boxplot (b). Hence, we see that HDRs are useful in displaying multimodality in the distribution.

Read on for a short history of an interesting package.
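For a feel of why a highest density region can be made up of disjoint pieces, here is a rough Python sketch. It is not how gghdr or hdrcde compute HDRs: it swaps in a simple histogram approximation, and the function name `histogram_hdr`, its parameters, and the sample data are all made up for illustration.

```python
def histogram_hdr(values, coverage=0.5, bins=20):
    """Approximate a highest density region as a list of (low, high) intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [(lo, hi)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    # Take the densest bins first until they hold the requested share of points.
    chosen, covered = set(), 0
    for idx in sorted(range(bins), key=lambda i: counts[i], reverse=True):
        if covered >= coverage * len(values):
            break
        chosen.add(idx)
        covered += counts[idx]
    # Merge adjacent chosen bins into intervals; gaps leave disjoint regions.
    intervals = []
    for idx in sorted(chosen):
        left, right = lo + idx * width, lo + (idx + 1) * width
        if intervals and idx - 1 in chosen:
            intervals[-1] = (intervals[-1][0], right)
        else:
            intervals.append((left, right))
    return intervals


# Made-up bimodal sample: one cluster around 12, another around 26.
sample = [10, 11, 11, 12, 12, 12, 13, 13, 14,
          24, 25, 25, 26, 26, 26, 27, 27, 28]
regions = histogram_hdr(sample, coverage=0.5, bins=18)
print(regions)
```

On this sample the 50% region comes back as two separate intervals, one per cluster, which is the kind of structure a standard boxplot's single box cannot show.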


Verbalizing a Chart

Alex Velez reminds us of the spoken side of communication:

I’m confident that I could overcome some of these design challenges by effectively explaining the graph to someone else. Will it be a perfect data communication? No—but sometimes, we have to deal with less-than-ideal circumstances like time limitations, or not having control over our designs. Knowing how to verbalize a graph can be a practical solution when faced with these constraints.

I should caveat this by clarifying that my intention is not to say that we shouldn’t spend time on our visualizations. But too often, we focus only on the visual. We believe that a graph or a picture is worth a thousand words. Or maybe we assume that because we created the chart, we will automatically know how to talk through it. I am super guilty of this!

Read on for some tips on vocalizing a visual.


Building a Simple Streamlit App

I jump into a new web framework:

In the course of working on my book, I wanted to build an easy-to-use website for outlier detection. The idea here is that I have a REST API to perform the outlier detection work but I’d like something a little easier to read than JSON blobs coming out of Postman. That’s where Streamlit comes into play.

Click through to see how it all works. I was impressed with how easy it was to build a decent interactive website.


Making a Scatter Plot in Excel

Mike Cisneros shows how to create a nice-looking scatter plot in Excel:

Scatter plots are excellent charts for showing a relationship between two numerical variables across a number of unique observations. We see them in business communications from time to time, although they’re much more commonly used in the “exploration” part of the process—when we’re still trying to understand our data and find the important insights. 

If you’re unfamiliar with scatter plots, their common use cases, or their benefits and drawbacks in a range of scenarios, check out the what is a scatter plot? article in our SWD Chart Guide. There, we explore some of the basics of scatter plots via an example, share tips for designing them more effectively, and discuss common variations (bubble charts, connected scatter plots, and more).

Read on for the process, which can be a lot more difficult than you may first expect.


Kibana Dashboards on Azure Data Explorer

Guy Reginiano has an announcement for us:

Elasticsearch and Kibana users can now easily migrate to Azure Data Explorer (ADX) while keeping Kibana as their visualization tool, alongside the other Azure Data Explorer experiences and the powerful KQL language.
A new version of K2Bridge (Kibana-Kusto free and open connector) now supports dashboards and visualizations, in addition to the Discover tab which was previously supported.

Click through to see how it works. I’m not the world’s biggest fan of Kibana by any stretch of the imagination but it’s nice to have this ability.


Making an R Box Plot from a Picture

Tomaz Kastrun builds a plot:

We create a raster image from a picture and calculate the ratio of the pixels on the grayscale. The more the darker colour is represented in the pixels, the bigger the value. This value is converted into a vector of values, and each vector is represented as a violin boxplot.

Click through for an example.
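The general idea of turning pixels into plottable vectors can be sketched in a few lines of Python. This is only a guess at the shape of the approach, not the R code from the linked post: the function name `darkness_vectors`, the choice to split the image into vertical strips, and the tiny hand-written image are all assumptions.

```python
def darkness_vectors(pixels, strips=3):
    """Split a grayscale image into vertical strips of darkness values.

    pixels: rows of ints from 0 (black) to 255 (white).
    Returns one list per strip; each value is in [0, 1], larger means darker.
    """
    width = len(pixels[0])
    vectors = [[] for _ in range(strips)]
    for row in pixels:
        for x, gray in enumerate(row):
            # Assumption: each column belongs to exactly one vertical strip.
            strip = min(x * strips // width, strips - 1)
            vectors[strip].append(1 - gray / 255)
    return vectors


# Hypothetical 2x3 image: a black column, a mid-gray column, a white column.
image = [
    [0, 128, 255],
    [0, 128, 255],
]
vectors = darkness_vectors(image)
print(vectors)
```

Each resulting list could then be handed to a plotting library as the data for one box or violin.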
