
Category: Visualization

Smoothing and its Inherent Risks

John Mount would like you to take care when using smoothers:

Here is a quick data-scientist / data-analyst question: what is the overall trend or shape in the following noisy data? For our specific example: How do we relate value as a noisy function (or relation) of m? This example arose in producing our tutorial “The Nature of Overfitting”.

One would think this would be safe and easy to assess in R using ggplot2::geom_smooth(), but now we are not so sure.

Here’s a quick summary of my general philosophy: the data are more interesting than a smoothed line. I’m okay putting in a smoothed line to help a reader make sense of a trend, but I wouldn’t want to have a plot with just the smoothed line. Read the whole thing from John to get well beyond my rule of thumb.
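To make the risk concrete, here is a minimal sketch in plain Python rather than R. It does not reproduce John's example or anything geom_smooth() does internally. It swaps in a simple centered moving average, and the data, the step shape, and the window sizes are all made up for illustration. The point is only that one tuning choice can change the story a smoothed line tells.

```python
import random


def moving_average(values, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def roughness(series):
    """Total up-and-down movement between neighboring points."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))


random.seed(0)
# Hypothetical data: a sharp step from 0 to 1 at m = 20, plus noise.
m = list(range(40))
value = [(0.0 if x < 20 else 1.0) + random.gauss(0, 0.3) for x in m]

narrow = moving_average(value, 3)   # still wiggly, keeps the step sharp
wide = moving_average(value, 21)    # very smooth, turns the step into a ramp

print("raw roughness:   ", round(roughness(value), 2))
print("narrow roughness:", round(roughness(narrow), 2))
print("wide roughness:  ", round(roughness(wide), 2))
```

Plot either smoothed series without the points and a reader can't tell whether the underlying shape was a step or a gentle slope, which is why I want the data on the chart.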

Comments closed

Visualization and the Value of Expectations

Alex Velez thinks about violating expectations in visuals:

This isn’t to say we should never deviate from normal graphing conventions, but we should have a good reason for doing so—a reason that makes up for any unintended consequences. 

What other design decisions might also take our audience by surprise—going against normal graphing expectations? I’ll outline a few. 

Click through for examples. One thing not explicitly brought up is that we follow conventions to reduce the amount of thought needed to understand something. For circumstances in which there’s a major benefit, you might want to run that risk. Also, there’s an argument in here that, at some point, it’s better to have something radically different than marginally different.


All About Dot Plots

Cole Nussbaumer Knaflic talks about one of my favorite plot types:

The term “dot plot” can be used for any graph that is encoding data in a dot or small circle. There are a few common types that I’ll focus on here. If you’ve ever asked yourself—What is a dot plot? How do I interpret a dot plot? When should I use a dot plot? or What are pros and cons of dot plots?—you’ll find the answers in this post. I’ll also share some tips on creating them and where to find examples that will inform and inspire.

Read the whole thing.


Waterfall Visuals

Mike Cisneros takes us through cases when waterfall charts are useful:

In our workshops, we often put a grid of a dozen charts up on the screen, and say to the participants, “Most of the charts you’ll need to communicate effectively in business are right here on the screen. 99% of the time, one of the visuals you see here will get your message across effectively. And as you can see there aren’t any really unusual charts here. You’ve probably seen all of these before.” 

If, at this point, somebody in the room says, “Actually, I’ve never heard of a ______ chart before,” you can almost always fill in the blank with the word “waterfall.”

Waterfall charts are really useful in a few scenarios, but I see them get misused far too frequently.


Combining Two Survey Questions into a Graph

Stephanie Evergreen solves a challenge:

You’ve asked employees to rate a bunch of different aspects of their job. You want to know if they think that aspect is important AND how satisfied they are with that aspect of their job. So, naturally, you make two individual questions with response options like Not at all Important to Very Important and Not at all Satisfied to Very Satisfied. I would probably do the same thing.

But then you’ve got to show the data and, importantly, how those two variables – Importance and Satisfaction – relate to each other.

Click through for two methods of visualizing the results.


Working with Network Graphs in R

John MacKintosh shows us the visNetwork package:

I’ve long been hoping for a reason to have to devote time to learning how to produce network plots. In my world, where bar and line charts reign supreme (with heatmaps and waffle charts thrown in occasionally) it is nice to be able to develop a new visualisation.

I’ve been wanting to produce a network plot for some time. But, the data structure, with its nodes and edges, and seeming lack of any identifiable characteristics, has meant it has not been hugely far up my agenda, or at least, never far up enough to make me learn more about it.

Click through for an example of where a network diagram can work out. H/T R-Bloggers
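If the nodes-and-edges structure is what has kept you away, it may help to see how little there is to it. The sketch below is plain Python, not John's R code, and the labels and connections are invented. It assumes the common layout of one table of nodes keyed by an id and one table of edges pointing from one id to another, which is the general shape network plotting tools tend to ask for.

```python
from collections import Counter

# Hypothetical nodes: one record per thing being connected.
nodes = [
    {"id": 1, "label": "Ward A"},
    {"id": 2, "label": "Ward B"},
    {"id": 3, "label": "Ward C"},
    {"id": 4, "label": "Ward D"},
]

# Hypothetical edges: one record per connection between two node ids.
edges = [
    {"from": 1, "to": 2},
    {"from": 1, "to": 3},
    {"from": 2, "to": 4},
]


def degree_by_node(nodes, edges):
    """Count how many edges touch each node, ignoring direction."""
    counts = Counter()
    for edge in edges:
        counts[edge["from"]] += 1
        counts[edge["to"]] += 1
    return {node["id"]: counts[node["id"]] for node in nodes}


degrees = degree_by_node(nodes, edges)
for node in nodes:
    print(node["label"], degrees[node["id"]])
```

A per-node summary like degree is also a handy attribute to map to node size once you do get to the plot.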


Thoughts on Trendlines

Alex Velez shares some thoughts on trendlines:

A trendline is a line drawn on a chart highlighting an underlying pattern of individual values. The line itself can take on many forms depending on the shape of the data: straight, curved, etc. This is common practice when using statistical techniques to understand and forecast data (e.g. regression analysis). Determining the best fit and forecasting is beyond this article’s scope, so if you’re interested in learning more, I recommend Anna Foard’s Stats Ninja website. Instead, I’ll focus on various considerations related to visualizing trendlines when communicating data.

My main thought on trendlines is that they are less important than the data points. We make up the trendlines out of thin air; the data points actually exist and actually matter. Trendlines can be useful, but they don’t replace the data.
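As a small illustration of what I mean by "made up," here is a sketch of an ordinary least-squares straight line in plain Python. This is my own example with invented numbers, not something from Alex's post, and a straight line is only one of many shapes a trendline could take. The fit boils five points down to two numbers, and the residuals are everything the line leaves out.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# Residuals: the gap between each real point and the line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("residuals:", [round(r, 2) for r in residuals])
```

Drawing the points along with the line lets a reader judge those residuals for themselves.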


Alternative Ways of Displaying Heatmap Data

Cole Nussbaumer Knaflic gives us a couple of alternatives to displaying data in a heatmap:

I often describe heatmaps as a good means for getting an initial view of your data. They can help you start to explore and understand where there might be something interesting to highlight or dig into. But once you’ve identified the noteworthy aspects of your data, should you use heatmaps to communicate them?

As often is the case, it depends.

If you are communicating to an audience who likes to see data in tables—applying heatmap formatting can provide a visual sense of the numbers without fully changing the approach (or having it feel like you’ve taken detail away). If you know your stakeholders will want to look up specific numbers (particularly in the case where different stakeholders will care about different numbers) and then understand them in the context of the broader landscape, a heatmap may also work in this scenario.

Click through for some ideas.
