
Category: Visualization

Equidistant Points and Missing Data in Excel

Stephanie Evergreen shows how you can bring in missing data points in Excel to ensure the axis is accurate:

Excel automatically spaces your intervals and labels equidistant from one another but it is assuming that your intervals actually are equidistant. In this graph, that’s not the case. We are missing the months of March, April, July, and August, when either no one was enrolled in the study or we have some missing data. But we can’t just gloss over those months. It isn’t truthful and it distorts the data display.
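
The same idea can be sketched outside Excel. This is a minimal Python illustration, not Evergreen's method: the function name `fill_missing_months`, the year, and the enrollment counts are all made up for the example. It inserts an explicit gap for every month with no data, so that a charting tool can space the axis by time rather than by row count.

```python
def fill_missing_months(values):
    """Return a list of ((year, month), value) pairs covering every month
    from the earliest to the latest key; absent months get None."""
    if not values:
        return []
    first, last = min(values), max(values)
    year, month = first
    filled = []
    while (year, month) <= last:
        filled.append(((year, month), values.get((year, month))))
        # Advance one calendar month, rolling over after December.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled


# Hypothetical counts; March, April, July, and August are absent.
enrolled = {(2019, 1): 12, (2019, 2): 15, (2019, 5): 9, (2019, 6): 11, (2019, 9): 14}
series = fill_missing_months(enrolled)
```

Each `None` entry marks a month that should still occupy a slot on the axis, even though there is nothing to plot for it.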

Click through for the solution.

Comments closed

Bar Chart Presentation Options

Andy Kirk gives us five techniques for gussying up bar charts:

“Bar charts are boring”, say many people. “How can we make them more attractive”, say many desperate clients. Bar charts are ubiquitous because they are the reliable and trusted lieutenants often relied upon to show the always-common quantitative comparisons across different categories. Their frequent use can induce ‘boredom’ through the familiar but, in particular, accusations of inelegance can be raised with the default tool styles many creators lean on.

The five charts below just offer some different ways you might style them, through variations in the use of functional colour properties, chart apparatus and layout decisions, in particular. The reasons why you would choose to use any of these methods are varied and especially contextually dependent, based on matters like space to work in, range of quantitative values, size of category labels, number of bars, importance of precise readability. The charts are all showing the all-time top 10 most streamed songs on Spotify, as of July 2019, with data from Wikipedia.

Read on for the five options.


Chart Confusion with Labels

Mike Cisneros shows us an example where unexpected label values can throw off your readers:

The internet immediately latched onto the seemingly absurd collection of months portrayed in this chart. The bill, dating from June of 2019, included 13 prior months of usage from as early as August of 2016, as recently as March of 2019, and in a random order.

Soon, our non-U.S.-based friends pointed out that the dates made even less sense to them, as (of course) their convention is not to show dates in MM/YY format, but in YY/MM format.

And with this, the truth of the matter became obvious: the dates were in neither MM/YY format nor YY/MM format; they were in MM/DD format, and excluded labeling the year entirely. 
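
To see how one label supports two readings, here is a small Python sketch. The function name `readings` is hypothetical, and treating a two-digit year as 20YY is an assumption made for the example. The labels "08/16" and "03/19" correspond to the August 2016 and March 2019 readings mentioned above.

```python
import calendar


def readings(label):
    """Interpret an 'NN/NN' axis label as both MM/YY and MM/DD."""
    first, second = (int(part) for part in label.split("/"))
    month = calendar.month_name[first]
    return {
        # Assumption: a two-digit year means 20YY.
        "MM/YY": f"{month} {2000 + second}",
        # The MM/DD reading carries no year at all.
        "MM/DD": f"{month} {second}",
    }


for label in ("08/16", "03/19"):
    print(label, readings(label))
```

Both readings are valid for these labels, which is why spelling out the month name or including the year removes the ambiguity.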

Even small things can make a difference in your ability to get the message across to users.


Drawing SSIS Packages as SVGs

Bartosz Ratajczyk continues a series on taking SSIS packages and generating SVGs from their control flows:

To make things harder, the layout of the sequences and tasks is not some nested XML structure. All of the elements have the same parent – <GraphLayout>, meaning all of them are at the same tree level. Also – there is no attribute showing where a particular object belongs. Almost. In the example with the sequences, I see two regularities:
– the outer container is placed later in the XML than the inner container
– the @Id attributes show the nesting of the objects
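
The second regularity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample XML is invented and heavily simplified, and the element names `NodeLayout` and `ContainerLayout` and the backslash-separated `Id` paths are assumptions about the layout format rather than details from the quoted post.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment: every element is a sibling under
# <GraphLayout>, and only the Id path hints at containment.
SAMPLE = r"""
<GraphLayout>
  <NodeLayout Id="Package\Outer\Inner\Task A" />
  <ContainerLayout Id="Package\Outer\Inner" />
  <NodeLayout Id="Package\Outer\Task B" />
  <ContainerLayout Id="Package\Outer" />
</GraphLayout>
"""


def build_tree(xml_text):
    """Rebuild nesting from backslash-separated Id paths.

    Returns a nested dict: each key is an object name and each value is
    a dict of that object's children.
    """
    root = ET.fromstring(xml_text)
    tree = {}
    for element in root:
        identifier = element.get("Id")
        if identifier is None:
            continue  # Skip elements that carry no Id attribute.
        node = tree
        for part in identifier.split("\\"):
            node = node.setdefault(part, {})
    return tree


nested = build_tree(SAMPLE)
```

Because the hierarchy comes from the paths alone, the order of the siblings in the file does not matter for this reconstruction.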

I’m not sure how often I’d use this in practice, but if you want to understand some of the internals of SSIS, this is an interesting series to follow.


The Power of Hexagonal Binning

Capri Granville explains hexagonal binning to us and gives a few examples:

The reason for using hexagons is that it is still pretty simple, and when you rotate the chart by 60 degrees (or a multiple of 60 degrees) you still get the same visualization. For squares, rotations of 60 degrees don’t work, only multiples of 90 degrees work. Is it possible to find a tessellation such that smaller rotations, say 45 or 30 degrees, leave the chart unchanged? The answer is no. Octagonal tessellations don’t really exist, so the hexagon is an optimum.
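
For a feel of how points land in hexagonal bins, here is a minimal Python sketch. It is not taken from Granville's article: it uses the common axial-coordinate approach with cube rounding for pointy-top hexagons, and the function names and the `size` parameter (distance from a hexagon's centre to a corner) are hypothetical.

```python
import math
from collections import Counter


def hex_bin(x, y, size=1.0):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round in cube coordinates (q + r + s == 0), then recompute whichever
    # component picked up the largest rounding error.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_center(q, r, size=1.0):
    """Return the (x, y) centre of the hexagon with axial index (q, r)."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r


def bin_counts(points, size=1.0):
    """Count how many points fall into each hexagon."""
    return Counter(hex_bin(x, y, size) for x, y in points)


counts = bin_counts([(0.0, 0.0), (0.1, 0.1), (1.8, 0.0)])
```

The counts per hexagon are what a hexbin chart then encodes as colour or size.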

Every time I see one of these, I think of old-timey strategy war games.


Choosing Colors for Visuals

Lewis Chou has some advice for choosing color schemes for data visualization:

When making a chart, we should use the same color scheme for the same metrics. And we need to avoid the excessive color interference to the user.

For example, when we do sales analysis, we usually analyze the indicators of sales and payment collection. Then, when we do data visualization analysis of different dimensions for the same indicator, we recommend using the same color system for sales and payment collection. It means that the sales amount can be indicated by the yellow-green color, and the return amount can be indicated by the blue color accordingly. After following the principle of consistency of indicator color, the user can quickly understand the meaning of the indicator expressed by the current data visualization chart according to the color distinction.

Color is a pre-attentive attribute: we subconsciously pay attention to it before we consciously observe it. That has advantages, but it also comes with responsibilities.


Comparing Sparklines

Lisa Charlotte Rost takes us through sparklines:

Sparklines are curious things. They’re supposed to show a trend, and a trend only. They’re supposed to show when something (like stocks) increases and decreases, where the peaks and the valleys are. But sparklines are not supposed to be comparable with each other.

So when you’re seeing two sparklines with the same height, the ebbs and flows of the first one could play out between 0 and 10 (e.g. US-Dollar), while the other sparkline’s peak is at 10,000.

But that’s odd, no? Doesn’t that invite people to make totally false assumptions?

I like sparklines a lot, but I’m apt to violate this particular rule and make them cross-comparable unless I know people will never care about comparisons between elements. One way to get around the “what if the range is big?” problem is to scale sparkline heights logarithmically, so that 1000 is a bit bigger than 100, which is a bit bigger than 10. My argument for doing that is that you still see size differences, and because sparkline comparisons are imprecise to begin with, magnitudes matter more than exact values.
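
As a rough sketch of the log-scaling idea in Python (the function name `log_heights` and the `max_height` parameter are hypothetical, and the sketch assumes every peak value is greater than 1):

```python
import math


def log_heights(peaks, max_height=20.0):
    """Scale each sparkline's height by log10 of its peak value.

    The sparkline with the largest peak gets max_height; the others
    shrink by order of magnitude. Assumes every peak is greater than 1.
    """
    logs = [math.log10(peak) for peak in peaks]
    top = max(logs)
    return [max_height * value / top for value in logs]


heights = log_heights([10, 100, 1000, 10000])
```

With this scaling, a sparkline peaking at 10,000 is four times as tall as one peaking at 10, rather than a thousand times as tall.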


Creating R Visuals in Power BI

Dave Mason takes us through showing an R-based visual in Power BI:

The R engine isn’t included with the installation of Power BI desktop. I won’t go into detail on this, so just know you’d need to install that separately. I had already installed the R component as part of Machine Learning Services for SQL Server 2017. I also had RStudio installed. Within Power BI desktop, take a moment to click File | Options and settings | Options to open the Options page. Then click R scripting in the list of Global Options. Here you’ll see options to set the R home directory and the desired R IDE.

Click through for the demo.


Making Corporate Color Palettes Palatable

Meagan Longoria takes us through a corporate coloring problem:

Last week, I had a conversation on twitter about dealing with corporate color palettes that don’t work well for data visualization. Usually, this happens because corporate palettes are designed with websites and/or marketing collateral in mind rather than information graphic design. This often results in colors being too bright, dark, or dull to be used together in a report. Sometimes the colors aren’t easily distinguishable from each other. Other times, the colors needed for various situations (main color, ancillary colors, highlight color, error color, KPIs, text, borders) aren’t available in the corporate palette.

You can still stay on brand and create a consistent user experience with a color palette optimized for data visualization. But you may not be using the exact hex values as defined in the corporate palette. I like to say the data viz color palette is “inspired by” the marketing color palette.

Click through for lots of goodies, including a link to a really interesting color tester.
