
Category: Visualization

An Intro to k-Means Clustering

Holger von Jouanne-Diedrich takes us through an example of how k-means clustering works:

The guiding principles are:

– The distance between data points within clusters should be as small as possible.
– The distance of the centroids (= centres of the clusters) should be as big as possible.

Because there are too many possible combinations of all possible clusters comprising all possible data points, k-means follows an iterative approach.

Click through for a demonstration. I appreciate the visualizations of the intermediate steps as well, because they give you an intuitive understanding of what the one-liner function is really doing.
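To make the iterative approach from the excerpt concrete, here is a minimal sketch in plain Python. It is not the code from the linked post; the function names, the random initialization, and the sample points are all assumptions made for illustration. Each pass assigns every point to its nearest centroid, then moves each centroid to the mean of its assigned points, and stops once the centroids no longer move.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def mean_point(members):
    """Coordinate-wise mean of a non-empty list of (x, y) points."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k of the data points at random.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # Nothing moved, so the assignments are stable.
        centroids = new_centroids
    return centroids, labels


# Hypothetical data: two visibly separate groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

Recording `centroids` and `labels` on every pass instead of only at the end would give you the data behind the kind of intermediate-step plots mentioned above.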


Easy Navigation with Power BI

Marc Lelijveld has started a series on storytelling with Power BI. Part one is all about navigation:

Providing an easy navigation is important for the usability of your report. In order to make it as intuitive as possible, you should think as an end-user. By opening a report, where do you expect the navigation to be?

If you open the first webpage you can think of, most likely you will find the navigation on top or on the left side. Which is totally reasonable (at least according to my opinion) since we read from left to right and from top to bottom.

By default in Power BI you will have the navigation on the bottom where you can switch between your report pages. But we just concluded that it is more intuitive to put your navigation on top or on the left side. So why not do the same with your Power BI reports? We can do this by creating our own navigation and bookmarks for that!

Read the whole thing.


Using the Power BI Visual Header Tooltip

Prathy Kamasani gives us several good uses of the Power BI visual header tooltip:

When we look at data journalism posts, most of the time they have annotations explaining what the visual is showing or talking about measures. Again, most of these data stories are made for paper. But in the digital world, we do see these annotations more interactively. It is nice to have this kind of lil annotation for everyday reporting as well, and the Tooltip Icon can be used for that purpose. Another thing is using canvas space wisely; it is important, and having this kind of hint helps us save canvas space.

Click through for instructions on how to enable this as well as smart ways to use it.


Equidistant Points and Missing Data in Excel

Stephanie Evergreen shows how you can bring in missing data points in Excel to ensure the axis is accurate:

Excel automatically spaces your intervals and labels equidistant from one another but it is assuming that your intervals actually are equidistant. In this graph, that’s not the case. We are missing the months of March, April, July, and August, when either no one was enrolled in the study or we have some missing data. But we can’t just gloss over those months. It isn’t truthful and it distorts the data display.

Click through for the solution.
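The underlying idea also applies outside of Excel: before plotting, give every month in the range a slot, even if its value is empty. The sketch below is not Stephanie's Excel technique; the function name, the year, and the enrollment numbers are hypothetical, and only the pattern of missing months (March, April, July, and August) comes from the excerpt.

```python
from datetime import date


def fill_missing_months(observed):
    """Return (month, value) pairs for every month between the first and
    last observed month, using None where no value was recorded.

    `observed` maps date(year, month, 1) to a value.
    """
    start, end = min(observed), max(observed)
    filled = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        key = date(year, month, 1)
        filled.append((key, observed.get(key)))  # None marks a gap
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled


# Hypothetical enrollment counts with the gaps described in the excerpt.
enrollment = {
    date(2019, 1, 1): 12,
    date(2019, 2, 1): 15,
    date(2019, 5, 1): 9,
    date(2019, 6, 1): 11,
    date(2019, 9, 1): 14,
}
series = fill_missing_months(enrollment)
```

Plotting `series` rather than `enrollment` keeps the months evenly spaced on the axis and leaves visible gaps where there is no data.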


Bar Chart Presentation Options

Andy Kirk gives us five techniques for gussying up bar charts:

“Bar charts are boring”, say many people. “How can we make them more attractive”, say many desperate clients. Bar charts are ubiquitous because they are the reliable and trusted lieutenants often relied upon to show the always-common quantitative comparisons across different categories. Their frequent use can induce ‘boredom’ through the familiar but, in particular, accusations of inelegance can be raised with the default tool styles many creators lean on.

The five charts below just offer some different ways you might style them, through variations in the use of functional colour properties, chart apparatus and layout decisions, in particular. The reasons why you would choose to use any of these methods are varied and especially contextually dependent, based on matters like space to work in, range of quantitative values, size of category labels, number of bars, importance of precise readability. The charts are all showing the all-time top 10 most streamed songs on Spotify, as of July 2019, with data from wikipedia.

Read on for the five options.


Chart Confusion with Labels

Mike Cisneros shows us an example where unexpected label values can throw off your readers:

The internet immediately latched onto the seemingly absurd collection of months portrayed in this chart. The bill, dating from June of 2019, included 13 prior months of usage from as early as August of 2016, as recently as March of 2019, and in a random order.

Soon, our non-U.S.-based friends pointed out that the dates made even less sense to them, as (of course) their convention is not to show dates in MM/YY format, but in YY/MM format.

And with this, the truth of the matter became obvious: the dates were in neither MM/YY format nor YY/MM format; they were in MM/DD format, and excluded labeling the year entirely. 

Even small things can make a difference in your ability to get the message across to users.
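A quick way to see why readers were misled is to check how many of the three formats a two-number label could plausibly belong to. This is a minimal sketch with a hypothetical helper; the sample labels are inferred from the months named in the excerpt (August 2016 and March 2019), not copied from the chart.

```python
import calendar


def plausible_readings(label):
    """List the date formats under which a two-number label is valid."""
    first, second = (int(part) for part in label.split("/"))
    readings = []
    if 1 <= first <= 12:
        readings.append("MM/YY")   # any two-digit year is acceptable
    if 1 <= second <= 12:
        readings.append("YY/MM")
    if 1 <= first <= 12:
        # Use a leap year so that 02/29 counts as a valid month/day.
        days_in_month = calendar.monthrange(2000, first)[1]
        if 1 <= second <= days_in_month:
            readings.append("MM/DD")
    return readings


for label in ["08/16", "03/19", "06/05"]:
    print(label, plausible_readings(label))
```

Any label that comes back with more than one reading needs a year, or an unambiguous format, spelled out on the chart.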


Drawing SSIS Packages as SVGs

Bartosz Ratajczyk continues a series on taking SSIS packages and generating SVGs from their control flows:

To make things harder, the layout of the sequences and tasks is not some nested XML structure. All of the elements have the same parent – <GraphLayout>, meaning all of them are at the same tree level. Also – there is no attribute showing where a particular object belongs. Almost. In the example with the sequences, I see two regularities:
– the outer container is placed later in the XML, than the inner container
– the @Id attributes show the nesting of the objects

I’m not sure how often I’d use this in practice, but if you want to understand some of the internals of SSIS, this is an interesting series to follow.


The Power of Hexagonal Binning

Capri Granville explains hexagonal binning to us and gives a few examples:

The reason for using hexagons is that it is still pretty simple, and when you rotate the chart by 60 degrees (or a multiple of 60 degrees) you still get the same visualization. For squares, rotations of 60 degrees don’t work, only multiples of 90 degrees work. Is it possible to find a tessellation such that smaller rotations, say 45 or 30 degrees, leave the chart unchanged? The answer is no. Octagonal tessellations don’t really exist, so the hexagon is an optimum.

Every time I see one of these, I think of old-timey strategy war games.
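For a feel of what the binning step involves, here is a minimal sketch in plain Python. It is not taken from the linked article. It assumes "pointy-top" hexagons addressed by axial (q, r) coordinates, which is a common convention in hex-grid code, and the function names and sample points are hypothetical. Each point is converted to fractional hex coordinates, rounded to the nearest cell, and counted.

```python
import math
from collections import Counter


def hex_cell(x, y, size=1.0):
    """Axial (q, r) coordinates of the pointy-top hexagon containing (x, y).

    `size` is the distance from a hexagon's center to one of its corners.
    """
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round in cube coordinates, where the three axes must sum to zero.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    # Recompute the axis with the largest rounding error from the other two.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hexbin_counts(points, size=1.0):
    """Count how many points fall into each hexagonal cell."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Hypothetical points: two near the origin, one a full cell to the right.
counts = hexbin_counts([(0.0, 0.0), (0.1, 0.1), (math.sqrt(3), 0.0)])
```

Coloring each cell by its count gives the familiar hexbin chart. Plotting libraries that offer hexagonal binning handle this step for you, so a sketch like this is mostly useful for understanding what they do.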


Choosing Colors for Visuals

Lewis Chou has some advice for choosing color schemes for data visualization:

When making a chart, we should use the same color scheme for the same metrics. And we need to avoid the excessive color interference to the user.

For example, when we do sales analysis, we usually analyze the indicators of sales and payment collection. Then, when we do data visualization analysis of different dimensions for the same indicator, we recommend using the same color system for sales and payment collection. It means that the sales amount can be indicated by the yellow-green color, and the return amount can be indicated by the blue color accordingly. After following the principle of consistency of indicator color, the user can quickly understand the meaning of the indicator expressed by the current data visualization chart according to the color distinction.

Color is a pre-attentive attribute: we sub-consciously pay attention to it before we consciously observe it. That has advantages but it also comes with responsibilities.
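One lightweight way to enforce that consistency is to keep a single metric-to-color lookup that every chart reads from. This is a minimal sketch: the metric names and the exact hex codes are assumptions, chosen only to match the yellow-green and blue pairing mentioned in the excerpt.

```python
# Single source of truth for metric colors (hypothetical names and values).
METRIC_COLORS = {
    "sales_amount": "#9ACD32",   # yellow-green
    "return_amount": "#0000FF",  # blue
}


def color_for(metric):
    """Return the agreed color for a metric, failing loudly if none is set."""
    if metric not in METRIC_COLORS:
        raise KeyError(f"No color assigned to metric {metric!r}")
    return METRIC_COLORS[metric]


# Every chart asks for the color by metric name instead of hard-coding it.
bar_color = color_for("sales_amount")
line_color = color_for("sales_amount")
```

Raising an error for an unknown metric, rather than falling back to a default color, keeps a new metric from quietly borrowing a color that already means something else.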
