Category: Visualization

Building A Spinning Globe With R

James Cheshire shows how to use R to create an image of a spinning globe:

It has been a long-held dream of mine to create a spinning globe using nothing but R (I wish I was joking, but I’m not). Thanks to the brilliant mapmate package created by Matt Leonawicz and shed loads of computing power, today that dream became a reality. The globe below took 19 hours and 30 processors to produce from relatively low-resolution NASA Black Marble data, and so I accept R is not the best software to be using for this – but it’s amazing that you can do this in R at all!

Now all that is missing is a giant TV and an evil lair.

Pretty R Plots

Simon Jackson has a couple of posts on how to use ggplot2 to make graphs prettier. First, histograms:

Time to jazz it up with colour! The method I’ll present was motivated by my answer to this StackOverflow question.

We can add colour by exploiting the way that ggplot2 stacks colour for different groups. Specifically, we fill the bars with the same variable (x) but cut into multiple categories:
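As a minimal sketch of the idea described above (the sample data, the seed, and the choice of 100 intervals are assumptions for illustration, not necessarily what Simon uses), the bars can be filled with the x variable itself after passing it through `cut()`:

```r
library(ggplot2)

# Hypothetical sample data: 5,000 draws from a standard normal distribution
set.seed(1)
d <- data.frame(x = rnorm(5000))

# Fill each bar using the same variable that is on the x axis, cut into
# 100 intervals, so the fill colour changes gradually along the axis
p <- ggplot(d, aes(x, fill = cut(x, 100))) +
  geom_histogram(show.legend = FALSE)  # hide the 100-entry legend
p
```

Because ggplot2 stacks the groups within each bin, the overall outline of the histogram stays the same; only the fill changes.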

Then he follows up with scatter plots:

Shape and size

There are many ways to tweak the shape and size of the points. Here’s the combination I settled on for this post:
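For a rough idea of what such a combination can look like, here is a sketch using standard `geom_point()` arguments. The data and the specific shape, size, and alpha values are assumptions for illustration and may differ from the ones chosen in the post:

```r
library(ggplot2)

# Hypothetical data: two correlated numeric variables
set.seed(1)
d <- data.frame(x = rnorm(500))
d$y <- d$x + rnorm(500)

# shape = 16 is a solid circle; a larger size plus partial transparency
# (alpha) keeps overlapping points distinguishable
p <- ggplot(d, aes(x, y)) +
  geom_point(shape = 16, size = 5, alpha = 0.4)
p
```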

There are some nice tricks here around transparency, color scheme, and gradients, making it a great series. As a quick note, the color scheme in the histogram post's headline image does not work at all for people with red-green color-blindness. Running the page's URL through a color filter like Toptal’s is quite helpful in discovering these sorts of issues.

Narrative Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Narrative Power BI Custom Visual. The Narrative visual is developed by Narrative Science and it gives you the ability to automatically deliver analysis of your data. The results look similar to a final report that an analyst might provide after spending weeks with your data.

It’s not going to replace a seasoned analyst (or any analyst at all, frankly) but if you need a couple paragraphs of text summing up a trend, it’s a good start.

Outliers In Histograms

Edwin Thoen has an interesting solution to a classic problem with histograms:

Two strategies that make the above into something more interpretable are taking the logarithm of the variable, or omitting the outliers. Neither shows the original distribution, however. Another way to go is to create one bin for all the outlier values. This way we would see the original distribution where the density is the highest, while at the same time getting a feel for the number of outliers. A quick and dirty implementation of this would be:

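library(dplyr)
library(ggplot2)

# Cap every value above 10 at 10 so all outliers land in a single final bin
hist_data %>%
  mutate(x_new = ifelse(x > 10, 10, x)) %>%
  ggplot(aes(x_new)) +
  geom_histogram(binwidth = .1, col = "black", fill = "cornflowerblue")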
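The snippet above assumes a data frame named hist_data with a numeric column x. A minimal sketch of data with that shape follows; the values are hypothetical and are not the data from Edwin's post. It also counts how many observations the cap affects, using `pmin()` as an equivalent way to write the same capping rule:

```r
# Hypothetical stand-in for hist_data: mostly small values plus a few outliers
set.seed(42)
hist_data <- data.frame(
  x = c(rexp(1000, rate = 1), runif(20, min = 15, max = 100))
)

# Same capping rule as above, written with pmin(): values above 10 become 10
x_new <- pmin(hist_data$x, 10)

# Number of observations that end up in the final, outlier bin
n_outliers <- sum(hist_data$x > 10)
n_outliers
```

Knowing this count up front makes it easier to judge whether a tall final bar reflects real outliers or just a threshold that was set too low.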

Edwin then shows a nicer solution, so read the whole thing.

Time Brush Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Time Brush Power BI Custom Visual. The Time Brush gives you the ability to both filter your report and see a graphical representation of your data at the same time. The name Time Brush comes from the behavior used when you select the values you’d like to filter.

The use of color is an interesting take on combining continuous data points with categorical representations of those points.

Ignoring SSAS Dynamic Formatting

Chris Webb shows that tools like Power BI ignore formatting in SCOPE statements:

What’s more (and this is a bit strange) if you look at the DAX queries that are generated by Power BI to get data from the cube, they now request a new column to get the format string for the measure even though that format string isn’t used. Since it makes the amount of data returned by the query much larger, this extra column can have a negative impact on query performance if you’re bringing back large amounts of data.

There is no way of avoiding this problem at the moment, unfortunately. If you need to display formatted values in Power BI you will have to create a calculated measure that returns the value of your original measure, set the format string property on that calculated measure appropriately, and use that calculated measure in your Power BI reports instead:

Click through for more details and a workaround.

Basics Of R Plotting

Aman Tsegai shows some basic ways to customize R’s plot function:

We’re going to be using the cars dataset that is built into R. To follow along with real code, here’s an interactive R Notebook. Feel free to copy it and play around with the code as you read along.

So if we were to simply plot the dataset using just the data as the only parameter, it’d look like this:

plot(dataset)

The plot function is great for cases where you don’t much care how the visual looks, and the simplicity is great for throwaway visuals.
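For a sense of what light customization looks like, here is a sketch that passes a few standard graphical arguments to `plot()`. It uses the built-in cars data frame directly, on the assumption that `dataset` in the excerpt holds the same data; the title, labels, and colour are illustrative choices, not taken from Aman's post:

```r
# cars ships with base R: 50 rows of speed (mph) and stopping distance (ft)
plot(
  cars,
  main = "Stopping distance by speed",  # plot title
  xlab = "Speed (mph)",                 # x-axis label
  ylab = "Stopping distance (ft)",      # y-axis label
  pch  = 19,                            # solid circles instead of open ones
  col  = "steelblue"                    # point colour
)
```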
