Category: Visualization

Publishable Adverse Event Tables in R

Inge Christoffer Olsen shows how to clean up tables in R for publication:

The summary of Adverse Events is a nice table just summing up the adverse events in the trial. Note the “[N] n (%)”-format which is the number of events, number of patients with events and percentage of patients with event.

This particular example is about adverse events, but the key concepts in the code apply to many kinds of tables you want to make look a bit nicer. H/T R-Bloggers
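
For anyone who has not met the “[N] n (%)” format before, here is a minimal Python sketch of the arithmetic behind one cell. It is not Inge's R code: the records, the arm size, and the `summarize` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical adverse event records as (patient_id, event_term) pairs.
# The data and the arm size below are invented for this sketch.
events = [
    ("P01", "Headache"),
    ("P01", "Headache"),
    ("P02", "Headache"),
    ("P03", "Nausea"),
]
n_patients = 5  # hypothetical number of patients in the treatment arm


def summarize(events, n_patients):
    """Return {event_term: "[N] n (%)"} for one treatment arm.

    N   = number of events
    n   = number of distinct patients with at least one such event
    (%) = n as a percentage of all patients in the arm
    """
    event_counts = Counter(term for _, term in events)
    patients_by_term = {}
    for patient, term in events:
        patients_by_term.setdefault(term, set()).add(patient)

    rows = {}
    for term, n_events in event_counts.items():
        n_with_event = len(patients_by_term[term])
        pct = 100 * n_with_event / n_patients
        rows[term] = f"[{n_events}] {n_with_event} ({pct:.0f}%)"
    return rows


summary = summarize(events, n_patients)
for term, cell in summary.items():
    print(f"{term}: {cell}")
```

The point of separating N from n is that one patient can have the same event several times, so the event count and the patient count can differ.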

Audio Analysis in R

Jeroen Ooms walks us through some audio analysis with R and the av package:

The latest version of the rOpenSci av package includes some useful new tools for working with audio data. We have added functions for reading, cutting, converting, transforming, and plotting audio data in any popular audio / video format (mp3, mkv, aac, etc).

The functionality can either be used by itself, or to prepare audio data for further analysis in R using other packages. We hope this clears an important hurdle to use R for research on speech, music, and whale mating calls.

One of the most interesting things I saw Edward Tufte demonstrate was visualizing music using the Music Animation Machine. There’s a lot of space here to experiment. H/T R-Bloggers.

Fun with Palindromic Dates

Tomaz Kastrun has a bit of fun with the date February 2, 2020:

As of writing this blog-post, today is February 2nd, 2020. Or as I would say it, 2nd of February, 2020. There is nothing magical about it, it is just a sequence of numbers. On a boring Sunday evening, what could be more thrilling than to look into this a little bit further 🙂

Let’s kick R Studio and start writing a lot of useless stuff.

Tomaz also compares US versus EU palindromic dates and visualizes the different distributions.
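
If you want to play along without R, here is a rough Python sketch of the US versus EU comparison. It assumes eight-digit date strings (MMDDYYYY for the US, DDMMYYYY for the EU) and a two-year window I picked arbitrarily; the function names are mine, not Tomaz's.

```python
from datetime import date, timedelta

# Assumed formats: eight digits with no separators.
EU_FORMAT = "%d%m%Y"  # day, month, year
US_FORMAT = "%m%d%Y"  # month, day, year


def is_palindrome(text):
    """True when the text reads the same forwards and backwards."""
    return text == text[::-1]


def palindromic_dates(start, end, fmt):
    """Return every date from start to end (inclusive) that is a palindrome in fmt."""
    found = []
    day = start
    while day <= end:
        if is_palindrome(day.strftime(fmt)):
            found.append(day)
        day += timedelta(days=1)
    return found


# Arbitrary two-year window chosen for this sketch.
eu = palindromic_dates(date(2020, 1, 1), date(2021, 12, 31), EU_FORMAT)
us = palindromic_dates(date(2020, 1, 1), date(2021, 12, 31), US_FORMAT)

print("EU:", [d.isoformat() for d in eu])
print("US:", [d.isoformat() for d in us])
```

February 2, 2020 shows up in both lists because the day and the month are the same two digits, so the ordering does not matter.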

Dealing with Big Ranges in a Graph

Alex Velez shows how we can work with a particular kind of problem:

Today’s post is about a common challenge: when one data series is so large relative to the others that a single scale makes it nearly impossible to see any details. Consider the following line graph. It displays state and local revenue by transportation mode, which I created using data from the Bureau of Transportation Statistics 2018 Report.

Alex has one solution. Another idea could be to change the Y axis to log scale, especially because you’re dealing with money. That would tighten up the series and allow for more information to be displayed on the single graph.
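
To make the log scale idea concrete, here is a small Python sketch comparing where each series would sit on a linear axis versus a log axis. The revenue figures and axis bounds are made up for illustration and are not the Bureau of Transportation Statistics numbers.

```python
import math

# Made-up revenue figures in dollars, chosen so one series dwarfs the rest.
revenue = {
    "Highways": 120_000_000_000,
    "Transit": 20_000_000_000,
    "Air": 2_000_000_000,
    "Water": 400_000_000,
}


def linear_position(value, top):
    """Fraction of the axis height on a linear axis running from 0 to top."""
    return value / top


def log_position(value, bottom, top):
    """Fraction of the axis height on a log10 axis running from bottom to top.

    All three arguments must be positive, since log scales cannot show zero
    or negative values.
    """
    span = math.log10(top) - math.log10(bottom)
    return (math.log10(value) - math.log10(bottom)) / span


top_linear = max(revenue.values())
linear = {mode: linear_position(v, top_linear) for mode, v in revenue.items()}

# Hypothetical log axis bounds: $100 million to $1 trillion.
log = {mode: log_position(v, 1e8, 1e12) for mode, v in revenue.items()}

for mode in revenue:
    print(f"{mode:9s} linear={linear[mode]:.3f}  log={log[mode]:.3f}")
```

On the linear axis the two smallest series are squashed into the bottom two percent of the chart, while on the log axis all four land at clearly separate heights. The trade-off is that readers have to remember each gridline is a multiple of the last one, not a fixed step.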

Building a Dual-Axis Line Chart in Power BI

Matt Allington shows how you can build a dual-axis line chart in Power BI:

Unfortunately, Power BI does not support a dual axis line chart as a standard visual at this time. The good news however is there is a custom visual called “Multiple Axes chart by xViz” that can do this in Power BI. This visual has been around for a while, but there have been some formatting issues (in my view) that prevented it being a solution to this problem (that is now fixed). I will demonstrate how to set up a dual axis chart using the Adventure Works database and this visual.

Honestly, I’m pretty happy that Power BI does not support a dual-axis line chart. It is the cause of so many instances of spurious correlation that I’d err on the side of not including multiple axes.

Displaying SSRS Usage Stats Through Grafana

Alessandro Alpi takes queries against SQL Server Reporting Services data and visualizes the results in Grafana:

One of the problems that often occurs in our organization, as well as at some of our customers, is getting immediate feedback about usage statistics of reports. Usually, the request of creating reports is out of control and some of them are executed only “that time” and not anymore. In the worst-case scenario, many of them aren’t executed at all and some of them could become even overlapped or duplicated.

Therefore, it is important to know the usage statistics, user by user and report by report, to make the reader aware of them and let him interpret the values of the same query in multiple ways and graphical layouts. While this is not possible with a tabular format (unless you export the values using any external tools such as Excel) it is simpler when it comes to a dashboard.

And that’s where Grafana excels.

Using Trellis Charts to Display Small Multiples Over Time

Mike Cisneros shows us the evolution of three-point shooting in the NBA using a trellis chart:

This small multiple chart shows two variables for each team in the league for each of the last 30 seasons: on the x-axis, the number of 3-pointers attempted per game; on the y-axis, the percent of attempted 3-point shots that were successful. Each point is a single team in a single season. The individual panels step you forward in time as the data changes and evolves. They help you see how the pack of all NBA teams is inexorably moving towards more and more 3-point attempts per game (the data points shift rightwards as you progress through the frames). We can also see that there are no longer any teams with sub-30% shooting percentages on those attempts (illustrated by tighter clustering upwards as you move forward in time).

This is a good way of showing movement over time in a static medium, like a printed page. If you’re giving a presentation, this would probably be a bubble chart with a play axis.

Using the Power BI Color Picker

David Eldersveld walks us through the Power BI color picker:

The new color picker allows colors in RGB format in addition to the hex color format that Power BI has used exclusively until now.

The new one also easily allows users to choose from a wider selection of shades and tones. This builds upon the simpler selection of hues and tints in the original.

In case you don’t know what David means, there is an excellent explanation of each term.
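
As for the RGB and hex formats mentioned in the first quote, they are two spellings of the same three numbers. Here is a quick Python sketch of the conversion, assuming six-digit hex codes; the helper names are mine and have nothing to do with Power BI itself.

```python
def hex_to_rgb(hex_color):
    """Convert a six-digit hex color such as "#FF8C00" to an (r, g, b) tuple."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected six hex digits, got {hex_color!r}")
    # Each pair of hex digits is one channel in the range 0 to 255.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(r, g, b):
    """Convert red, green, and blue channels (0 to 255) to a hex color string."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel out of range: {channel}")
    return f"#{r:02X}{g:02X}{b:02X}"


print(hex_to_rgb("#FF8C00"))
print(rgb_to_hex(255, 140, 0))
```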

Data Visualization in R and Python

Michelle Golchert contrasts libraries for visualizing data in R and Python:

Unlike R, Python – as a “general-purpose” programming language – does not include data visualization tools by default. However, Python also provides many libraries for this purpose, such as Matplotlib and Seaborn.

Python now also offers numerous packages (like plotnine and ggpy) which are equivalents of ggplot2 in R, and allow you to create plots in Python according to the same “Grammar of Graphics” principle.

This is an area where I think R has the upper hand at most levels: it’s easier to get started plotting with R (thanks to the built-in plots), it’s easier to do “intermediate-quality” plots (stuff you would use in an internal presentation), and you tend to have more control when building professional-quality plots. You can certainly create beautiful visuals in both languages, though.

Conditional Formatting Line and Area Charts with Power BI

Soheil Bakhshi shows how we can conditionally format line and area charts with Power BI:

One of my customers asked me to show time series in line charts and area charts. But she wants it to be conditionally formatted based on the average value over time. Let’s keep it simple, she wants to show “Sales by Year Month” in line chart, but, highlight the data points that are below “Average Sales per Year Month”. As you may know, we currently do not have the luxury of formatting line charts and area charts. But wait, this post is all about that. Let’s dig into it.

From the above scenario, you perhaps already guessed that we need to create a measure which defines the colour based on “Average Sales per Year Month” to be able to format the chart conditionally. If any data point is below the “Average Sales per Year Month” then we highlight it in Orange, if it is above the “Average Sales per Year Month” then we stick to the default colour.

Let’s do it.

This is definitely not straightforward, but once you see the process, it’s pretty neat.
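
The rule behind the colour measure is simple even if wiring it into the chart is not. Here is a Python sketch of just the logic, not the DAX from the post: the sales figures and the default colour are made up, and treating a value exactly equal to the average as "not below" is my assumption, since the quote does not cover that case.

```python
from statistics import mean

ORANGE = "#FFA500"
DEFAULT_COLOUR = "#118DFF"  # hypothetical stand-in for the chart's default colour

# Made-up "Sales by Year Month" values for illustration.
sales_by_year_month = {
    "2019-01": 120,
    "2019-02": 80,
    "2019-03": 100,
    "2019-04": 140,
}


def colour_for(value, average):
    """Orange for points below the average, the default colour otherwise.

    A value exactly equal to the average keeps the default colour; that
    tie-breaking choice is an assumption of this sketch.
    """
    return ORANGE if value < average else DEFAULT_COLOUR


average_sales = mean(sales_by_year_month.values())
colours = {
    month: colour_for(value, average_sales)
    for month, value in sales_by_year_month.items()
}

for month, colour in colours.items():
    print(month, sales_by_year_month[month], colour)
```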
