Category: Visualization

Plotting SVM Decision Boundaries in R

Steven Sanderson goes right up to the edge:

Support Vector Machines (SVM) are a powerful tool in the world of machine learning and classification. They excel in finding the optimal decision boundary between different classes of data. However, understanding and visualizing these decision boundaries can be a bit tricky. In this blog post, we’ll explore how to plot an SVM object using the e1071 library in R, making it easier to grasp the magic happening under the hood.

Read on to see how you can perform this analysis as well.

Appropriate Uses of Jitter in Graphs

Steven Sanderson shakes things up:

As an R programmer, one of the most useful functions to know is the jitter function. The jitter function is used to add random noise to a numeric vector, which can be helpful when visualizing data in a scatterplot. By using the jitter function, we can get a better picture of the true underlying relationship between two variables in a dataset.

Read on to get an idea of how to use jitter. I do recommend making it very clear to chart viewers that you are, in fact, using jitter, because it is easy to misread jittered points as actual value locations.
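
To make the idea concrete, here is a minimal sketch of jitter in Python rather than R. It is not the code from the linked post, and it does not reproduce the defaults of R's jitter function: the `jitter` helper, its `amount` parameter, and the `ratings` data are all made up for illustration.

```python
import random


def jitter(values, amount=0.2, seed=None):
    """Return a copy of `values` with uniform noise in [-amount, amount] added.

    Hypothetical helper for illustration; `amount` is an assumed parameter name.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]


# Made-up discrete ratings that would overplot heavily in a scatterplot.
ratings = [1, 1, 1, 2, 2, 3, 3, 3, 3]
jittered = jitter(ratings, amount=0.2, seed=42)

# Show original and shifted values side by side, so the noise is explicit.
for original, shifted in zip(ratings, jittered):
    print(f"{original} -> {shifted:.3f}")
```

Printing the original next to the shifted value is the same disclosure I would want on the chart itself: the plotted positions are no longer the recorded values.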

Kernel Density Plots in R

Steven Sanderson explains one common type of plot in R:

Kernel Density Plots are a type of plot that displays the distribution of values in a dataset using one continuous curve. They are similar to histograms, but they are even better at displaying the shape of a distribution since they aren’t affected by the number of bins used in the histogram. In this blog post, we will discuss what Kernel Density Plots are in simple terms, what they are useful for, and show several examples using both base R and ggplot2.

Read on to learn more, including how to generate these in base R, ggplot2, and with the tidy_density package.
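
For a sense of what sits behind the curve, here is a rough Python sketch of a Gaussian kernel density estimate. It is not the code from the linked post; `gaussian_kde`, the `bandwidth` value, and the sample are assumptions chosen for illustration. One thing it makes visible is that, although there are no bins, the shape of the curve still depends on the bandwidth you pick.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Build a density function that averages one Gaussian bump per observation.

    Hypothetical helper for illustration; the bandwidth is supplied by the caller.
    """
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Made-up measurements, not taken from any real dataset.
sample = [4.9, 5.0, 5.1, 5.4, 5.8, 6.1, 6.3, 6.4, 6.9, 7.2]
density = gaussian_kde(sample, bandwidth=0.4)

# Evaluate the curve on a grid; these (x, y) pairs are what a plot would draw.
grid = [4.0 + 0.1 * i for i in range(41)]
curve = [density(x) for x in grid]
for x, y in zip(grid[::10], curve[::10]):
    print(f"x={x:.1f} density={y:.3f}")
```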

The Value of KPIs and Cards in Power BI

Kurt Buhler and Stepan Resl give you a card:

When a user arrives at your report, they should be able to answer their most important questions in a few seconds. To do this, we typically put the most critical information in the top-left of the report (where we often look first). This information should provide a high-level overview, whereas additional details should be placed at the bottom of the report, behind interactions, or on later pages.

An effective and popular way to call attention to important numbers in Power BI is by using cards and KPI core visuals.

Read on for several examples and a breakdown of how they work best.

Building Correlation Heatmaps in R

Steven Sanderson shows two packages for building heatmaps in R:

Data visualization is a powerful tool for understanding the relationships between variables in a dataset. One of the most common and insightful ways to visualize correlations is through heatmaps. In this blog post, we’ll dive into the world of correlation heatmaps using R, using the mtcars and iris datasets as examples. By the end of this post, you’ll be equipped to create informative correlation heatmaps on your own.

Read on to see how to build heatmaps with the corrplot and ggcorrplot packages.
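
The numbers behind a correlation heatmap are simply a matrix of pairwise correlations. Here is a minimal Python sketch of that step, not the R code from the linked post. The `pearson` and `correlation_matrix` helpers and the column values are made up for illustration and are not taken from mtcars or iris.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Map every pair of column names to its correlation (hypothetical helper)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up columns for illustration only.
columns = {
    "weight": [2.1, 2.6, 3.0, 3.4, 3.9],
    "mpg": [30.1, 26.4, 22.8, 19.7, 16.2],
    "hp": [70, 95, 110, 150, 180],
}
matrix = correlation_matrix(columns)

# Print the matrix as a text grid; a heatmap colors these same cells.
print(" " * 8 + "".join(f"{name:>8}" for name in columns))
for row_name, row in matrix.items():
    print(f"{row_name:>8}" + "".join(f"{value:8.2f}" for value in row.values()))
```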

Visualizing when Lower is Better

Alex Velez inverts a common experience:

When quickly scanning, I wonder why the direct and indirect sales teams underperformed in 2022. Mostly, they fell below the goal of 90 days, exceeding their target only three times. 

Now, pausing to think more critically about the context of this scenario, I realize I’ve misread the graph—specifically the goal line. Targets and goals are often seen as minimum thresholds, not maximum limits. But in the sales industry, the goal is to close a deal as quickly as possible. In this visual, below the goal line is actually a good thing!

This graph challenges my standard construct of targets and goals, which could lead to confusion or, worse, the wrong conclusions if I’m not careful. 

Read on for five alternative ways to display this graph and (hopefully) reduce confusion.

Setting Table and Matrix Column Widths in Power BI

Kurt Buhler controls the horizontal, Kurt Buhler controls the vertical:

One challenge of the table and matrix visuals in Power BI is that it’s difficult to precisely and consistently set column widths. Unlike in Excel, where you can set the row and column widths in a spreadsheet, you have no option in the visual interface to control the column width property. However, it’s still possible to control it in the report metadata, which is exposed in the officially supported Power BI Projects format (.pbip) which is in preview. Notably, however, opening and modifying report metadata from this format isn’t yet supported. Despite that fact, it still works reliably, so I thought I’d demonstrate how to do this.

There are a fair number of steps involved but it all makes sense in the end.

Plotting Multiple Histograms in R

Steven Sanderson shows us two libraries to plot two histograms:

Histograms are a powerful tool for visualizing the distribution of numerical data. They allow us to quickly understand the frequency distribution of values within a dataset. In this tutorial, we’ll explore how to create multiple histograms using two popular R packages: base R and ggplot2. By the end of this guide, you’ll be able to confidently display multiple histograms on a single graph using both methods.

Click through for more than two examples.
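
When putting multiple histograms on one graph, it helps to count every group against the same breaks so that the bars line up. Here is a small Python sketch of that idea, not the R code from the linked post. The `histogram_counts` helper, the breaks, and both groups are made up, and the bin-edge convention used here (closed on the left, with the last bin also closed on the right) is an assumption that may not match your plotting tool.

```python
def histogram_counts(values, breaks):
    """Count values per bin, where bin i covers [breaks[i], breaks[i + 1]).

    The last bin also includes its right edge. Hypothetical helper for
    illustration; values outside the breaks are ignored.
    """
    counts = [0] * (len(breaks) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = breaks[i] <= v < breaks[i + 1]
            on_final_edge = i == last and v == breaks[-1]
            if in_bin or on_final_edge:
                counts[i] += 1
                break
    return counts


# Made-up groups and shared breaks.
breaks = [1, 2, 3, 4, 5]
group_a = [1.2, 1.8, 2.1, 2.5, 3.3, 3.9]
group_b = [2.2, 2.9, 3.1, 3.6, 4.4, 5.0]

counts_a = histogram_counts(group_a, breaks)
counts_b = histogram_counts(group_b, breaks)

# Text rendering: one row per bin, one bar per group.
for i, (a, b) in enumerate(zip(counts_a, counts_b)):
    print(f"[{breaks[i]}, {breaks[i + 1]}) A: {'#' * a:<4} B: {'#' * b}")
```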

Visualizing Univariate Data Distributions in R

Steven Sanderson reviews the shape of the data:

Understanding the distribution of your data is a fundamental step in any data analysis process. It gives you insights into the spread, central tendency, and overall shape of your data. In this blog post, we’ll explore two popular functions in R for visualizing data distribution: density() and hist(). We’ll use the classic Iris dataset for our examples. Additionally, we will introduce the {TidyDensity} library and show how it can be used to create distribution plots.

Click through for three different functions for visualizing the density of a variable.

Adding Mean to Box Plots in R

Steven Sanderson tracks the sixth number of a five-number summary:

Data visualization is a powerful tool for understanding and interpreting data. In this blog post, we will explore how to create box plots with mean values using both base R and ggplot2. We will use the famous iris dataset as an example. So, grab your coding tools and let’s dive into the world of box plots!

Note that these visuals show the mean in addition to the median, not in place of it.
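
As a rough illustration of why the extra number can be worth showing, here is a Python sketch that computes the usual box plot statistics plus the mean. It is not the R code from the linked post. The `box_stats_with_mean` helper and the data are made up, and the quartiles come from Python's `statistics.quantiles` with the inclusive method, which may differ slightly from the hinges your plotting tool draws.

```python
import statistics


def box_stats_with_mean(values):
    """Return min, quartiles, and max, plus the mean as a sixth number.

    Hypothetical helper for illustration; quartile conventions vary by tool.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.fmean(values),
    }


# Made-up data with one large outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
summary = box_stats_with_mean(values)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

With this skewed sample the mean lands above the third quartile while the median stays in the middle of the box, which is the kind of gap a box plot with the mean overlaid makes visible.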
