
Category: R

Managing Plot Parameters in R

Steven Sanderson switches up a visual:

When it comes to data visualization in R, the par() function is an indispensable tool that often goes overlooked. This function allows you to control various graphical parameters, unleashing a world of customization possibilities for your plots. In this blog post, we’ll demystify the par() function, break down its syntax, and provide you with hands-on examples to help you create stunning visualizations.

Click through to check it out. My loyalties lie with ggplot2 for static visual development in R, but it's not the only way to get images to look the way you want them.
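
For a quick taste of what par() does, here is a minimal sketch in base R. The dataset (mtcars) and the specific parameter values are my own picks for illustration, not taken from the linked post.

```r
# Save the old settings so we can restore them afterward;
# par() returns the previous values of whatever it changes.
op <- par(mfrow = c(1, 2),        # one row, two columns of plots
          mar = c(4, 4, 2, 1))    # margins: bottom, left, top, right (in lines)

plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "MPG", main = "Scatter")
hist(mtcars$mpg, xlab = "MPG", main = "Histogram")

# Put the graphical parameters back the way we found them.
par(op)
```

Saving and restoring the old values is a habit worth keeping, since par() changes persist for the life of the graphics device.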


Adding Text to a Plot in R

Steven Sanderson texts up a plot:

As a programmer, you’re well aware of the importance of data visualization. A well-crafted plot can convey complex information with clarity and impact. In R, creating stunning plots is a breeze, especially when you’re armed with the versatile text() function. This little gem allows you to add custom text to your plots, enabling you to annotate and highlight essential details. Let’s dive into the world of text() and uncover its syntax and potential through some hands-on examples.

I’m also a big fan of geom_text_repel() from the ggrepel package, which extends ggplot2. It is by no means perfect, but it does a good job of keeping labels from overlapping important visual features like plotted lines.
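
As a rough illustration of the base R approach, this sketch labels a few points with text(). The choice of mtcars and the "heavier than 5,000 lbs" cutoff are assumptions of mine, not examples from the linked post.

```r
# Scatter plot of weight against fuel efficiency.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "MPG",
     xlim = c(1, 7))              # leave room on the right for labels

# Pick out the heaviest cars to annotate.
heavy <- mtcars[mtcars$wt > 5, ]

# text() draws each label at the given (x, y) coordinates.
# pos = 4 places the label to the right of the point; cex shrinks the font.
text(heavy$wt, heavy$mpg, labels = rownames(heavy), pos = 4, cex = 0.7)
```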


Learning about Data in R with str()

Steven Sanderson explains the value of the str() function:

In a nutshell, str() stands for “structure” and offers a concise summary of the structure of an R object. It presents essential details about the object, including its data type, dimensions, and the first few values. By providing an overview of your data, str() allows you to grasp the fundamentals at a glance and proceed with a clearer understanding of what you’re working with.

str() is a really useful function, and people who thoughtfully design objects in R can pack a lot of useful information into that one call.
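
Here is a small sketch of str() on two kinds of objects. The nested list is a made-up example of mine; iris ships with R.

```r
# On a data frame, str() reports the class, the dimensions,
# and the type plus first few values of each column.
str(iris)

# On a nested list, it shows the structure level by level.
# This list is hypothetical, just to show the nesting.
settings <- list(
  name    = "demo",
  sizes   = c(10L, 20L, 30L),
  options = list(verbose = TRUE, tol = 1e-6)
)
str(settings)

# str() prints and returns nothing useful, so capture the text
# with capture.output() if you need it as a character vector.
iris_summary <- capture.output(str(iris))
```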


Simplifying Nested Lists and Vectors in R

Steven Sanderson simplifies things:

Today, we’re diving deep into the incredible world of R programming to explore the often-overlooked but extremely handy unlist() function. If you’ve ever found yourself dealing with complex nested lists or vectors, this little gem can be a lifesaver. The unlist() function is like a magician that simplifies your data structures, making them more manageable and easier to work with. Let’s unlock its magic together!

Click through to see how it works, including explanation and examples.
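
To give a sense of the behavior, here is a minimal sketch with a nested list I made up for illustration. Note the second part: because the result is an atomic vector, mixed types get coerced to a common type.

```r
# A hypothetical nested list.
nested <- list(a = 1:3, b = list(c = 4, d = list(e = 5)))

# unlist() flattens every level into a single atomic vector,
# building names from the path to each element.
flat <- unlist(nested)
print(flat)

# Mixed types are coerced to the most general type present,
# so numbers and logicals become character here.
mixed <- unlist(list(1, "a", TRUE))
print(mixed)

# use.names = FALSE drops the generated names.
bare <- unlist(nested, use.names = FALSE)
```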


ML Model Interactions and hstats

Michael Mayer has a new R package for us:

This post is mainly about the third approach. Its beauty is that we get information about all interactions. The downside: it is as good/bad as partial dependence functions. And: the statistics are computationally very expensive to compute (of order n^2).

Different R packages offer some of these H-statistics, including {iml}, {gbm}, {flashlight}, and {vivid}. They all have their limitations. This is why I wrote the new R package {hstats}:

Click through for an overview of the package and an example of how it works.
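
To make the idea and the n^2 cost concrete, here is a rough base R sketch of a pairwise H-squared statistic built from partial dependence functions, following my understanding of Friedman and Popescu's definition. This is not the {hstats} API; the helper names centered_pd() and h2_pair(), the linear model, and the mtcars data are all assumptions of mine for illustration.

```r
# Centered partial dependence of `fit` on the columns in `vars`,
# evaluated at every observed row of X. Each of the n evaluations
# predicts on all n rows, which is where the O(n^2) cost comes from.
centered_pd <- function(fit, X, vars) {
  pd <- vapply(seq_len(nrow(X)), function(i) {
    Xi <- X
    for (v in vars) Xi[[v]] <- X[[v]][i]   # fix chosen features at row i's values
    mean(predict(fit, newdata = Xi))       # average over the other features
  }, numeric(1))
  pd - mean(pd)
}

# Pairwise H^2: the share of the joint effect of features j and k
# that is not explained by the sum of their individual effects.
h2_pair <- function(fit, X, j, k) {
  pd_jk <- centered_pd(fit, X, c(j, k))
  pd_j  <- centered_pd(fit, X, j)
  pd_k  <- centered_pd(fit, X, k)
  sum((pd_jk - pd_j - pd_k)^2) / sum(pd_jk^2)
}

# A toy model with one built-in interaction (wt:hp) and one purely additive term (qsec).
X   <- mtcars[c("wt", "hp", "qsec")]
fit <- lm(mpg ~ wt * hp + qsec, data = mtcars)

h2_wt_hp   <- h2_pair(fit, X, "wt", "hp")    # interacting pair: clearly above zero
h2_wt_qsec <- h2_pair(fit, X, "wt", "qsec")  # additive pair: zero up to rounding
```

The statistic is only as trustworthy as the partial dependence functions underneath it, which is the caveat the quoted passage raises.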


Set Intersection of Vectors in R

Steven Sanderson performs set intersection of vectors:

Welcome to another exciting blog post where we delve into the world of R programming. Today, we’ll be discussing the intersect() function, a handy tool that helps us find the common elements shared between two or more vectors in R. Whether you’re a seasoned R programmer or just starting your journey, this function is sure to become a valuable addition to your toolkit.

Click through to see how the function works.
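
Here is a short sketch with vectors I made up. One thing worth knowing: base R's intersect() takes two vectors at a time, so for three or more, folding over a list with Reduce() is one common way to go.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(4, 5, 6, 7)
d <- c(5, 7, 9)

# Elements common to both vectors.
common_ab <- intersect(a, b)

# For more than two vectors, apply intersect() pairwise across a list.
common_all <- Reduce(intersect, list(a, b, d))

# The result is a set, so duplicates are dropped.
deduped <- intersect(c(1, 1, 2), c(1, 2, 2))
```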


Cumulative Means in R

Steven Sanderson computes a cumulative mean:

The cumulative mean, also known as the running mean or moving average, provides us with a dynamic view of how the average value of a dataset changes as new observations are added incrementally. It is an invaluable tool in time-series analysis, trend identification, and smoothing noisy data.

Imagine you have a series of numeric values, and you want to find the average of the first observation, then the average of the first two observations, followed by the average of the first three, and so on. This iterative process generates the cumulative mean, painting a picture of how the data behaves over time.

Oftentimes, we care about the moving average over a specific window, such as the last n periods. This particular post instead covers the cumulative mean, which averages every observation seen so far.
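
To show the difference, here is a minimal base R sketch with made-up numbers. The cumsum()-based approach is one common way to get a cumulative mean and may not match the linked post's method; the windowed version uses stats::filter() with a two-period window chosen purely for illustration.

```r
x <- c(2, 4, 6, 8)

# Cumulative mean: the running total divided by how many values we have seen.
cum_mean <- cumsum(x) / seq_along(x)

# Windowed moving average over the last 2 periods, for contrast.
# sides = 1 uses only current and past values, so the first entry is NA.
win_mean <- as.numeric(stats::filter(x, rep(1 / 2, 2), sides = 1))
```

The cumulative mean never forgets old observations, while the windowed version only ever looks back two periods.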


Percentage by Group in R

Steven Sanderson performs a breakdown:

Calculating percentages by group is a common task in data analysis. It allows you to understand the distribution of data within different categories. In this blog post, we’ll walk you through the process of calculating percentages by group using three popular R packages: Base R, dplyr, and data.table. To keep things simple, we will use the well-known Iris dataset.

The Iris dataset contains information about different species of iris flowers and their measurements, including sepal length, sepal width, petal length, and petal width. We will focus on the ‘Species’ column and calculate the percentage of each species in the dataset.

Read on for the three approaches. I think the Tidyverse approach is the easiest to understand in this case, though all three get you to the answer.
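
For reference, here is a minimal sketch of the base R route, using table() and prop.table(). This is one way to do it in base R and is not necessarily the code from the linked post.

```r
# Count rows per species, then convert counts to shares of the total.
counts <- table(iris$Species)
pct    <- prop.table(counts) * 100

# Tidy it into a data frame for display.
result <- data.frame(
  Species = names(pct),
  Percent = round(as.numeric(pct), 2)
)
print(result)
```

Since iris has 50 rows for each of its three species, every group comes out to one third of the data.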


Subsetting List Objects in R

Steven Sanderson makes a sub-list and checks it twice:

If you’re an aspiring data scientist or R programmer, you must be familiar with the powerful data structure called “lists.” Lists in R are collections of elements that can contain various data types such as vectors, matrices, data frames, or even other lists. They offer great flexibility and are widely used in many real-world scenarios.

In this blog post, we will explore one of the essential skills in working with lists: subsetting. Subsetting allows you to extract specific elements or portions of a list, helping you access and manipulate data efficiently. So, let’s dive into the world of list subsetting and learn some useful techniques along the way!

Read on for multiple ways of subsetting lists in base R.
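
The main thing to keep straight is single versus double brackets. Here is a small sketch with a hypothetical list of my own; the names and values are made up.

```r
# A hypothetical list mixing types.
person <- list(
  name   = "Ada",
  scores = c(90, 85, 77),
  info   = list(city = "London")
)

# Single brackets return a list containing the selected elements.
sub_list <- person["scores"]

# Double brackets (or $) return the element itself.
scores <- person[["scores"]]
city   <- person$info$city

# Single brackets can select several elements at once.
two_items <- person[c("name", "scores")]

# Chain subsetting to reach inside an element.
second_score <- person[["scores"]][2]
```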
