Category: R

Log-Log Plots in R

Steven Sanderson thinks in percentages:

A log-log plot is a type of graph where both the x-axis and y-axis are in logarithmic scales. This is particularly useful when dealing with data that spans several orders of magnitude. By taking the logarithm of the data, we can compress large values and reveal patterns that might be hidden on a linear scale.

Let’s start with a simple example using base R.

Read on to see how you can create these plots and what you can do to customize them.
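
As a minimal sketch of the idea (not the example from the linked post), the base R snippet below plots synthetic power-law data on log-log axes. The data and the variable names `x`, `y`, `fit`, and `slope` are assumptions made for illustration.

```r
# Synthetic power-law data (assumed for illustration): y = 2 * x^1.5
x <- 10^seq(0, 4, length.out = 50)
y <- 2 * x^1.5

# log = "xy" puts both the x-axis and the y-axis on logarithmic scales
plot(x, y, log = "xy",
     main = "Log-log plot",
     xlab = "x (log scale)", ylab = "y (log scale)")

# A power law is a straight line in log-log space; its slope is the exponent
fit <- lm(log10(y) ~ log10(x))
slope <- unname(coef(fit)[2])
```

On a linear scale the same data would hug the x-axis for most of its range, while the fitted slope here recovers the exponent of the power law.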

Plotting Logistic Regression in R

Steven Sanderson performs a logistic regression:

Logistic regression is a statistical method used for predicting the probability of a binary outcome. It’s a fundamental tool in machine learning and statistics, often employed in various fields such as healthcare, finance, and marketing. We use logistic regression when we want to understand the relationship between one or more independent variables and a binary outcome, which can be “yes/no,” “1/0,” or any two-class distinction.

Click through to learn how to do this.
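
A rough sketch of the technique, assuming the built-in `mtcars` data rather than whatever data the linked post uses: fit a logistic regression with `glm()` and draw the predicted probability curve over the observed points. The choice of `am` as the binary outcome and `wt` as the predictor is only for illustration.

```r
# mtcars ships with R; am (0 = automatic, 1 = manual) is the binary outcome
model <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities over a grid of car weights
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 100))
grid$prob <- predict(model, newdata = grid, type = "response")

# Observed 0/1 outcomes with the fitted S-shaped curve on top
plot(mtcars$wt, mtcars$am,
     xlab = "Weight (1000 lbs)", ylab = "Probability of manual transmission")
lines(grid$wt, grid$prob, col = "blue", lwd = 2)
```

Using `type = "response"` returns probabilities rather than log-odds, which is what makes the curve readable on the same axis as the 0/1 outcome.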

Enabling Python and R Support for VS Code Polyglot Notebooks

Joy George Kunjikkur enables a preview option:

Obviously, we should have Polyglot notebooks up and running. The first step to enable Python preview is that we need to install Jupyter on the machine and make sure the Python kernel spec is available. Run the below command to make sure it is there.

It looks like what the preview is doing is shelling out to Jupyter notebooks, so I’d imagine variables won’t cross over between languages.

Building a Bland-Altman Plot in R

Steven Sanderson performs a comparison:

Before we dive into the code, let’s briefly understand what a Bland-Altman plot is. It’s a graphical method to visualize the agreement between two measurement techniques, often used in fields like medicine or any domain with comparative measurements. The plot displays the differences between two measurements (Y-axis) against their means (X-axis).

Click through to see how this works and how you can interpret the results.
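
To make the description concrete, here is a small base R sketch using simulated paired measurements (the data, the seed, and the names `method_a` and `method_b` are assumptions, not taken from the linked post). It plots the differences against the means and adds the mean difference along with the conventional 95% limits of agreement.

```r
set.seed(42)

# Simulated paired measurements from two hypothetical methods
method_a <- rnorm(50, mean = 100, sd = 10)
method_b <- method_a + rnorm(50, mean = 1, sd = 3)

avg   <- (method_a + method_b) / 2   # x-axis: mean of the two measurements
diffs <- method_a - method_b         # y-axis: difference between them

bias <- mean(diffs)                         # average disagreement
loa  <- bias + c(-1.96, 1.96) * sd(diffs)   # 95% limits of agreement

plot(avg, diffs,
     xlab = "Mean of the two methods", ylab = "Difference (A - B)",
     main = "Bland-Altman plot")
abline(h = bias, lty = 1)   # solid line at the mean difference
abline(h = loa, lty = 2)    # dashed lines at the limits of agreement
```

Points falling outside the dashed lines, or a visible trend in the differences as the mean grows, would suggest the two methods do not agree uniformly across the measurement range.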

Functional Programming and R

Anirban Shaw ties functional programming to R:

Functional Programming’s relevance in the R programming language, a language primarily known for its prowess in data analysis and statistical computing, is particularly noteworthy. By leveraging functional programming, organizations can improve operational efficiency and gain a competitive edge.

R’s ecosystem is enriched by functional programming paradigms, which enable developers and data scientists to write concise and expressive code for tasks such as data manipulation, transformation, and visualization.

In this article, we take a deep dive into the fundamental characteristics of R, the advantages of adopting functional programming within it and the essential concepts ingrained in the core of R. 

Read on to see how the two fit together. H/T R-Bloggers.
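
For a flavor of what this looks like in practice, the sketch below uses only base R higher-order functions. The helper `compose2` and the other variable names are hypothetical and are not drawn from the linked article.

```r
# Map a function over a vector without an explicit loop
squares <- sapply(1:5, function(x) x^2)

# Keep only the elements that satisfy a predicate
evens <- Filter(function(x) x %% 2 == 0, 1:10)

# Fold a vector down to a single value
total <- Reduce(`+`, 1:5)

# Functions are first-class values: build a new function from two others
compose2 <- function(f, g) function(x) g(f(x))
add_one_then_double <- compose2(function(x) x + 1, function(x) x * 2)
result <- add_one_then_double(3)
```

Each step takes a function as an argument or returns one, which is the core idea behind functional style in R.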

Scree Plots in R

Steven Sanderson builds a scree plot:

A scree plot is a line plot that shows the eigenvalues or variance explained by each principal component (PC) in a Principal Component Analysis (PCA). It is a useful tool for determining the number of PCs to retain in a PCA model.

In this blog post, we will show you how to create a scree plot in base R. We will use the iris dataset as an example.

Read on to learn more about the plot, as well as examples of how to create scree plots.
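
A minimal base R sketch along those lines, assuming we standardize the four numeric `iris` columns before running PCA (the linked post may make different choices):

```r
# PCA on the four numeric iris measurements, scaled to unit variance
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scree plot: proportion of variance explained by each component
plot(var_explained, type = "b", xaxt = "n",
     xlab = "Principal component",
     ylab = "Proportion of variance explained",
     main = "Scree plot")
axis(1, at = seq_along(var_explained))
```

A common heuristic is to retain the components that come before the "elbow" where the curve flattens out.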

Bubble Charts in ggplot2

Steven Sanderson creates a bubble chart:

Bubble charts are a great way to visualize data with three dimensions. The size of the bubbles represents a third variable, which can be used to show the importance of that variable or to identify relationships between the three variables.

To create a bubble chart in R using ggplot2, you will need to use the geom_point() function. This function will plot points on your chart, and you can use the size aesthetic to control the size of the points.

Click through for two examples, one which is a pretty good outcome for using a bubble chart, and one which exposes the key weakness of bubble charts.
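
As a rough illustration, assuming the ggplot2 package is installed and using the built-in `mtcars` data (not the data from the linked post), a bubble chart maps a third variable to the `size` aesthetic of `geom_point()`:

```r
library(ggplot2)

# Weight on x, fuel economy on y, horsepower mapped to bubble size
p <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.5, colour = "steelblue") +
  scale_size_area(max_size = 10) +   # bubble area proportional to hp
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", size = "Horsepower")

print(p)
```

`scale_size_area()` is used here so that bubble area, rather than radius, tracks the value. Even so, readers tend to judge areas imprecisely, which is a commonly cited weakness of bubble charts.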

Several Useful R Functions

Maelle Salmon shows off four useful R functions:

Recently I caught myself using which(grepl(...)),

animals <- c("cat", "bird", "dog", "fish")
which(grepl("i", animals))
#> [1] 2 4

when the simpler alternative is

animals <- c("cat", "bird", "dog", "fish")
grep("i", animals)
#> [1] 2 4

Read on for another example of using grep() instead of grepl(), as well as three other functions you might want to keep in mind. H/T R-Bloggers.
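
As a small sketch of how the two functions relate (this is an illustration and not necessarily the example from the linked post), `grepl()` returns a logical vector, `grep()` returns matching indices, and `grep()` with `value = TRUE` returns the matching elements themselves:

```r
animals <- c("cat", "bird", "dog", "fish")

flags   <- grepl("i", animals)               # one TRUE/FALSE per element
idx     <- grep("i", animals)                # positions of the matches
matched <- grep("i", animals, value = TRUE)  # the matching strings
```

The logical form is handy for subsetting or counting, while the `value = TRUE` form saves a separate indexing step when only the matches are wanted.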

Creating Pareto Charts in R with qcc

Steven Sanderson builds a Pareto chart:

A Pareto chart is a type of bar chart that shows the frequency of different categories in a dataset, ordered by frequency from highest to lowest. It is often used to identify the most common problems or causes of a problem, so that resources can be focused on addressing them.

To create a Pareto chart in R, we can use the qcc package. The qcc package provides a number of functions for quality control, including the pareto.chart() function for creating Pareto charts.

Manufacturing companies love Pareto charts.
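
Here is a hedged sketch with made-up defect counts (the `defects` vector is an assumption for illustration). It computes the sorted frequencies and cumulative percentages that a Pareto chart displays, calls `qcc::pareto.chart()` when the qcc package happens to be installed, and otherwise falls back to a plain base R bar plot.

```r
# Hypothetical defect counts by category
defects <- c(Scratch = 42, Dent = 27, Misalignment = 15,
             Discoloration = 9, Other = 7)

# The two ingredients of a Pareto chart: sorted counts and cumulative share
sorted  <- sort(defects, decreasing = TRUE)
cum_pct <- cumsum(sorted) / sum(sorted) * 100

if (requireNamespace("qcc", quietly = TRUE)) {
  # qcc draws the bars and the cumulative percentage line in one call
  qcc::pareto.chart(defects, ylab = "Defect count",
                    main = "Pareto chart of defects")
} else {
  # Fallback: bars only, ordered from most to least frequent
  barplot(sorted, ylab = "Defect count", main = "Defects by category")
}
```

In this invented data set, the top two categories account for 69% of all defects, which is the kind of concentration a Pareto chart is meant to expose.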

Exploring Poker Hands in R

Benjamin Smith sorts and deals:

Recently, I have been reading “Mathematical Statistics” by Professor Keith Knight and I noticed an interesting passage he mentions when discussing finite sample spaces:

In some cases, it may be possible to enumerate all possible outcomes, but in general such enumeration is physically impossible; for example, enumerating all possible 5 card poker hands dealt from a deck of 52 cards would take several months under the most favourable conditions. (Knight 2000)

While this quote is taken out of context, with the advent of modern computing this is a task which is definitely possible to do computationally!

Click through to see how you can do this in R, at least for 5-card stud. 5-card draw would have the same number of final combinations, though if you also track intermediary combinations, it would grow rather considerably.
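
To give a sense of scale, the base R sketch below (an illustration, not the approach from the linked post) counts the hands with `choose()`, enumerates all of them with `combn()`, and then counts flushes as a sanity check. The card encoding, with cards numbered 1 to 52 and each consecutive block of 13 forming a suit, is an assumption made for this example. The enumeration builds a matrix of about 13 million integers, so it may take several seconds to run.

```r
# Number of distinct 5-card hands from a 52-card deck
n_hands <- choose(52, 5)
# 2598960

# Enumerate every hand: one column per hand, cards encoded as 1..52
hands <- combn(52L, 5L)

# Assumed encoding: cards 1-13 are suit 0, 14-26 suit 1, and so on
suit <- (hands - 1L) %/% 13L

# A hand is a flush when all five cards share a suit
# (this count includes straight flushes)
is_flush <- suit[1, ] == suit[2, ] & suit[2, ] == suit[3, ] &
            suit[3, ] == suit[4, ] & suit[4, ] == suit[5, ]
n_flush <- sum(is_flush)
# 5148, which matches 4 * choose(13, 5)
```

What would once have taken months by hand finishes in seconds, and the resulting matrix can be filtered for any hand type in the same vectorized way.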
