Press "Enter" to skip to content

Category: R

Caching in R

Bernardo Lares takes us through the lares library’s caching functionality:

If you’ve never heard of cache (/kaSH/) before, Google it and you’ll quickly find that it is “a collection of items of the same type stored in a hidden or inaccessible place”. Basically, you have “something” stored “somewhere” so you can fetch it “sometime” later. If it sounds basic, it (can be) is! This simple technic can come quite handy when you are coding functions that take some time to gather and/or process the data you’re working with. In other words, think of those processes that take some time to run and there’s really no need to re-run it “every time” because the outcome will be exactly the same. Also, you are unnecessarily spending time, computer power, and real energy when you re-process cache-able stuff.

Today I’ll show you how I use cache in R to accelerate results, avoid re-processing, and improve UX for my users using the lareslares library. Let’s see a couple of functions that actually leverage cache usage and how can you start using them.

Read on for a walkthrough of the process.

Comments closed

Standalone Shiny Apps with systemd

Peter Solymos takes us through configuring a Shiny application which runs via systemd:

Lots of resources describe how you can host Shiny apps with Docker, Shiny Server, or via other means. But we also know Shiny apps can be launched locally. What makes your local setup different from these other options is that your local machine does not usually have a static internet protocol (IPv4) address. Without a static IPv4, it is really hard to share the app with other people because the address keeps changing unpredictably, and you might sometimes power off your machine.

Shiny uses the httpuv R package under the hood which is an HTTP and websocket server library. Could you just run Shiny directly on a remote server? This post explores this topic using systemd the system and service manager for most modern Linux distributions. All I am trying to do in this post is to make a point that the shiny R package is really self-sufficient and in the simplest case, it does not need any other layer for sharing an app.

Click through for the process. H/T R-Bloggers.

Comments closed

Comparing Datasets in R

The folks at finnstats take us through a package to compare datasets in R:

How to find dataset differences in R, when the pieces of information are changing between datasets it’s a difficult task to identify the same.

Here we are going to discuss the daff package in R, daff package helps us to identify the differences and visualize them in a beautiful way.

Click through for the demonstration, including a video. H/T R-Bloggers

Comments closed

New Features in R 4.1.0

The Jumping Rivers folks have some good news for us:

The stability of the base packages is a great strength of the R ecosystem: both underpinning, and contrasting with, the rapid pace at which contributed packages (CRAN, BioConductor) evolve.

Imagine introducing a new feature into the R language. Even if problems arise with the usability of that feature, it would need to be maintained until at least the next major release, by which time thousands of developers and analysts may depend upon it. Unsurprisingly, the R maintainers are exceedingly cautious when introducing new syntax.

Similarly, you should employ caution when using new syntax in your own code. If you do use syntax that was introduced in R-4.1, be aware that your code will not run on versions of R that precede this. For example, this may prevent your new analysis scripts from running on your colleague’s computer, or prevent users from installing your new package.

Given how many third-party packages have regular breaking changes, I do wish more people would follow this advice.

Getting into the meat of things, I really like the F#-style pipe in R: |> makes a lot of intuitive sense, though I do wish they had included a placeholder element with the native pipe.

Comments closed

Plotting Correlation Analyses in R

Finnstats shows a few techniques for plotting correlation in R:

Correlation analysis, correlation is a term that is a measure of the strength of a relationship between two variables.

Pearson’s Product-Moment Correlation

One of the most common measures of correlation is Pearson’s product-moment correlation, which is commonly referred to simply as the correlation, or just the letter r.

Correlation shows the strength of a relationship between two variables and is expressed numerically by the correlation coefficient.

Click through for examples from several packages. H/T R-Bloggers.

Comments closed

Hot, Cool, and Large Numbers

Holger von Jouanne-Diedrich hits the casino:

The longest streak in roulette purportedly happened in 1943 in the US when the colour red won 32 consecutive times in a row! A quick calculation shows that the probability of this happening seems to be beyond crazy:

0.5^32[1] 2.328306e-10

So, what is going on here? For once streaks and clustering happen quite naturally in random sequences: if you got something like “red, black, red, black, red, black” and so on I would worry if there was any randomness involved at all (read more about this here: Learning Statistics: Randomness is a strange beast). The point is that any sequence that is defined beforehand is as probable as any other (see also my post last week: The Solution to my Viral Coin Tossing Poll). Yet streaks catch our eye, they stick out.

There’s one critical assumption in this post, which is that the game is fair, in that each event has an equal probability of happening. But as a Bayesian, if a roulette table hits red 32 times in a row, it certainly opens the door to the idea that maybe the odds on that table with that dealer aren’t quite equal between red and black.

Comments closed

Containerizing a Shiny App

Peter Solymos takes us through the process of running a Shiny app in a Docker container:

Docker provides isolation to applications. Images are immutable. Running multiple instances of the same image can serve many users at the same time. All these general advantages of containerized applications apply to Shiny apps too.

All the general advantages of containerized applications apply to Shiny apps. Docker provides isolation to applications. Images are immutable: once build it cannot be changes, and if the app is working, it will work the same in the future. Another important consideration is scaling. Shiny apps are single threaded, but running multiple instances of the same image can serve many users at the same time. Let’s dive into the details of how to achieve this.

Click through for a walkthrough. Containerizing these sorts of apps has been a boon for my team, as it lets us spin up appropriately-sized servers on the cheap. H/T R-Bloggers

Comments closed

Functional Data Analysis in R

Joseph Ricker gives us a gentle introduction to a not-so-gentle topic:

This plot might depict 80 measurements for a participant in a clinical trial where each data point represents the change in the level of some protein level. Or it could represent any series of longitudinal data where the measurements are take at irregular intervals. The curve looks like a time series with obvious correlations among the points, but there are not enough measurements to model the data with the usual time series methods. In a scenario like this, you might find Functional Data Analysis (FDA) to be a viable alternative to the usual multi-level, mixed model approach.

This post is meant to be a “gentle” introduction to doing FDA with R for someone who is totally new to the subject. I’ll show some “first steps” code, but most of the post will be about providing background and motivation for looking into FDA. I will also point out some of the available resources that a newcommer to FDA should find helpful.

Read on to learn more.

Comments closed

Table Design in R with mmtable2

Matt Dancho walks through a package to make tables look great in R:

I love ggplot2 for plotting. The grammar of graphics allows us to add elements to plots. Tables seem to be forgotten in terms of an intuitive grammar with tidy data philosophy – Until now. mmtable2 aims to be the ggplot2 for tables, leveraging the awesome GT table package.

The mmtable2 package aims to make it easy to create tables by:

1. Using a ggplot2-style syntax for using a grammar of table operations.

2. Extends the amazing GT table package.

Read on for the process and a demonstration.

Comments closed