
Category: R

tapply() and Ragged Arrays in R

Benjamin Smith explains how tapply() works:

While I saw other programmers use this function, I found myself unsure of how it worked or when I would need to use it. In this blog I attempt to change that and explain the cryptic description by showing some applications with my commentary and how it compares to using the “tidy” approach with tidyverse.

My inspiration for writing this blog came from seeing Dr. Norm Matloff’s blog where he mentions the use of tapply() and his thoughts on the tidyverse. For a more thorough treatment of his critique of the tidyverse and “tidy” methods, check out his formal essay here.

Read on to learn the benefit of learning and using tapply().
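As a quick illustration of the idea (not taken from Benjamin's post), here is a minimal sketch of tapply() on a ragged array, meaning groups of unequal size. The `heights` and `group` vectors are made-up data for the example.

```r
# Made-up data: five measurements split unevenly across two groups.
heights <- c(150, 160, 170, 180, 170)
group   <- factor(c("a", "b", "a", "b", "b"))

# tapply() splits `heights` by `group` and applies `mean` to each piece.
# Group "a" has two values and group "b" has three, hence "ragged".
group_means <- tapply(heights, group, mean)
print(group_means)  # mean height per group: a = 160, b = 170
```

The result is a named array with one entry per factor level, so `group_means["a"]` pulls out a single group's value.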

Comments closed

Querystrings and R Shiny

Thomas Williams passes along querystring data:

As background, a query string is part of a web page address. Query strings are used to pass information to web pages, in name/value pairs separated by an equals sign – for instance, user=Andrew or country=au. Name/value pairs are themselves separated by ampersands, so passing multiple values looks like user=Andrew&country=au.

Click through for an example of how it all works.
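To make the name/value structure concrete, here is a minimal base R sketch that splits a query string into a named list. The helper name `parse_query` is hypothetical and is not from Thomas's post; inside a Shiny app, you would more likely reach for `shiny::parseQueryString()` on `session$clientData$url_search`.

```r
# Hypothetical helper: split "?a=1&b=2" into list(a = "1", b = "2").
parse_query <- function(query) {
  query <- sub("^\\?", "", query)                     # drop a leading "?"
  pairs <- strsplit(query, "&", fixed = TRUE)[[1]]    # "user=Andrew", "country=au"
  parts <- strsplit(pairs, "=", fixed = TRUE)         # split each pair at "="
  keys  <- vapply(parts, `[`, character(1), 1)
  values <- vapply(
    parts,
    function(p) if (length(p) > 1) utils::URLdecode(p[2]) else "",
    character(1)
  )
  setNames(as.list(values), keys)
}

params <- parse_query("?user=Andrew&country=au")
print(params$user)     # "Andrew"
print(params$country)  # "au"
```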


Interpreting Kernel SHAP

Michael Mayer digs into Kernel SHAP:

In their 2017 paper on SHAP, Scott Lundberg and Su-In Lee presented Kernel SHAP, an algorithm to calculate SHAP values for any model with numeric predictions. Compared to Monte-Carlo sampling (e.g. implemented in R package “fastshap”), Kernel SHAP is much more efficient.

I had one problem with Kernel SHAP: I never really understood how it works!

Needless to say, Michael knows Kernel SHAP a lot better these days, considering there’s now a kernelshap package for us.


Calculating the Hurst Exponent in R

Sang-Heon Lee does some analysis:

Pairs trading literature uses the Hurst exponent frequently since it gives a simple and intuitive indicator of the behavior of stock returns. Using S&P 500 returns, let’s learn how to estimate it manually with R code and then more conveniently with an R package.

Click through for those two examples, as well as a more detailed explanation of the math driving this. H/T R-Bloggers.
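For a rough sense of what a manual estimate can look like, here is a simplified rescaled-range (R/S) sketch. It is an assumption-laden illustration, not Sang-Heon's code: it uses simulated white noise rather than S&P 500 returns, the function names `rs_statistic` and `hurst_rs` are hypothetical, and the window sizes are arbitrary.

```r
# R/S statistic for one block: range of cumulative deviations from the
# block mean, divided by the block's standard deviation.
rs_statistic <- function(x) {
  dev <- cumsum(x - mean(x))
  (max(dev) - min(dev)) / sd(x)
}

# Average R/S over non-overlapping blocks of each size, then regress
# log(R/S) on log(block size); the slope is the Hurst estimate.
hurst_rs <- function(x, window_sizes = 2^(3:7)) {
  avg_rs <- sapply(window_sizes, function(n) {
    n_blocks <- length(x) %/% n
    blocks <- split(x[seq_len(n_blocks * n)], rep(seq_len(n_blocks), each = n))
    mean(sapply(blocks, rs_statistic))
  })
  fit <- lm(log(avg_rs) ~ log(window_sizes))
  unname(coef(fit)[2])
}

set.seed(42)
simulated_returns <- rnorm(2048)  # stand-in for real return data
h <- hurst_rs(simulated_returns)
print(h)
```

For an uncorrelated series, the estimate should land in the neighborhood of 0.5, though plain R/S is known to be biased upward in small samples. Values well above 0.5 suggest persistence, and values well below suggest mean reversion.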


Solving the Traveling Salesman Problem in R

Tomaz Kastrun gives us a solution to the Traveling Salesman Problem:

The Travelling Salesman Problem is an NP-complete problem and an old mathematical problem. For this useless function, we will look for the nearest city from the previous city (or starting point) and repeat until we visit all cities. The greedy solution is fairly simple but has one disadvantage: it might not give you the best path (optimal solution), and proving that the solution is correct is an additional issue.

As Tomaz notes, this is not guaranteed to be the best solution, just a solution. Considering that TSP is NP-hard, if Tomaz did have a globally optimal solution for us, he certainly wouldn’t be calling it ‘useless-useful’ but instead would be calling it “My prize-winning algorithm.”
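The nearest-city heuristic described in the quote can be sketched in a few lines of base R. This is a minimal illustration rather than Tomaz's function: the name `nearest_neighbor_tour` and the toy `cities` matrix are made up for the example.

```r
# Greedy nearest-neighbor tour: from the current city, always move to
# the closest city not yet visited. Returns the visiting order.
nearest_neighbor_tour <- function(coords, start = 1) {
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances
  n <- nrow(d)
  tour <- start
  while (length(tour) < n) {
    current    <- tour[length(tour)]
    candidates <- setdiff(seq_len(n), tour)
    nxt        <- candidates[which.min(d[current, candidates])]
    tour       <- c(tour, nxt)
  }
  tour
}

# Four toy cities on a line at x = 0, 1, 5, 2.
cities <- cbind(x = c(0, 1, 5, 2), y = c(0, 0, 0, 0))
tour <- nearest_neighbor_tour(cities)
print(tour)  # 1 2 4 3
```

Each step makes a locally best choice with no backtracking, which is why the result is a valid tour but carries no optimality guarantee.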


Custom Infix Functions in R

Dominik Rafacz loves infix functions:

Custom infix functions are one of my favorite features in R. This article is my love letter to them. But first, a quick recap.

For those unfamiliar with the terminology, an infix function is a function fun which is called using infix notation, e.g., x fun y instead of fun(x, y). Those functions are also called infix operators by base R, and I will use those terms and the name infixes interchangeably. There are a lot of infix operators in base R used very frequently, e.g., arithmetic or logical operators. We use them so often that we usually forget that they are functions and that we can call them just like regular functions.

Infix functions are something I tend to forget about entirely when developing on my own, but they can be extremely useful, as Dominik shows. H/T R-Bloggers.
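As a small reminder of the mechanics, here is a minimal sketch of defining a custom infix operator and of calling built-in operators as ordinary functions. The `%+%` operator is a made-up example, not one from Dominik's article.

```r
# Custom infix operators are functions whose names are wrapped in %...%.
`%+%` <- function(lhs, rhs) paste0(lhs, rhs)

joined <- "foo" %+% "bar"
print(joined)  # "foobar"

# Built-in operators are functions too, so they can be called in prefix form...
three <- `+`(1, 2)
print(three)  # 3

# ...or passed to other functions like any regular function.
doubled <- sapply(1:3, `*`, 2)
print(doubled)  # 2 4 6
```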


What’s in a Name?

Benjamin Smith analyzes a name change:

Recently, RStudio announced its name change to Posit. For many, this name change was accepted with open arms, but for some, not so. Being the statistician that I am, I decided to post a poll on LinkedIn to see the sentiment of my network. After running the poll for a week, the results were in:

Read on for the responses as well as an analysis using RStan.


Hosting an App on RStudio Connect

Liam Kalita wraps up a series:

So far, we have seen how to create an app using ReactJS and a Plumber API. In part 3, we will show you how to host the application on RStudio Connect (RSC)!

When it comes to hosting the application on RSC we will set the content URL for both the app and API so that they are in the same domain and won’t have this CORS issue.

Read the whole thing.


Recreating a Shiny App with Plumber and React

Liam Kalita continues a series:

We’ll assume you have a basic understanding of HTML and JavaScript, but you should be able to follow along with a basic programming background. Having a little knowledge of Linux shell commands would be beneficial for some of the terminal commands for generating directories, but you can also do most of it in VSCode using the user interface instead.

Let’s attempt an exercise in creating a small React+Plumber app; this will be very similar to a previous blog post recreating this tutorial {shiny} application using Python Flask.

Click through to see how to build the app. The final part of the series will show how to host the app.


Extracting Numbers from a Stacked Density Plot

Derek Jones digs into an image:

A month or so ago, I found a graph showing the percentage of PCs having a given range of memory installed, between March 2000 and April 2020, on a TechTalk page of PC Matic; it had the form of a stacked density plot. This kind of installed memory data is rare, so how could I get the underlying values (a previous post covers extracting data from a heatmap)?

Read on for an interesting attempt at reverse-engineering the original numbers used to create an image. H/T R-Bloggers.
