Press "Enter" to skip to content

Category: R

Creating Your Own ggplot2 Geom

Isabella Velasquez is feeling creative:

If you use ggplot2, you are probably used to creating plots with geom_line() and geom_point(). You may also have ventured into the broader ggplot2 ecosystem to use geoms like geom_density_ridges() from ggridges or geom_signif() from ggsignif. But have you ever wondered how these extensions were created? Where did the authors figure out how to create a new geom? And, if the plot of your dreams doesn’t exist, how would you make your own?

Enter the exciting world of creating your own ggplot2 extensions.

The post looks a lot like a series of slides, and it takes you through the process of creating a new geom. H/T R-Bloggers.
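
If you want a feel for the pattern before reading the post, here is a minimal sketch of the general extension mechanism (a ggproto object plus a layer() wrapper), not Isabella's example; the geom_simple_point() name is just for illustration.

library(ggplot2)
library(grid)

# Bare-bones point geom: the ggproto object defines how to draw the data
GeomSimplePoint <- ggproto("GeomSimplePoint", Geom,
  required_aes = c("x", "y"),
  default_aes = aes(shape = 19, colour = "black", size = 1.5, alpha = NA),
  draw_key = draw_key_point,

  draw_panel = function(data, panel_params, coord) {
    # Transform the data to panel coordinates, then return a grid grob
    coords <- coord$transform(data, panel_params)
    grid::pointsGrob(
      coords$x, coords$y,
      pch = coords$shape,
      gp = grid::gpar(col = coords$colour)
    )
  }
)

# The user-facing constructor simply builds a layer around the ggproto object
geom_simple_point <- function(mapping = NULL, data = NULL, stat = "identity",
                              position = "identity", na.rm = FALSE,
                              show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    geom = GeomSimplePoint, mapping = mapping, data = data, stat = stat,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(na.rm = na.rm, ...)
  )
}

# Usage: behaves like a stripped-down geom_point()
ggplot(mtcars, aes(wt, mpg)) + geom_simple_point()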


Using Dygraphs in R

Thomas Williams builds a chart:

I also wanted to get a little interactive with my analysis, and came across Dygraphs for R (https://rstudio.github.io/dygraphs/), which wraps the “venerable” (according to creator Dan Vanderkam, https://github.com/danvk) JavaScript charting library of the same name, first released in 2006.

I used Dygraphs in an R script file (it can work equally well in R Markdown) to quickly chart my time series data, loaded from the CSV file. Dygraphs was simple to use, is a solid pick among other charting libraries, and is very functional for being free and open source.

Read on for a few examples of charts, as well as the entirety of Thomas’s code.
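
As a rough idea of what that looks like, here is my own sketch rather than Thomas's code; it assumes a file data.csv with date and value columns.

library(dygraphs)
library(xts)

# Load the CSV and convert it to an xts time series, which dygraph() expects
df <- read.csv("data.csv")
series <- xts::xts(df$value, order.by = as.Date(df$date))

# Interactive chart with a range selector for zooming and panning
dygraph(series, main = "Daily values") %>%
  dyRangeSelector() %>%
  dyOptions(drawPoints = TRUE, pointSize = 2)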


Generalized Additive Models for Customer Lifetime Value Estimation

Nicholas Clark builds a GAM:

I typically work in quantitative ecology and molecular epidemiology, where we use statistical models to predict species distributions or disease transmission patterns. Recently though, I had an interesting conversation with a data science PhD student who mentioned they were applying GAMs to predict Customer Lifetime Value at a SaaS startup. This caught my attention because CLV prediction, as it turns out, faces remarkably similar statistical challenges to ecological forecasting: nonlinear relationships that saturate at biological or business limits, hierarchical structures where groups behave differently, and the need to balance model flexibility with interpretability for stakeholders who need to understand why the model makes certain predictions.

This is an interesting article and I had not thought of using a GAM for calculating Customer Lifetime Value. I used a much simpler technique the one time I calculated CLV in earnest. H/T R-Bloggers.
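
For a sense of what a GAM for this kind of problem might look like in mgcv, here is a sketch on simulated data (not Nicholas's model); the variables tenure, usage, and segment are made up for illustration.

library(mgcv)

# Simulate a toy customer dataset with a saturating tenure effect
set.seed(42)
customers <- data.frame(
  tenure  = runif(500, 0, 36),                    # months as a customer
  usage   = rgamma(500, shape = 2, scale = 10),   # monthly activity
  segment = factor(sample(c("SMB", "Mid", "Ent"), 500, replace = TRUE))
)
customers$clv <- with(customers,
  200 * (1 - exp(-tenure / 12)) + 3 * sqrt(usage) +
  as.numeric(segment) * 25 + rnorm(500, sd = 20))

# Smooth terms for the nonlinear effects, random-effect smooth for segment
fit <- gam(
  clv ~ s(tenure) + s(usage) + s(segment, bs = "re"),
  data = customers, method = "REML"
)
summary(fit)
plot(fit, pages = 1)   # inspect the fitted smooths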


Inferential Statistics in Excel using R

Adam Gladstone does a bit of inference testing:

In the previous posts in this series (Using R in Excel) I have demonstrated some basic use-cases where using R in Excel is useful. Specifically, we have looked at descriptive statistics, linear regression, forecasting, and calling Python. In this post, I am going to look at inferential statistics and how R can be used (in Excel) to perform some typical statistical tests. Excel provides many excellent facilities for data wrangling and analysis. However, for certain types of statistical data analysis the built-in functions and the Analysis ToolPak are not sufficient, and R provides superior facilities.

Read on for a few examples of tests, though there are a huge number available in R itself as well as its ecosystem of packages.
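
The R side of this is plain base R; the Excel plumbing is specific to Adam's add-in and not shown here. A few of the usual suspects, run on simulated values:

# Two simulated samples standing in for real spreadsheet data
set.seed(1)
x <- rnorm(30, mean = 5.2, sd = 1.1)
y <- rnorm(30, mean = 4.8, sd = 1.0)

t.test(x, y)        # two-sample t-test comparing the means
wilcox.test(x, y)   # non-parametric alternative to the t-test
var.test(x, y)      # F-test comparing the two variances
shapiro.test(x)     # normality check on one sample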


Simulating the Monty Hall Problem in R

Jason Bryer takes us through a classic introductory problem to Bayesian statistics:

I find that when teaching statistics (and probability) it is often helpful to simulate data first in order to get an understanding of the problem. The Monty Hall problem recently came up in a class so I implemented a function to play the game.

The Monty Hall problem results from a game show, Let’s Make a Deal, hosted by Monty Hall. In this game, the player picks one of three doors. Behind one is a car; behind the other two are goats. After picking a door, the player is shown the contents of one of the other two doors, which, because the host knows the contents, is always a goat. The question to the player: Do you switch your choice?

This is one of the biggest “aha!” moments in statistics, in the sense that it is not intuitively obvious and is easy to get wrong, but once you understand why it is true, it makes reasoning about how new knowledge should change your probabilities much easier. H/T R-Bloggers.
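
If you want to convince yourself of the two-thirds result before reading Jason's implementation, a quick simulation of your own takes only a few lines (this sketch is mine, not his):

# Play one game of Monty Hall; return TRUE if the player wins the car
monty_hall <- function(switch_door) {
  doors <- 1:3
  car   <- sample(doors, 1)
  pick  <- sample(doors, 1)
  # Host opens a goat door that is neither the car nor the player's pick
  opened <- if (car == pick) sample(setdiff(doors, pick), 1) else setdiff(doors, c(car, pick))
  final  <- if (switch_door) setdiff(doors, c(pick, opened)) else pick
  final == car
}

set.seed(123)
mean(replicate(10000, monty_hall(switch_door = TRUE)))   # approximately 2/3
mean(replicate(10000, monty_hall(switch_door = FALSE)))  # approximately 1/3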


Testing in R with testthat

Aida Gjoka writes a test:

Testing is an important step when developing code in R or any other language. If you are a Python user, you can consider reading our previous blogs on pytest. Writing tests helps us make sure that the code is working as expected. In the R ecosystem, the testthat package is one of the most used frameworks. In this blog we will explore some of the main properties of {testthat}, highlighting some of its most useful functions with examples.

Read on to see how it works. This isn’t a mocking library, but rather an assertions-based testing library. And near the end, Aida includes an extra library that helps with plot testing.
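
The basic shape of a testthat test looks something like this; the rescale01() function is made up for illustration, and the expectations are the kind of assertions the package provides.

library(testthat)

# Function under test: rescale a numeric vector onto [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

test_that("rescale01 maps values onto [0, 1]", {
  out <- rescale01(c(2, 4, 6, 8, 10))
  expect_equal(min(out), 0)
  expect_equal(max(out), 1)
  expect_length(out, 5)
  expect_error(rescale01("a"))   # non-numeric input should fail
})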


Handling Missing Data in R

M. Fatih Tüzen fills in the gaps:

Data preprocessing is a cornerstone of any data analysis or machine learning pipeline. Raw data rarely comes in a form ready for direct analysis — it often requires cleaning, transformation, normalization, and careful handling of anomalies. Among these preprocessing tasks, dealing with missing data stands out as one of the most critical and unavoidable challenges.

Missing values appear in virtually every domain: surveys may have skipped questions, administrative registers might contain incomplete records, and clinical trials can suffer from patient dropout. Ignoring these gaps or handling them naively does not just reduce the amount of usable information; it can also introduce bias, decrease statistical power, and ultimately compromise the validity of conclusions. In other words, missing data is not just an inconvenience — it is a methodological problem that demands rigorous attention.

Quite often, we gloss over what to do with missing data when explaining or working through the data science process, in part because it’s a hard problem. This post digs into the specifics of the matter, taking us through eight separate methods. H/T R-Bloggers.
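
As a tiny illustration of how different the approaches can be, here is a sketch on toy data contrasting naive mean imputation with multiple imputation via the mice package; mice is one common route in R and may or may not be among the eight methods the post covers.

library(mice)

# Toy dataset with scattered missing values
df <- data.frame(
  age    = c(25, 31, NA, 46, 52, NA, 38, 29, 61, 44),
  income = c(42, NA, 51, 60, NA, 47, 55, 39, NA, 58)
)

# 1. Naive mean imputation: easy, but shrinks variance and ignores uncertainty
df_mean <- df
df_mean$age[is.na(df_mean$age)]       <- mean(df$age, na.rm = TRUE)
df_mean$income[is.na(df_mean$income)] <- mean(df$income, na.rm = TRUE)

# 2. Multiple imputation: several completed datasets, analyses pooled afterwards
imp <- mice(df, m = 5, method = "pmm", seed = 1, printFlag = FALSE)
fit <- with(imp, lm(income ~ age))
pool(fit)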
