Category: R

Tracking Home Heating Oil Prices in R

Steven Sanderson charts some prices:

If you live in New York and rely on heating oil to keep your home warm during the colder months, you know how important it is to keep track of heating oil prices. Fortunately, with a bit of R code, you can easily access the latest heating oil prices in New York.

The code uses the {dplyr} package to clean and manipulate the data, as well as the {timetk} package to plot the time series.

Read on for an overview of what the code does, followed by the code itself and a time series plot at the end.

pmap and imap Examples in purrr

Steven Sanderson has a multi-parter for us. First up is a look at the pmap() function in R’s purrr library:

The pmap() function in R is part of the purrr library, which is a package designed to make it easier to work with functions that operate on vectors, lists, and other types of data structures.

The pmap() function applies a function to a list of argument vectors, where each element in the list supplies the values for one argument and the i-th values across all elements make up a single function call. The inputs are walked “in parallel”, meaning element by element in lockstep; the calls themselves run one after another rather than concurrently.
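
As a minimal sketch of that idea (the inputs and the function are made up for illustration and are not taken from the linked post), here is pmap() stepping through three argument vectors together, one position at a time:

```r
library(purrr)

# Hypothetical inputs: one list element per argument, all the same length.
args <- list(
  base  = c(2, 3, 4),
  power = c(1, 2, 3),
  shift = c(0, 10, 100)
)

# pmap() matches the list names to the function's argument names and calls
# the function once per position: (2, 1, 0), then (3, 2, 10), then (4, 3, 100).
results <- pmap(args, function(base, power, shift) base^power + shift)

# pmap_dbl() does the same but returns a numeric vector instead of a list.
results_dbl <- pmap_dbl(args, function(base, power, shift) base^power + shift)
```

Naming the list elements after the function's arguments keeps the matching explicit; with an unnamed list, the values would be matched by position instead.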

Next up is the imap() function:

The imap() function is a powerful tool for iterating over a list or a vector while also keeping track of the index or names of the elements. This function applies a given function to each element of a list, along with the name or index of that element, and returns a new list with the results.

The imap() function takes two main arguments: x and .f. x is the list or vector to iterate over, and .f is the function to apply to each element. The .f function takes two arguments: x and i, where x is the value of the element and i is the index or name of the element.
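
A short sketch of both cases, using made-up vectors rather than anything from the linked post: with a named input the second argument is the name, and with an unnamed input it is the position.

```r
library(purrr)

# Hypothetical named vector.
scores <- c(alice = 90, bob = 72, carol = 85)

# With a named vector, the second argument is each element's name.
labels <- imap_chr(scores, function(x, i) paste0(i, ": ", x))

# With an unnamed vector, the second argument is the element's position.
positions <- imap_chr(c("a", "b", "c"), function(x, i) paste0(i, "=", x))
```

The `_chr` suffix asks for a character vector back; plain imap() would return a list.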

Both of these sound a little complex and abstract at first, though as you get more familiar with them, you get to see just how powerful they are.

Content Security Policies and Posit Connect Apps

Theo Roe gets into some web security:

Heads up! We’re about to launch WASP, a Web Application Security Platform. The aim of WASP is to help you manage (well, you guessed it) the security of your Posit Connect application using Content Security Policy and Network Error Logging. More details soon, but if this interests you, please get in touch.


This blog post is aimed at those who are somewhat tech literate but not necessarily a security expert. We’re aiming to introduce the concept of Content Security Policy and teach some of the technical aspects.

This does provide a nice overview of the topic and explains the key “what” and “why” answers.

Generating Nested Time Series Models

Steven Sanderson can’t stop at just one time series:

There are many approaches to modeling time series data in R. One of the types of data that we might come across is a nested time series. This means the data is grouped simply by one or more keys. There are many methods with which to accomplish this task. This will be a quick post, but if you want a longer, more detailed, and quite frankly well-written one, then this is a really good article.

The quick post doesn’t include a lot of commentary but does show the code you’d use for the operation.
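
To give a feel for the grouped-by-key idea, here is a rough base R sketch. It is not the code from the post: the data is synthetic, the `id` key and series names are hypothetical, and a plain linear trend stands in for a real time series model.

```r
# Synthetic monthly data for two hypothetical series, keyed by `id`.
set.seed(123)
sales <- data.frame(
  id    = rep(c("store_a", "store_b"), each = 24),
  month = rep(seq(as.Date("2021-01-01"), by = "month", length.out = 24), times = 2),
  value = c(100 + 2 * (1:24), 50 + 5 * (1:24)) + rnorm(48)
)

# "Nest" the data: one data frame per key.
by_id <- split(sales, sales$id)

# Fit the same simple trend model to every group.
models <- lapply(by_id, function(df) {
  df$t <- seq_len(nrow(df))
  lm(value ~ t, data = df)
})

# Pull out one estimated slope per series.
slopes <- vapply(models, function(m) unname(coef(m)["t"]), numeric(1))
```

The same split-then-fit shape carries over to tidyverse tooling, where the per-key data frames would live in a list column instead of a plain list.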

An Intro to R for the Excel User

Amieroh Abrahams explains some of the benefits of R:

The era of data manipulation and analysis using programming languages has arrived. But it can be tough to find the time and the right resources to fully switch over from more manual, time-consuming solutions, such as Excel. In this blog we will show a comparison between Excel and R to get you started!

When choosing between R and Excel, it is important to understand how both solutions can get you the results you need. However, one can make it an easy, reputable, convenient process, whereas the other can make it an extremely frustrating, time-consuming process prone to human errors.

I like this post as a way of showing current Excel users how R can perform a variety of tasks programmatically which they might do manually, though it probably beats up on Excel too much. There’s a good reason why Excel is the single most important business tool out there and people who are deep into Excel can always break out DAX or M to perform operations.

Calibrating and Plotting a Time Series with healthyR.ts

Steven Sanderson builds a plot:

In time series analysis, it is common to split the data into training and testing sets to evaluate the accuracy of a model. However, it is important to ensure that the model is calibrated on the training set before evaluating its performance on the testing set. The {healthyR.ts} library provides a function called calibrate_and_plot() that simplifies this process.

Click through for the function’s input parameters and an example of how to use it.
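
The general split, fit, and evaluate pattern from the excerpt can be sketched in base R. This does not use calibrate_and_plot() or reproduce the post's example; the series is synthetic and a linear model is only a placeholder for whatever forecasting model you would actually use.

```r
# Synthetic series with a linear trend plus noise; not data from the post.
set.seed(42)
n <- 60
series <- data.frame(t = seq_len(n), value = 10 + 0.5 * seq_len(n) + rnorm(n))

# Time series splits preserve order: the last 12 observations are the test set.
train <- series[series$t <= n - 12, ]
test  <- series[series$t >  n - 12, ]

# Fit on the training window only.
fit <- lm(value ~ t, data = train)

# Score the held-out window with root mean squared error.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$value - pred)^2))
```

Splitting by time rather than at random matters here, since a random split would let the model see observations from the period it is later scored on.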

Matrix Multiplication in R with DuckDB and SQLite

Karsten Weinert compares two databases:

On my laptop with 16 GB RAM, I would like to perform a matrix-vector multiplication with a sparse matrix of around 10 million columns and 2500 rows. The matrix has only about 2% non-zero entries, but that is still 500 million numbers plus the column/row information, too large to work with comfortably in-memory.

A while ago, I tried using sqlite for this task. It kind of worked, but was too slow to be useful. This weekend, I revisited the problem and tried using duckdb.

Read on for the results. I’ve heard enough positives about DuckDB over the past few weeks that it makes me want to try it out. H/T R-Bloggers.
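
For intuition on why a database can do this at all: if the sparse matrix is stored as (row, column, value) triples, a matrix-vector product becomes a join followed by a grouped sum. The sketch below shows that shape on a tiny made-up matrix, with base R data frames standing in for database tables. The table layout and the names `i`, `j`, and `value` are my assumptions, not necessarily what the post uses.

```r
# Sparse 3 x 4 matrix A as (i, j, value) triples; zeros are not stored.
A <- data.frame(
  i     = c(1, 1, 2, 3, 3),
  j     = c(1, 3, 2, 1, 4),
  value = c(2, 1, 3, 4, 5)
)

# Vector x as (j, x) pairs.
x <- data.frame(j = c(1, 2, 3, 4), x = c(1, 2, 3, 4))

# Join on the column index, multiply, then sum within each row.
# Roughly: SELECT i, SUM(value * x) FROM A JOIN x USING (j) GROUP BY i
joined <- merge(A, x, by = "j")
joined$product <- joined$value * joined$x
y <- aggregate(product ~ i, data = joined, FUN = sum)
```

Only the stored non-zero entries take part in the join, which is what makes the approach plausible for a matrix that is 98% zeros.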

tidyAML Now Available in CRAN

Steven Sanderson has a package make the big-time:

I’m excited to announce that the R package {tidyAML} is now officially available on CRAN! This package is designed to make it easy for users to perform automated machine learning (AutoML) using the tidymodels ecosystem. With a simple and intuitive interface, tidyAML allows users to quickly generate high-quality machine learning models without worrying about the underlying details.

Read on to learn more about this package, as well as the broader healthyverse series of packages.

Visualizing Moving Averages in R with healthyR.ts

Steven Sanderson shows off a useful R library:

Are you interested in visualizing time series data in a clear and concise way? The R package {healthyR.ts} provides a variety of tools for time series analysis and visualization, including the ts_ma_plot() function.

The ts_ma_plot() function is designed to help you quickly and easily create moving average plots for time series data. This function takes several arguments, including the data you want to visualize, the date column from your data, the value column from your data, and the frequency of the aggregation.

Read on to learn more about this plot and see an example of it in action.
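
If you just want to see what a moving average does to a series before reaching for the package, here is a small base R sketch. It does not call ts_ma_plot(); the values are synthetic and the three-period window is an arbitrary choice for illustration.

```r
# Synthetic values; not data from the post.
values <- c(10, 12, 14, 13, 15, 17, 16, 18, 20, 19, 21, 23)

# Trailing 3-period moving average: each point is the mean of itself and the
# two observations before it. The first two positions have no full window,
# so they come back as NA.
ma3 <- as.numeric(stats::filter(values, rep(1 / 3, 3), sides = 1))
```

Plotting `values` and `ma3` on the same axes shows the smoothing effect: the averaged line follows the upward trend while damping the period-to-period wobble.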
