R – Page 6 – Curated SQL

Monitoring R Models in Production with Vetiver

Published 2024-10-31 by Kevin Feasel

Myles Mitchell continues a series on Vetiver:

In those blogs, we introduced the {vetiver} package and its use as a tool for streamlined MLOps. Using the {palmerpenguins} dataset as an example, we outlined the steps of training a model using {tidymodels} then converting this into a {vetiver} model. We then demonstrated the steps of versioning our trained model and deploying it into production.

Getting your first model into production is great! But it’s really only the beginning, as you will now have to carefully monitor it over time to ensure that it continues to perform as expected on the latest data. Thankfully, {vetiver} comes with a suite of functions for this exact purpose!

Click through for the full story.

Creating Lists in R

Published 2024-10-30 by Kevin Feasel

Steven Sanderson makes a list but doesn’t check it twice:

Lists are fundamental data structures in R programming that allow you to store multiple elements of different types in a single object. This comprehensive guide will walk you through everything you need to know about creating and working with lists in R.

Click through for quite a few examples.
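As a quick taste of the topic, here is a minimal base R sketch of building and indexing a list. The `person` object and its fields are made up for illustration and do not come from Steven's post.

```r
# A list can hold elements of different types, including other lists.
# (All names and values here are hypothetical.)
person <- list(
  name = "Ada",
  scores = c(90, 85, 77),
  active = TRUE,
  address = list(city = "London", zip = "N1")
)

# $ and [[ ]] extract a single element; [ ] returns a sub-list.
person$name
person[["scores"]][2]
person["active"]

# Elements can be added or removed by assignment.
person$language <- "R"
person$active <- NULL

# Inspect the resulting structure.
str(person)
```

The distinction between `[[ ]]` (the element itself) and `[ ]` (a list containing the element) is the part that tends to trip people up.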

RandomWalker 0.2.0 Release

Published 2024-10-24 by Kevin Feasel

Steven Sanderson makes an announcement:

In the ever-evolving landscape of R programming, packages continually refine their capabilities to meet the growing demands of data analysts and researchers. Today, we’re excited to announce the release of RandomWalker version 0.2.0, a minor update that brings significant enhancements to time series analysis and random walk simulations.

RandomWalker has been a go-to package for R users in finance, economics, and other fields dealing with time-dependent data. This latest release introduces new functions and improvements that promise to streamline workflows and provide deeper insights into time series data.

Read on to see what has changed.

Supply Chain Analysis in R via planr

Published 2024-10-22 by Kevin Feasel

Matt Dancho shows off an R package:

Supply chain management is all about balancing supply and demand to ensure that inventory levels are optimized. Overestimating demand leads to excess stock, while underestimating it causes shortages. Accurate inventory projections allow businesses to plan ahead, make data-driven decisions, and avoid costly errors like over-buying inventory or getting into a stock-outage and having no inventory to meet demand.

Read on to learn more about the package and how it works. H/T R-Bloggers.

Looping through Column Names in R

Published 2024-10-18 by Kevin Feasel

Steven Sanderson builds a loop:

Looping through column names in R is a crucial technique for data manipulation, especially for beginners. This article will guide you through various methods to loop through column names in R, providing practical examples and insights to enhance your data analysis skills.

Read on for examples with for loops, the dynamic duo of lapply() and sapply(), and the map() function in the purrr library.
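A rough sketch of the base R variants, using a small made-up data frame rather than anything from the post. The purrr version is left out here to keep the example dependency-free.

```r
# Hypothetical example data.
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5), c = c("x", "y", "z"))

# for loop: iterate over the column names and look each column up by name.
for (col in names(df)) {
  cat(col, "has class", class(df[[col]]), "\n")
}

# sapply(): same idea, but collects the results into a named vector.
col_classes <- sapply(names(df), function(col) class(df[[col]]))

# lapply(): returns a list; here, the mean of each numeric column.
numeric_means <- lapply(df[sapply(df, is.numeric)], mean)
```

The main design choice is whether you want side effects (the `for` loop) or a collected result (`sapply()` for a vector, `lapply()` for a list).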

Adding a Suffix to a Column Name in R

Published 2024-10-15 by Kevin Feasel

Steven Sanderson renames a column or three:

When working with data frames in R, you might find yourself needing to modify column names to include additional information, such as a suffix. This can be particularly useful when merging datasets or when you want to ensure that column names are unique and descriptive.

Read on to see how.
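One way to do this in base R is with `paste0()` on `names()`. This is only a sketch with a made-up data frame and made-up suffixes; Steven's post may take a different route.

```r
# Hypothetical example data.
df <- data.frame(id = 1:2, height = c(1.7, 1.8), weight = c(65, 80))

# Add a suffix to every column name.
all_suffixed <- df
names(all_suffixed) <- paste0(names(all_suffixed), "_2024")

# Add a suffix to selected columns only, leaving the key column alone.
some_suffixed <- df
target <- names(some_suffixed) %in% c("height", "weight")
names(some_suffixed)[target] <- paste0(names(some_suffixed)[target], "_raw")

names(all_suffixed)
names(some_suffixed)
```

Leaving the key column untouched is usually what you want before a merge, since the join column needs to keep the same name on both sides.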

Working with List Columns in R

Published 2024-10-14 by Kevin Feasel

Sebastian Sauer makes a list and checks it twice:

In this post, I want to show you how to work with list columns in R. List columns are a powerful feature of the tidyverse that allow you to store multiple objects in a single column of a data frame. This can be useful when you have a list of objects that you want to keep together, such as a list of data frames or a list of models.

Click through for a demo.
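To give a feel for the idea without any tidyverse dependency, here is a sketch that stores data frames and fitted models as list columns of a plain base R data frame. It uses the built-in `mtcars` data; the `nested` object and its column names are my own choices, not Sebastian's.

```r
# Split the built-in mtcars data into one data frame per cylinder count.
groups <- split(mtcars, mtcars$cyl)

# One row per group, then attach the pieces as a list column.
nested <- data.frame(cyl = as.integer(names(groups)))
nested$data <- unname(groups)

# A second list column: one linear model per group.
nested$model <- lapply(nested$data, function(d) lm(mpg ~ wt, data = d))

# Pull a single number back out of each model into an ordinary column.
nested$slope <- sapply(nested$model, function(m) coef(m)[["wt"]])

nested[, c("cyl", "slope")]
```

The payoff is that each group's data, model, and summary stay on the same row, so they cannot drift out of sync.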

Combining Data Frames with Differing Columns in R

Published 2024-10-11 by Kevin Feasel

Steven Sanderson does a bit of merging:

Combining data frames is a fundamental task in data analysis, especially when dealing with datasets that have different structures. In R, there are several ways to achieve this, using base R functions, the dplyr package, and the data.table package. This guide will walk you through each method, providing examples and explanations suitable for beginner R programmers. This article will explore three primary methods in R: base R functions, dplyr, and data.table. Each method has its advantages, and understanding them will enhance your data manipulation skills.

There are quite a few examples here, depending on whether you intend to join the datasets or perform a set operation such as union or intersect.
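A minimal base R sketch of both flavors, assuming two small made-up data frames. The `bind_fill()` helper is hypothetical (it is not a base R or package function) and simply pads missing columns with `NA` before stacking.

```r
# Hypothetical example data with only the id column in common.
df1 <- data.frame(id = 1:2, name = c("a", "b"))
df2 <- data.frame(id = 3:4, score = c(9.5, 7.0))

# Stack rows: add any missing columns as NA, align the order, then rbind().
bind_fill <- function(x, y) {
  all_cols <- union(names(x), names(y))
  for (col in setdiff(all_cols, names(x))) x[[col]] <- NA
  for (col in setdiff(all_cols, names(y))) y[[col]] <- NA
  rbind(x[all_cols], y[all_cols])
}
combined <- bind_fill(df1, df2)

# Join instead of stack: a full outer join on the shared key.
merged <- merge(df1, df2, by = "id", all = TRUE)

combined
merged
```

With these inputs both approaches yield four rows, but they answer different questions: stacking treats the inputs as more observations, while the join treats them as more attributes of the same keys.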

Smoothing Functions in R

Published 2024-10-11 by Kevin Feasel

Ivan Svetunkov puts on the forecasting hat:

I have been asked recently by a colleague of mine how to extract the variance from a model estimated using adam() function from the smooth package in R. The problem was that that person started reading the source code of the forecast.adam() and got lost between the lines (this happens to me as well sometimes). Well, there is an easier solution, and in this post I want to summarise several methods that I have implemented in the smooth package for forecasting functions. In this post I will focus on the adam() function, although all of them work for es() and msarima() as well, and some of them work for other functions (at least as for now, for smooth v4.1.0). Also, some of them are mentioned in the Cheat sheet for adam() function of my monograph (available online).

Read on to learn more. H/T R-Bloggers.

Reading Parquet Files in R with nanoparquet

Published 2024-10-10 by Kevin Feasel

Stephen Turner reads some data:

In these slides I also learned about the nanoparquet package — a zero dependency package for reading and writing parquet files in R. Besides all the benefits noted above, parquet is much faster to read and write. And, as opposed to saving as .rds, parquet can easily be passed back and forth between R, Python, and other frameworks.

Let’s take a look at how reading and writing parquet files compares with CSV, either with base R or readr.

Stephen shows one of the best-case scenarios for Parquet: lots of data (100 million rows), relatively few columns, no long strings, etc. That leads to a massive improvement over using CSVs, even if you ignore the metadata and formatting benefits. I wouldn’t expect the benefits to be nearly as significant with wide text columns and very little value overlap, but that’s also pretty uncommon for the type of dataset we’re analyzing in R.
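For a sense of what the round trip looks like, here is a small sketch assuming the nanoparquet package is installed. The data frame is made up and far smaller than Stephen's benchmark, so it says nothing about performance.

```r
# Hypothetical example data; the real benchmark used 100 million rows.
df <- data.frame(
  id = 1:5,
  value = c(0.1, 0.2, 0.3, 0.4, 0.5),
  label = c("a", "b", "c", "d", "e")
)

# Only run if nanoparquet is available (an assumption about your setup).
if (requireNamespace("nanoparquet", quietly = TRUE)) {
  path <- tempfile(fileext = ".parquet")

  # Write the data frame to a Parquet file, then read it back.
  nanoparquet::write_parquet(df, path)
  roundtrip <- nanoparquet::read_parquet(path)

  str(roundtrip)
}
```

Because Parquet stores column types in the file, there is no type guessing on read, which is one of the formatting benefits over CSV mentioned above.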

