Category: R

Interpolating Missing Values in R

Steven Sanderson fills in the blanks:

Interpolation is a method of estimating missing values based on the surrounding known values. It’s particularly useful when dealing with time series data or any dataset where the missing values are not randomly distributed.

There are various interpolation methods, but we’ll focus on linear interpolation in this article. Linear interpolation assumes a straight line between two known points and estimates the missing values along that line.

Read on to see how you can perform linear interpolation in R.
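As a rough illustration of the idea (not necessarily the approach taken in the linked post), here is a minimal base R sketch using `approx()`, which interpolates linearly by default. The series `y` is made-up data.

```r
# Hypothetical series with gaps (NA) at positions 3, 4 and 7
y <- c(2, 4, NA, NA, 10, 12, NA, 16)
x <- seq_along(y)

known <- !is.na(y)

# approx() draws straight lines between the known points and
# evaluates them at the positions requested in xout
filled <- y
filled[!known] <- approx(x[known], y[known], xout = x[!known])$y

print(filled)
```

With its default settings, `approx()` returns `NA` for positions outside the range of the known points, so leading or trailing gaps would remain unfilled.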

Generating Effect Plots in Python and R

Michael Mayer builds some effect plots:

The plots show different types of feature effects relevant in modeling:

  • Average observed: Descriptive effect (also interesting without model).
  • Average predicted: Combined effect of all features. Also called “M Plot” (Apley 2020).
  • Partial dependence: Effect of one feature, keeping other feature values constant (Friedman 2001).
  • Number of observations or sum of case weights: Feature value distribution.
  • R only: Accumulated local effects, an alternative to partial dependence (Apley 2020).

Click through to see how they both work.
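To make the first three effect types concrete, here is a hand-rolled base R sketch on `mtcars`. It does not use the package from the linked post; the helper `partial_dependence()` and the choice of four bins are assumptions made purely for illustration.

```r
# Simple model used only for illustration
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: fix one feature at each grid value for every row,
# predict, and average the predictions (the partial dependence idea)
partial_dependence <- function(model, data, feature, grid) {
  vapply(grid, function(v) {
    data[[feature]] <- v
    mean(predict(model, newdata = data))
  }, numeric(1))
}

wt_grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)
pd <- partial_dependence(fit, mtcars, "wt", wt_grid)

# Average observed and average predicted response per bin of wt
wt_bins <- cut(mtcars$wt, breaks = 4)
avg_observed <- tapply(mtcars$mpg, wt_bins, mean)
avg_predicted <- tapply(predict(fit), wt_bins, mean)

print(data.frame(wt = wt_grid, partial_dependence = pd))
print(cbind(avg_observed, avg_predicted))
```

For a linear model without interactions, the partial dependence curve is a straight line whose slope equals the fitted coefficient of the feature, which makes this a convenient sanity check.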

Using the subset() Function in R

Steven Sanderson plays duck-duck-goose with the data:

Data manipulation is a cornerstone of R programming, and selecting specific columns from data frames is one of the most common tasks analysts face. While modern tidyverse packages offer elegant solutions, Base R’s subset() function remains a powerful and efficient tool that every R programmer should master.

This comprehensive guide will walk you through everything you need to know about using subset() to manage columns in your data frames, from basic operations to advanced techniques.

Click through for a description of the function and examples of it in action.
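For a quick feel of column selection with `subset()`, here is a small sketch on the built-in `mtcars` data. The variable names are arbitrary.

```r
# Keep only the named columns
cars_small <- subset(mtcars, select = c(mpg, cyl, wt))

# Drop a column by negating it
no_carb <- subset(mtcars, select = -carb)

# Combine a row filter with a column range (mpg through disp)
efficient <- subset(mtcars, mpg > 25, select = mpg:disp)

print(names(cars_small))
print(head(efficient))
```

Because `subset()` evaluates column names non-standardly, it is most comfortable in interactive work; inside functions, indexing with `[` tends to be the safer choice.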

Describing R Models with Tilde (~)

Steven Sanderson describes a relationship:

The tilde operator (~) in R is more than just a symbol – it’s a powerful tool that forms the backbone of statistical modeling and formula creation. Whether you’re performing regression analysis, creating statistical models, or working with data visualization, understanding the tilde operator is crucial for effective R programming.

Read on to see how it works and several examples along the way.
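A brief sketch of the tilde in action, using `mtcars` as stand-in data: the left-hand side names the response and the right-hand side names the predictors or grouping variables.

```r
# A formula is an unevaluated description: response ~ predictors
f <- mpg ~ wt + hp
print(class(f))
print(all.vars(f))

# The same formula object can be handed to a modeling function
fit <- lm(f, data = mtcars)
print(coef(fit))

# Formulas also describe groupings: mean mpg for each value of cyl
by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(by_cyl)
```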

The NOT IN Operator in R

Steven Sanderson does not want these things:

In R programming, data filtering and manipulation are needed skills for any developer. One of the most useful operations you’ll frequently encounter is checking whether elements are NOT present in a given set. While R doesn’t have a built-in “NOT IN” operator like SQL, we can easily create and use this functionality. This comprehensive guide will show you how to implement and use the “NOT IN” operator effectively in R.

Read on for examples of how to use %in% and its negation, !(x %in% y).
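A minimal sketch of both styles follows. The operator name `%not_in%` is a hypothetical choice for this example and may differ from the name used in the linked post.

```r
# Hypothetical helper operator built by negating %in%
`%not_in%` <- Negate(`%in%`)

x <- c("a", "b", "c", "d")
drop <- c("b", "d")

# Explicit negation
kept_negated <- x[!(x %in% drop)]

# Same result via the custom operator
kept_operator <- x[x %not_in% drop]

print(kept_negated)
print(kept_operator)
```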

Using the OR Operator in R

Steven Sanderson asks, ketchup or catsup:

The OR operator is a fundamental component in R programming that enables you to evaluate multiple conditions simultaneously. This guide will walk you through everything from basic syntax to advanced applications, helping you master logical operations in R for effective data manipulation and analysis.

Click through for several examples.
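As a short sketch of the two forms: `|` works element by element on vectors, while `||` compares single values and short-circuits. The vectors below are made-up data.

```r
temp <- c(12, 25, 31, 18)
rain <- c(TRUE, FALSE, FALSE, TRUE)

# Element-wise OR: one result per position
stay_in <- temp > 30 | rain
print(stay_in)

# Scalar OR with short-circuiting: the right side is skipped
# once the left side is TRUE
x <- NULL
is_empty <- is.null(x) || length(x) == 0
print(is_empty)

# Element-wise OR as a row filter
small_or_efficient <- mtcars[mtcars$cyl == 4 | mtcars$mpg > 25, ]
print(nrow(small_or_efficient))
```

In recent versions of R, passing a vector longer than one to `||` is an error, so `|` is the form to use for filtering.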

Monitoring R Models in Production with Vetiver

Myles Mitchell continues a series on Vetiver:

In those blogs, we introduced the {vetiver} package and its use as a tool for streamlined MLOps. Using the {palmerpenguins} dataset as an example, we outlined the steps of training a model using {tidymodels} then converting this into a {vetiver} model. We then demonstrated the steps of versioning our trained model and deploying it into production.

Getting your first model into production is great! But it’s really only the beginning, as you will now have to carefully monitor it over time to ensure that it continues to perform as expected on the latest data. Thankfully, {vetiver} comes with a suite of functions for this exact purpose!

Click through for the full story.

RandomWalker 0.2.0 Release

Steven Sanderson makes an announcement:

In the ever-evolving landscape of R programming, packages continually refine their capabilities to meet the growing demands of data analysts and researchers. Today, we’re excited to announce the release of RandomWalker version 0.2.0, a minor update that brings significant enhancements to time series analysis and random walk simulations.

RandomWalker has been a go-to package for R users in finance, economics, and other fields dealing with time-dependent data. This latest release introduces new functions and improvements that promise to streamline workflows and provide deeper insights into time series data.

Read on to see what has changed.

Supply Chain Analysis in R via planr

Matt Dancho shows off an R package:

Supply chain management is all about balancing supply and demand to ensure that inventory levels are optimized. Overestimating demand leads to excess stock, while underestimating it causes shortages. Accurate inventory projections allow businesses to plan ahead, make data-driven decisions, and avoid costly errors like over-buying inventory or getting into a stock-outage and having no inventory to meet demand.

Read on to learn more about the package and how it works. H/T R-Bloggers.
