Category: R

Using complete.cases in R

Steven Sanderson has no time for missing data:

Data analysis in R often involves dealing with missing values, which can significantly impact the quality of your results. The complete.cases function in R is an essential tool for handling missing data effectively. This comprehensive guide will walk you through everything you need to know about using complete.cases in R, from basic concepts to advanced applications.

Using complete.cases to find observations with missing values is great. Using it to eliminate observations with missing values can sometimes be helpful, depending on just how many missing values you have.
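As a rough sketch of both uses, the snippet below flags incomplete rows and then checks how much data would be lost by dropping them. The data frame and its column names are made up for illustration and do not come from the linked post.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  id    = 1:5,
  score = c(90, NA, 75, 88, NA),
  group = c("a", "b", NA, "a", "b")
)

# TRUE for rows with no missing value in any column
ok <- complete.cases(df)

# Find the observations that contain at least one NA
incomplete_rows <- df[!ok, ]

# Keep only the fully observed rows
complete_rows <- df[ok, ]

# Share of rows that dropping incomplete cases would discard
share_lost <- mean(!ok)
```

Checking share_lost before filtering is one way to judge whether elimination is reasonable for a given dataset.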

Using na.rm in R

Steven Sanderson handles missing information in the best way possible—by ignoring it:

Missing values are a common challenge in data analysis, and R provides robust tools for handling them. The na.rm parameter is one of R’s most essential features for managing NA values in your data. This comprehensive guide will walk you through everything you need to know about using na.rm effectively in your R programming journey.

Read on for several examples of how na.rm works.
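For a quick sense of the behavior, here is a minimal sketch with a made-up vector (not taken from the linked post). Base R summary functions such as mean() and sum() return NA when any input is NA unless na.rm = TRUE is set.

```r
# Hypothetical vector with one missing value
x <- c(4, 8, NA, 12)

# Without na.rm, the NA propagates to the result
mean_with_na <- mean(x)

# With na.rm = TRUE, NA values are dropped before computing
mean_no_na <- mean(x, na.rm = TRUE)
sum_no_na  <- sum(x, na.rm = TRUE)
```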

The Posit Package Manager and diffify

Colin Gillespie and Myles Mitchell share some updates:

The latest release of Posit Package Manager introduces several enhancements, including:

  • Python Git Builders: Build Python packages (wheels) directly from Git.
  • Blocklists: Easily block specific packages or versions.
  • Improved Documentation: Clearer and more accessible information.

Read on for one more big change to Posit Package Manager, as well as how diffify fits into the mix.

Finding the Column with Max Value in R

Steven Sanderson finds the column with the maximum value for each row in an R data frame:

Finding the column with the maximum value for each row is a useful operation when you want to identify the dominant category, highest measurement, or most significant feature in your dataset. This can provide valuable insights and help in decision-making processes.

R offers several ways to accomplish this task, ranging from base R functions to powerful packages like dplyr and data.table. We’ll explore each approach in detail, providing code examples and explanations along the way.

Click through for several examples.
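One possible base R approach, sketched here with a made-up data frame, uses max.col() to get the position of the largest value in each row and then maps that position to a column name. This is an illustration only and is not necessarily one of the approaches in the linked post.

```r
# Hypothetical numeric data frame, for illustration only
df <- data.frame(
  a = c(1, 9, 3),
  b = c(7, 2, 3),
  c = c(4, 5, 8)
)

# Position of the maximum in each row; ties go to the first column
idx <- max.col(as.matrix(df), ties.method = "first")

# Translate positions into column names
max_col_name <- names(df)[idx]

# Equivalent row-wise alternative using which.max()
idx_apply <- unname(apply(as.matrix(df), 1, which.max))
```

Note that max.col() breaks ties at random by default, so setting ties.method explicitly keeps the result reproducible.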

Comparing Positron to RStudio

Theo Roe performs a product comparison:

Positron is the new beta Data Science IDE from Posit. Though Posit have stressed that maintenance and development of RStudio will continue, I want to use this blog to explore if Positron is worth the switch. I’m coming at this from the R development side but there will of course be some nuances from other languages in use within Positron that require some thought.

Read on for Theo’s perspective. Knowing that it’s using the same underlying framework as Visual Studio Code, I kind of wish this were an extension for VS Code rather than a separate app.

Finding Columns in R with No Data

Steven Sanderson looks for the missing columns:

When working with real-world datasets in R, it’s common to encounter missing values, often represented as NA. These missing values can impact the quality and reliability of your analyses. One important step in data preprocessing is identifying columns that consist entirely of missing values. By detecting these columns, you can decide whether to remove them or take appropriate action based on your specific use case. In this article, we’ll explore how to find columns with all missing values using base R functions.

Click through to see how you can do this. It’s not quite as simple as finding rows with missing values (complete.cases()), but it’s not much of an ordeal, either.
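A minimal base R sketch of the idea, assuming a made-up data frame: count the NA values per column and compare that count to the number of rows. The linked post may take a different route.

```r
# Hypothetical data frame in which column y is entirely missing
df <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, NA, NA),
  z = c("a", "b", NA)
)

# A column is all-NA when its NA count equals the row count
all_na <- colSums(is.na(df)) == nrow(df)

# Names of the columns with no data
empty_cols <- names(df)[all_na]

# Optionally drop them, keeping the result a data frame
df_clean <- df[, !all_na, drop = FALSE]
```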

Interpolating Missing Values in R

Steven Sanderson fills in the blanks:

Interpolation is a method of estimating missing values based on the surrounding known values. It’s particularly useful when dealing with time series data or any dataset where the missing values are not randomly distributed.

There are various interpolation methods, but we’ll focus on linear interpolation in this article. Linear interpolation assumes a straight line between two known points and estimates the missing values along that line.

Read on to see how you can perform linear interpolation in R.
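As one way to picture linear interpolation, the sketch below uses base R's approx() on a made-up series. The variable names are hypothetical, and the linked post may use a different function or package.

```r
# Hypothetical evenly spaced series with gaps
time_idx <- 1:6
y <- c(10, NA, NA, 16, NA, 20)

# Interpolate only from the observed points
known <- !is.na(y)
filled <- approx(
  x    = time_idx[known],
  y    = y[known],
  xout = time_idx        # positions at which to estimate values
)$y
```

Each missing value lands on the straight line between its nearest known neighbors, so the gap between 10 and 16 is filled with 12 and 14. With its default settings, approx() returns NA for positions outside the range of the known points, so leading or trailing gaps are not filled.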

Generating Effect Plots in Python and R

Michael Mayer builds some effect plots:

The plots show different types of feature effects relevant in modeling:

  • Average observed: Descriptive effect (also interesting without model).
  • Average predicted: Combined effect of all features. Also called “M Plot” (Apley 2020).
  • Partial dependence: Effect of one feature, keeping other feature values constant (Friedman 2001).
  • Number of observations or sum of case weights: Feature value distribution.
  • R only: Accumulated local effects, an alternative to partial dependence (Apley 2020).

Click through to see how they both work.

Using the subset() Function in R

Steven Sanderson plays duck-duck-goose with the data:

Data manipulation is a cornerstone of R programming, and selecting specific columns from data frames is one of the most common tasks analysts face. While modern tidyverse packages offer elegant solutions, Base R’s subset() function remains a powerful and efficient tool that every R programmer should master.

This comprehensive guide will walk you through everything you need to know about using subset() to manage columns in your data frames, from basic operations to advanced techniques.

Click through for a description of the function and examples of it in action.
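For orientation, here is a small sketch of column selection with subset(), using a made-up data frame that is not from the linked post. The select argument accepts unquoted column names, and a leading minus sign excludes a column.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  age  = c(31, 45, 27),
  city = c("Oslo", "Lima", "Pune")
)

# Keep specific columns
name_age <- subset(df, select = c(name, age))

# Drop a column with a minus sign
no_city <- subset(df, select = -city)

# Filter rows and select columns in one call
over_30 <- subset(df, age > 30, select = c(name, age))
```

Because subset() evaluates column names non-standardly, R's own documentation suggests it mainly for interactive work and recommends indexing with [ inside functions.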

Describing R Models with Tilde (~)

Steven Sanderson describes a relationship:

The tilde operator (~) in R is more than just a symbol – it’s a powerful tool that forms the backbone of statistical modeling and formula creation. Whether you’re performing regression analysis, creating statistical models, or working with data visualization, understanding the tilde operator is crucial for effective R programming.

Read on to see how it works and several examples along the way.
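As a brief illustration, using the built-in mtcars dataset rather than anything from the linked post, a formula puts the response on the left of the tilde and the predictors on the right. The formula is an ordinary R object that can be stored and inspected before being passed to a modeling function.

```r
# A formula: mpg is modeled as a function of wt and hp
f <- mpg ~ wt + hp

# Formulas are objects of class "formula"
f_class <- class(f)

# Variables referenced by the formula, response first
f_vars <- all.vars(f)

# Pass the formula to a modeling function such as lm()
fit <- lm(f, data = mtcars)
coef_names <- names(coef(fit))
```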