Press "Enter" to skip to content

Category: R

Trying the sample() Function in R

Steven Sanderson gathers a sample:

Sampling is a fundamental technique in data analysis and statistical modeling. It allows us to draw meaningful insights and make inferences about a larger population based on a representative subset. In the world of R programming, the sample() function stands as a versatile tool that enables us to create random samples efficiently. In this post, we will explore the sample() function and its various applications through a series of plain English examples.

Click through for those examples.

Comments closed

Creating Crosstabs in R

Steven Sanderson shows off the ability to perform crosstabs in R:

As a programmer, you’re constantly faced with the task of organizing and analyzing data. One powerful tool in your R arsenal is the xtabs() function. In this blog post, we’ll explore the versatility and simplicity of xtabs() for aggregating data. We’ll use the mtcars dataset and the healthyR.data::healthyR_data dataset to illustrate its functionality. Get ready to dive into the world of data aggregation with xtabs()!

Click through for two examples of it creating a crosstab (or, in SSRS/Power BI terms, a matrix) from your data.

Comments closed

Calculating Time Series Differences

Steven Sanderson notices the difference:

The diff() function in R calculates the differences between consecutive elements in a vector or a time series. It takes a single argument, which is the input vector, and returns a new vector with the differences. This function is particularly useful for analyzing the rate of change, identifying patterns, and detecting anomalies in your data. 

Read on to see how you can use it, as well as some examples of usage.

Comments closed

Simplifying Model Formulas in R

Steven Sanderson shows off a command:

As a programmer, you may come across various scenarios where you need to create complex model formulas in R. However, constructing these formulas can often be challenging and time-consuming. This is where the ‘reformulate()’ function comes to the rescue! In this blog post, we will explore the purpose and usage of the reformulate() function in R, and provide you with simple examples to help you grasp its power.

This is where I want to put on glasses so I can push the glasses up my nose and say “Well, ackshually, the examples here are pre-formulating instead of re-formulating…”

Silliness on my part aside, click through to see how the function works and how it can work to simplify your regression analyses.

Comments closed

Listing Files by Date in R

Steven Sanderson shows off a built-in function:

In R, the file.info() function is a useful tool for retrieving file information, such as file attributes and metadata. It allows programmers to gather details about files, including their size, permissions, and timestamps. In this post, we will explore the file.info() function and demonstrate how it can be used to list files by date.

Read on for more information. This function is a lot more powerful than simply running ls or dir (without any flags) in a directory.

Comments closed

Shiny Apps and Fullscreen Behavior

Tim Brock gives us a demo:

Browsers have been implementing variations on a JavaScript fullscreen API for over a decade. Unfortunately, for much of that time the APIs varied across browsers. This made actually using it in production somewhat cumbersome.

Finally, with the release of Safari 16.4 in March of this year, the latest versions of all major desktop browsers now support a single, standardized interface. Legacy versions of Safari for desktop are still in use and there’s still no support at all for the Fullscreen API on iPhones; so while you can cover most users with the standardized API, it should still be for progressive enhancement and not as a fundamental requirement for operation of an application.

Click through for the script.

Comments closed

Using SHAP to Gauge Geographic Effects in R or Python

Michael Mayer runs an analysis:

This is the next article in our series “Lost in Translation between R and Python”. The aim of this series is to provide high-quality R and Python code to achieve some non-trivial tasks. If you are to learn R, check out the R tab below. Similarly, if you are to learn Python, the Python tab will be your friend.

This post is heavily based on the new {shapviz} vignette.

I appreciate the effort to include both R and Python code in this analysis, and recommend you peruse both sets of code listings.

Comments closed

Unpivoting Data in R

Steven Sanderson shows off a function with a slightly confusing name:

In the world of data analysis and manipulation, tidying and reshaping data is often an essential step. R’s tidyr library provides powerful tools to efficiently transform and reshape data. One such function is pivot_longer(). In this blog post, we’ll explore how pivot_longer() works and demonstrate its usage through several examples. By the end, you’ll have a solid understanding of how to use this function to make your data more manageable and insightful.

The slight confusion is that this function is really unpivoting rather than pivoting. In R terminology, a pivot takes you from longer data to wider data: it uses some function to convert data from N rows * M columns to n*m, where N > n and M < m. Unpivoting does the opposite and I personally like that terminology better than “pivot wider” or “pivot longer.”

Comments closed

Order, Sort, and Rank in R

Steven Sanderson compares three functions in R:

In the realm of data analysis and programming, organizing and sorting data efficiently is crucial. In R, a programming language renowned for its data manipulation capabilities, we have three powerful functions at our disposal: order()sort(), and rank(). In this blog post, we will delve into the intricacies of these functions, explore their applications, and understand their parameters. These R functions are all used to sort data, however, they each have different purposes and use different methods to sort the data.

Coming at R from a SQL background, the idea of order() and sort() behaving so differently was strange at first, especially because the verbs are synonymous and it’s the noun form of “order” which we get back, rather than performing the ordering.

Comments closed

Date Handling in Excel and R

Amieroh Abrahams continues a series comparing Excel and R:

Here we will explore the various ways to handle dates in Excel and R. Dates are a crucial part of data analysis and are used in various fields such as biology, healthcare, and social sciences. However, working with dates can be challenging, especially when dealing with large datasets or multiple formats.

In Excel, there are several functions available to handle dates, such as DATEYEARMONTH, and DAY. Excel also provides various formatting options to customise the display of dates. However, Excel has some limitations when it comes to complex date calculations, and it can be time-consuming to work with dates in large datasets.

In contrast, R has a robust set of tools for handling dates, including the {lubridate} package, which simplifies the manipulation of dates and times. Additionally, R allows for efficient handling of dates in large datasets, making it a powerful tool for time-series analysis. Whether you are working with dates in Excel or R, this blog will provide you with the basic tools and techniques to handle dates efficiently and accurately. So let’s get started!

Read on for the comparison.

Comments closed