
Category: R

Inflation in Medieval China

Richard Vale digs into a dataset:

In this post, I would like to draw attention to a very interesting data set collected by Guan, Palma and Wu as part of the replication package for their paper The rise and fall of paper money in Yuan China, 1260-1368. The paper describes inflation, money and prices during the Yuan Dynasty era in China.

First, a little historical background.

Read on for the analysis. H/T R-Bloggers.


Building Flowcharts in R

Pau Satorra makes a chart:

Fortunately, there are several packages in R for drawing flowcharts using different approaches. The problem is that the programming is generally quite complex, and the numbers have to be entered manually or parameterized beforehand. These flowcharts can have reproducibility problems because if the data changes, we have to manually change the parameters again.

To make our lives easier, there’s a new {flowchart} package that uses the tidyverse workflow, which allows you to create many different types of flowcharts in just a few steps.

Read on to learn more about the package. From the look of the final product, I originally thought it was built on mermaid.js, but a quick code review has disabused me of the notion. H/T R-Bloggers.


Porting an R Shiny App to Observable Framework

Tim Brock makes a change:

If you’re interested in interactive data visualisation you’ve probably heard of the d3 JavaScript library, even if you’ve never used it or don’t know any JavaScript. Mike Bostock, the creator of d3, and colleagues followed this up with d3.express, which was quickly renamed to Observable.

Read on to see how you can build a simple Observable Framework app without spending a lot of time troubleshooting JavaScript code.


Prevalence Adjustment in Binary Classifiers

David Lindelöf deals with an issue in analyzing classification models:

When you run a binary classifier over a population you get an estimate of the proportion of true positives in that population. This is known as the prevalence.

But that estimate is biased, because no classifier is perfect. 

Read on to learn what this means for precision, as well as one technique for tracking prevalence changes over time.
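
To make the bias concrete, here is a minimal sketch of one widely used correction, the Rogan-Gladen estimator, which rescales the observed positive rate using the classifier's sensitivity and specificity. This is an illustration under stated assumptions and not necessarily the technique the linked post uses; the function name `adjust_prevalence` and the example numbers are hypothetical.

```r
# Rogan-Gladen correction: estimate true prevalence from the share of
# items the classifier flags as positive, given known sensitivity and
# specificity. Function name and inputs are hypothetical.
adjust_prevalence <- function(observed_rate, sensitivity, specificity) {
  denom <- sensitivity + specificity - 1
  if (denom <= 0) {
    stop("sensitivity + specificity must exceed 1 (better than chance)")
  }
  estimate <- (observed_rate + specificity - 1) / denom
  # Sampling noise can push the estimate outside [0, 1], so clamp it.
  min(max(estimate, 0), 1)
}

# Hypothetical population: true prevalence 10%, sensitivity 0.9,
# specificity 0.8. The classifier flags true positives plus false positives.
true_prev <- 0.10
observed <- true_prev * 0.9 + (1 - true_prev) * (1 - 0.8)

observed                               # raw estimate, well above 0.10
adjust_prevalence(observed, 0.9, 0.8)  # corrected estimate
```

In this made-up example the raw positive rate overstates prevalence because false positives from the large negative class outnumber the true positives; the correction undoes that, provided the sensitivity and specificity figures are trustworthy.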


Building a QR Code Clock

Tomaz Kastrun checks what time it is:

Ever wanted to have a clock on the wall or in the office that is not binary, but a QR code clock? Well, now you can have it.

This useless R function generates a new QR code for every given period and tells the time.

Click through for the code. I could see this being useful in scenarios where you want to avoid people copying the QR code, so you embed the time in there. Then, your reader service can check to see if the time is within some valid boundary, returning an error if not.
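
The validity check described above might look something like the following base R sketch. It covers only the reader-side time check, not QR code generation; the payload format (an ISO 8601 UTC timestamp), the function names, and the 60-second window are all assumptions made for illustration.

```r
# Hypothetical payload: the current time as an ISO 8601 UTC string,
# which would then be encoded into the QR code.
make_payload <- function(now = Sys.time()) {
  format(now, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# Reader-side check: accept the payload only if its timestamp parses
# and is no older than max_age_secs (and not from the future).
is_payload_fresh <- function(payload, now = Sys.time(), max_age_secs = 60) {
  issued <- as.POSIXct(payload, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
  if (is.na(issued)) {
    return(FALSE)  # unparseable payload
  }
  age <- as.numeric(difftime(now, issued, units = "secs"))
  age >= 0 && age <= max_age_secs
}

issued_at <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC")
payload <- make_payload(issued_at)

is_payload_fresh(payload, now = issued_at + 30)   # inside the window
is_payload_fresh(payload, now = issued_at + 120)  # stale copy
```

A bare timestamp is easy to forge, so a real service would presumably also sign the payload; that part is left out here.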


Transposing Data Frames in R

Steven Sanderson does a switcharoo:

Data manipulation is a crucial skill in R programming, and one common operation is transposing data frames – converting rows to columns and vice versa. Whether you’re cleaning data for analysis, preparing datasets for visualization, or restructuring information for machine learning models, understanding how to transpose data frames efficiently is essential. This comprehensive guide will walk you through various methods to transpose data frames in R, complete with practical examples and best practices.

Read on for a few approaches to the problem.
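
As a quick taste, here is a minimal base R sketch of one such approach using `t()`. The sample data frame is made up, and this is not necessarily one of the methods covered in the linked post.

```r
# Hypothetical data: one identifier column and two numeric columns.
df <- data.frame(
  name   = c("a", "b", "c"),
  height = c(1.5, 2.0, 2.5),
  weight = c(10, 20, 30)
)

# t() returns a matrix, so mixing character and numeric columns would
# coerce everything to character. Move the identifier into the row
# names and transpose only the numeric columns.
rownames(df) <- df$name
transposed <- as.data.frame(t(df[, c("height", "weight")]))

transposed
```

The old row names become column names and the old column names become row names, so the identifier survives the round trip without forcing a type change on the numbers.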


Useful Tidyverse Functions

Tomaz Kastrun shares some code snippets:

Data engineering is an important step that helps improve data usability, data exploration and data science. Preparing the data therefore needs to be done in a manner that is easy to read, repeat and exchange with other engineers.

Tidyverse has a lot of data engineering functions, and you can chain different functions to get the most out of your data. All six examples will show combinations of functions chained together for a great result set.

Click through for those examples.
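
For a sense of what chaining looks like, here is a small sketch using dplyr on the built-in `mtcars` data set. It is not one of the six examples from the linked post; the derived column `kpl` and the filter threshold are arbitrary choices for illustration, and the native pipe assumes R 4.1 or later.

```r
library(dplyr)

# Filter, derive a column, aggregate, and sort in one readable chain.
summary_by_cyl <- mtcars |>
  filter(hp > 100) |>                  # keep the more powerful cars
  mutate(kpl = mpg * 0.425144) |>      # miles per US gallon -> km per litre
  group_by(cyl) |>
  summarise(
    n_cars   = n(),
    mean_kpl = mean(kpl),
    .groups  = "drop"
  ) |>
  arrange(desc(mean_kpl))              # most efficient group first

summary_by_cyl
```

Each step takes a data frame and returns one, which is what makes the chain easy to read top to bottom and easy to extend with another verb.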


Mathematical Transformations of Data in R

Steven Sanderson does the math:

Data transformation is a fundamental technique in statistical analysis and data preprocessing. When working with R, understanding how to properly transform data can help meet statistical assumptions, normalize distributions, and improve the accuracy of your analyses. This comprehensive guide will walk you through implementing and visualizing the most common data transformations in R: logarithmic, square root, and cube root transformations, using only base R functions.

Click through for examples.
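
A minimal base R sketch of the three transformations named above, applied to a made-up right-skewed vector (the data and variable names are hypothetical):

```r
# Hypothetical right-skewed, strictly positive values.
x <- c(1, 4, 9, 16, 100, 1000)

log_x  <- log(x)    # natural log; requires x > 0
sqrt_x <- sqrt(x)   # square root; requires x >= 0

# x^(1/3) returns NaN for negative numbers in R, so take the root of
# the absolute value and restore the sign afterwards.
cbrt_x <- sign(x) * abs(x)^(1 / 3)

data.frame(x, log_x, sqrt_x, cbrt_x)
```

All three compress large values more than small ones, with the log being the most aggressive; which one suits a given variable depends on how skewed it is and whether it contains zeros or negative values.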


Using complete.cases in R

Steven Sanderson has no time for missing data:

Data analysis in R often involves dealing with missing values, which can significantly impact the quality of your results. The complete.cases function in R is an essential tool for handling missing data effectively. This comprehensive guide will walk you through everything you need to know about using complete.cases in R, from basic concepts to advanced applications.

Using complete.cases to find observations with missing values is great. Using it to eliminate observations with missing values can sometimes be helpful, depending on just how many missing values you have.
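
Both uses can be sketched in a few lines of base R. The data frame here is made up for illustration.

```r
# Hypothetical data with missing values in two different columns.
df <- data.frame(
  id = 1:4,
  x  = c(1, NA, 3, 4),
  y  = c("a", "b", NA, "d")
)

# TRUE for rows with no missing values in any column.
ok <- complete.cases(df)

# Finding: inspect the incomplete rows and how common they are.
incomplete <- df[!ok, ]
share_incomplete <- mean(!ok)

# Eliminating: keep only the complete rows.
clean <- df[ok, ]

incomplete
share_incomplete
clean
```

Checking `share_incomplete` before dropping anything is a cheap way to see whether elimination would throw away a large part of the data.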
