Press "Enter" to skip to content

Category: R

Simulating Prediction Intervals

Bryan Shalloway continues a series:

Part 1 of my series of posts on building prediction intervals used data held out from model training to evaluate the characteristics of prediction intervals. In this post I will use hold-out data to estimate the width of the prediction intervals directly. Doing so can provide more reasonable and flexible intervals compared to analytic approaches.

Click through for the article, and be sure to check out part 1 if you haven’t already.
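If you want a flavor of the idea before clicking through, here is a minimal sketch (not Bryan's code, and using made-up data rather than Sale_Price): fit a model on a training split, take quantiles of the hold-out residuals, and add them to new point predictions.

```r
# Minimal sketch of a hold-out-based 90% prediction interval (toy data).
set.seed(42)
n  <- 500
df <- data.frame(x = runif(n, 0, 10))
df$y <- 2 * df$x + rnorm(n, sd = 3)

train   <- df[1:350, ]
holdout <- df[351:500, ]

fit    <- lm(y ~ x, data = train)
resids <- holdout$y - predict(fit, holdout)

# Empirical 5th and 95th percentiles of the hold-out residuals give a 90% band
q <- quantile(resids, probs = c(0.05, 0.95))

new_data <- data.frame(x = c(2, 5, 8))
point    <- predict(fit, new_data)
data.frame(new_data, lower = point + q[1], fit = point, upper = point + q[2])
```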


sparklyr 1.6 Released

Carly Driggers announces a new release of sparklyr:

Sparklyr, an LF AI & Data Foundation Incubation Project, has released version 1.6! Sparklyr is an R Language package that lets you analyze data in Apache Spark, the well-known engine for big data processing, while using familiar tools in R. The R Language is widely used by data scientists and statisticians around the world and is known for its advanced features in statistical computing and graphics. 

Click through to see the changes.
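If you haven't tried sparklyr before, the basic workflow looks something like this. It's a minimal sketch against a local Spark instance and isn't tied to anything new in 1.6.

```r
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance, copy an R data frame in, and use familiar
# dplyr verbs; Spark does the work until collect() brings results back to R.
sc <- spark_connect(master = "local")

mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)

mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

spark_disconnect(sc)
```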


Working with Prediction Intervals

Bryan Shalloway explains how generating prediction intervals is different from making point predictions:

Before using the model for predictive inference, one should have reviewed overall performance on a holdout dataset to ensure the model is sufficiently accurate for the business context. For example, for our problem, is an average error of ~12% and 90% prediction intervals of +/- ~25% of Sale_Price useful? If the answer is “no,” that suggests the need for more effort in improving the accuracy of the model (e.g. trying other transformations, features, model types). For our examples we are assuming the answer is “yes,” our model is accurate enough (so it is appropriate to move on and focus on prediction intervals).

Click through for the article.
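For contrast with the analytic approach Bryan discusses, here is a minimal sketch of a textbook 90% prediction interval from a linear model, using the built-in cars dataset rather than the post's Sale_Price data.

```r
# Analytic 90% prediction interval from a linear model (toy example).
fit <- lm(dist ~ speed, data = cars)

predict(fit,
        newdata  = data.frame(speed = c(10, 15, 20)),
        interval = "prediction",
        level    = 0.90)
```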


Last Observation Carried Forward in R

Nathan Eastwood shows how to perform Last Observation Carried Forward in R:

Real life data is often riddled with missing values – or NAs – where no data value is stored for the variable in an observation. Missing data such as this can have a significant effect on the conclusions which can be drawn from the data; for example, individuals dropping out of a study or subjects not properly reporting responses. A common solution to this problem is to fill those NA values with the most recent non-NA value prior to it – this is called the Last Observation Carried Forward (LOCF) method.

This blog post will look at how to implement this particular solution using a combination of {dplyr}, {dbplyr} and {sparklyr}.

Click through for the solution.
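If you only need this locally rather than against Spark, a minimal sketch with tidyr::fill() gets you LOCF in a couple of lines. This is not the {dplyr}/{dbplyr}/{sparklyr} approach the post develops, and the data here is made up.

```r
library(dplyr)
library(tidyr)

# Made-up readings with gaps; carry the last observed value forward within each id
readings <- tibble(
  id    = c(1, 1, 1, 2, 2, 2),
  day   = c(1, 2, 3, 1, 2, 3),
  value = c(10, NA, NA, 5, NA, 7)
)

readings %>%
  group_by(id) %>%
  arrange(day, .by_group = TRUE) %>%
  fill(value, .direction = "down") %>%   # LOCF: replace NAs with the previous value
  ungroup()
```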


Generating Random Numbers in R

Holger von Jouanne-Diedrich brings the noise:

In data science, we try to find, sometimes well-hidden, patterns (= signal) in often seemingly random data (= noise). Pseudo-Random Number Generators (PRNG) try to do the opposite: hiding a deterministic data generating process (= signal) by making it look like randomness (= noise). If you want to understand some basics behind the scenes of this fascinating topic, read on!

Click through for an explanation of the process.
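As a taste of the deterministic signal hiding behind the noise, here is a minimal sketch of the Park-Miller linear congruential generator. This is purely illustrative and is not what R uses by default (R's default generator is the Mersenne Twister).

```r
# Park-Miller "minimal standard" LCG: x[n+1] = (a * x[n]) mod m.
# Seed and sample size are arbitrary choices for illustration.
lcg <- function(n, seed = 42) {
  a <- 16807
  m <- 2^31 - 1
  state <- seed
  out <- numeric(n)
  for (i in seq_len(n)) {
    state  <- (a * state) %% m   # deterministic update
    out[i] <- state / m          # scale to (0, 1)
  }
  out
}

lcg(5)
```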


Creating a Rose Chart in R

Neil Saunders takes a look at a classic chart:

I first heard Florence Nightingale and her Geeks Declare War on Death, an episode of the Cautionary Tales podcast, when it premiered as a special episode of 99% Invisible. It discusses Nightingale’s work as a statistician and in particular, her visualisation of mortality causes in the Crimean War using the famous “rose chart”, or polar area diagram.

I’m sure you’re thinking: how can I explore that using R? 

Read on to find out.
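A minimal ggplot2 sketch of the idea (using made-up monthly counts, not Nightingale's data) is just a bar chart wrapped onto polar coordinates.

```r
library(ggplot2)

# Made-up monthly counts standing in for Nightingale's mortality data
df <- data.frame(
  month  = factor(month.abb, levels = month.abb),
  deaths = c(320, 280, 310, 250, 200, 150, 120, 110, 140, 180, 240, 300)
)

# Plot sqrt(deaths) so wedge *area*, not radius, is proportional to the count,
# which is how a polar area diagram is meant to be read.
ggplot(df, aes(x = month, y = sqrt(deaths))) +
  geom_col(width = 1, fill = "steelblue", colour = "white") +
  coord_polar() +
  labs(title = "Polar area ('rose') chart", x = NULL, y = NULL)
```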


knitr Options and Hooks

The folks at Jumping Rivers conclude a series:

As with many aspects of programming, when you are working by yourself you can be (slightly) more lax with styles and set-up. However, as you start working in a team, different styles can quickly become a hindrance and lead to errors.

Using {knitr} is no different. When you work on documents with different team members, it’s helpful to have a consistent set of settings. If the default for eval changes, this can easily waste time as you try to track down an error. At Jumping Rivers, we use {knitr} a lot: from our training courses, to providing feedback to clients, to constructing monthly reports on clients’ infrastructure. The great thing about {knitr} is it’s really easy to customise. The bad thing is that without some care, it’s really easy for every member of the team to have different default options. This proliferation of different default options means that when we pick up someone else’s document, mistakes may creep in.

Read on for different options they use to keep things consistent.
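The core trick is pinning defaults in a shared setup chunk. Here is a minimal sketch; the specific option values and the timing hook are my own assumptions, not Jumping Rivers' house settings.

```r
# Typically the first chunk of the .Rmd: set shared defaults once so every
# team member renders with the same options.
knitr::opts_chunk$set(
  echo       = TRUE,
  eval       = TRUE,
  warning    = FALSE,
  message    = FALSE,
  fig.width  = 7,
  fig.height = 5
)

# Chunk hooks run before/after chunks; this one times any chunk that sets
# timeit = TRUE and writes the elapsed time into the output.
knitr::knit_hooks$set(timeit = local({
  start <- NULL
  function(before, options, envir) {
    if (before) {
      start <<- Sys.time()
    } else {
      paste("Chunk took",
            round(difftime(Sys.time(), start, units = "secs"), 2),
            "seconds.")
    }
  }
}))
```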


Fuzzy Joins in SQL Server using R

Rajendra Gupta shows how you can use R in SQL Server Machine Learning Services to perform fuzzy joins:

Suppose you have a web page where users write comments in a text box, and you are performing data analysis on those comments. There are a few spelling mistakes, and you want to perform an approximate match or fuzzy lookup against another dataset. Similarly, suppose you have a product catalog database. Your users search for a product; however, they might not type the exact keyword for the product name. Using fuzzy joins, we can return products that approximately match the product names users typed.

SQL Server Machine Learning Services with R enables you to execute R language queries inside SQL Server. In previous articles, we explored a few use cases of machine learning and the R scripts for several related topics.

It’s R, so there’s already a package in CRAN for that.
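One such option is the fuzzyjoin package. A minimal sketch of an approximate join on misspelled product names (made-up data) looks like this:

```r
library(dplyr)
library(fuzzyjoin)

# Made-up catalog and user searches containing typos
catalog  <- tibble(product_name = c("laptop", "keyboard", "monitor"))
searches <- tibble(search_term  = c("labtop", "keybord", "moniter"))

# Join on Levenshtein edit distance instead of exact equality
stringdist_inner_join(
  searches, catalog,
  by       = c("search_term" = "product_name"),
  method   = "lv",
  max_dist = 2
)
```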


k-gram Language Models in R

Valerio Gherardi takes us through the concept of k-grams:

The post is structured as follows: we start by giving a succinct theoretical introduction to k-gram models. Subsequently, we illustrate how to train a k-gram model in R using kgrams, and explain how to use the standard perplexity metric for model evaluation or tuning. Finally, we use our trained model to generate some random text at different temperatures.

This goes into some depth on the topic and is worth giving a careful read.
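If the underlying idea is new to you, a k-gram model is built on counts of length-k token sequences. Here is a minimal base-R sketch of just the counting step on a toy corpus; the kgrams package handles training, smoothing, and text generation properly.

```r
# Count k-grams (here trigrams) within each sentence of a toy corpus.
kgram_counts <- function(sentences, k = 3) {
  grams <- unlist(lapply(strsplit(tolower(sentences), "\\s+"), function(tokens) {
    if (length(tokens) < k) return(character(0))
    vapply(seq_len(length(tokens) - k + 1), function(i) {
      paste(tokens[i:(i + k - 1)], collapse = " ")
    }, character(1))
  }))
  sort(table(grams), decreasing = TRUE)
}

kgram_counts(c("the cat sat on the mat", "the cat ate the fish"))
```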


The Basics of k-Means Clustering

Nathaniel Schmucker explains some of the principles of k-means clustering:

k-Means is easy to implement. In R, you can use the function kmeans() to quickly deploy an efficient k-Means algorithm. On datasets of reasonable size (thousands of rows), the kmeans function runs in fractions of a second.

k-Means is easy to interpret (in 2 dimensions). If you have two features of your k-Means analysis (e.g., you are grouping by length and width), the result of the k-Means algorithm can be plotted on an xy-coordinate system to show the extent of each cluster. It’s easy to visually inspect the assignment to see if the k-Means analysis returned a meaningful insight. In more dimensions (e.g., length, width, and height) you will need to either create a 3D plot, summarize your features in a table, or find another alternative to describing your analysis. This loses the intuitive power that a 2D k-Means analysis has in convincing you or your audience that your analysis should be trusted. It’s not to say that your analysis is wrong; it simply takes more mental focus to understand what your analysis says.

The k-Means analysis, however, is not always the best choice. k-Means does well on data that naturally falls into spherical clusters. If your data has a different shape (linear, spiral, etc.), k-Means will force clustering into circles, which can result in outputs that defy human expectations. The algorithm is not wrong; we have fed the algorithm data it was never intended to understand.

There’s a lot of depth in this article, which makes it really interesting.
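If you want to see the two-feature case he describes, here is a quick sketch with base R's kmeans(); the feature choice and k = 3 are arbitrary picks for illustration.

```r
set.seed(42)

# Two features so the clusters can be inspected on an xy plot
dat <- iris[, c("Petal.Length", "Petal.Width")]

fit <- kmeans(dat, centers = 3, nstart = 25)

plot(dat, col = fit$cluster, pch = 19, main = "k-means clusters (k = 3)")
points(fit$centers, col = seq_len(nrow(fit$centers)), pch = 8, cex = 2)
```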
