Category: R

Reading from and Writing to Excel with R

Published 2022-06-28 by Kevin Feasel

Benjamin Smith needs to modify an Excel file:

I was recently asked, as part of a larger task, to combine multiple sheets from an Excel workbook into a single sheet. When approached about the problem, I was immediately asked if I was going to use VBA to do it. While I know my way around VBA, since VBA does not have a native way to undo its operations, I was uncomfortable with the potential hazard of using VBA if a mistake was made or something went wrong.
In this blog I share how it’s possible to combine and format sheets using the openxlsx package and base R. Since I’m limiting myself to one library and base R, I will be employing base R’s pipe operator – |>, instead of the superior magrittr pipe – %>% (my opinion only, don’t take it too seriously).

Can confirm, the magrittr “default” pipe is better.
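
As a rough illustration of the approach in the excerpt, here is a minimal sketch that stacks every sheet of a workbook into one data frame using openxlsx and the base R pipe. The workbook, sheet names, and columns are invented for the example, and it assumes all sheets share the same columns; it is not the code from the linked post.

```r
library(openxlsx)

# Build a small example workbook; the sheet names and columns are made up.
path <- tempfile(fileext = ".xlsx")
write.xlsx(
  list(
    north = data.frame(id = 1:2, sales = c(10, 20)),
    south = data.frame(id = 3:4, sales = c(30, 40))
  ),
  path
)

# Read every sheet, tag each row with its source sheet, and stack the results.
combined <- getSheetNames(path) |>
  lapply(\(s) cbind(sheet = s, read.xlsx(path, sheet = s))) |>
  do.call(what = rbind)

# Write the combined data to a new file so the original workbook is untouched.
out_path <- tempfile(fileext = ".xlsx")
write.xlsx(combined, out_path)
```

Writing to a separate file keeps the source workbook intact, which addresses the "no undo" concern raised about VBA.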

Comments closed

Building Custom ggplot2 Palettes

Published 2022-06-24 by Kevin Feasel

Nicola Rennie busts out the beret and fancy palette board:

Choosing which colours to use in a plot is an important design decision. A good choice of colour palette can highlight important aspects of your data, but a poor choice can make it impossible to interpret correctly. There are numerous colour palette R packages out there that are already compatible with {ggplot2}. For example, the {RColorBrewer} or {viridis} packages are both widely used.
If you regularly make plots at work, it’s great to have them be consistent with your company’s branding. Maybe you’re already doing this manually with the scale_colour_manual() function in {ggplot2} but it’s getting a bit tedious? Or maybe you just want your plots to look a little bit prettier? This blog post will show you how to make a basic colour palette that is compatible with {ggplot2}. It assumes you have some experience with {ggplot2} – you know your geoms from your aesthetics.

Click through to see how you can build a palette and use it across multiple ggplot2 charts.
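
For a sense of what such a palette can look like, here is a minimal sketch. The hex codes and the names brand_colours, brand_palette(), and scale_colour_brand() are hypothetical and are not taken from the linked post; the wrapper simply delegates to scale_colour_manual().

```r
library(ggplot2)

# Hypothetical brand colours; replace with your organisation's hex codes.
brand_colours <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A")

# Return n colours: the first n brand colours when there are enough,
# otherwise an interpolated ramp across the palette.
brand_palette <- function(n) {
  if (n <= length(brand_colours)) {
    brand_colours[seq_len(n)]
  } else {
    grDevices::colorRampPalette(brand_colours)(n)
  }
}

# Thin wrapper so plots can call scale_colour_brand() instead of repeating
# scale_colour_manual(values = ...) in every chart.
scale_colour_brand <- function(n = length(brand_colours), ...) {
  scale_colour_manual(values = brand_palette(n), ...)
}

# Example usage with the mpg dataset that ships with ggplot2.
p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  scale_colour_brand()
```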

Understanding the Poisson Distribution

Published 2022-06-23 by Kevin Feasel

Achim Zeileis shows off my favorite statistical distribution:

The Poisson distribution has many distinctive features, e.g., both its expectation and variance are equal and given by the parameter λ. Thus, E(Y) = λ and Var(Y) = λ. Moreover, the Poisson distribution is related to other basic probability distributions. Namely, it can be obtained as the limit of the binomial distribution when the number of attempts is high and the success probability low. Or the Poisson distribution can be approximated by a normal distribution when λ is large. See Wikipedia (2002) for further properties and references.
Here, we leverage the distributions3 package (Hayes et al. 2022) to work with the Poisson distribution in R. In distributions3, Poisson distribution objects can be generated with the Poisson() function. Subsequently, methods for generic functions can be used to print the objects; extract mean and variance; evaluate density, cumulative distribution, or quantile function; or simulate random samples.

Read on for a detailed tutorial. H/T R-bloggers.
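
The two properties mentioned in the excerpt can be checked numerically. The sketch below uses base R's dpois() and dbinom() rather than the distributions3 package from the linked post; the value of lambda and the truncation point are arbitrary choices for illustration.

```r
lambda <- 3
k <- 0:100  # the probability mass beyond 100 is negligible for lambda = 3

# Mean and variance computed directly from the probability mass function.
p <- dpois(k, lambda)
pois_mean <- sum(k * p)
pois_var <- sum((k - pois_mean)^2 * p)

# Binomial with many attempts and a small success probability,
# chosen so that n * prob = lambda.
n <- 1e5
max_gap <- max(abs(dbinom(k, size = n, prob = lambda / n) - p))
```

Both pois_mean and pois_var come out equal to lambda up to numerical error, and max_gap shrinks as n grows.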

Currency Conversion with priceR

Published 2022-06-20 by Kevin Feasel

Bryan Shalloway needs to make change for a trillion Zimbabwe dollars (prior to revaluation):

In this post I’ll walk through an example of how to convert between currencies. A challenge is that the conversion rate is constantly changing. If you have historical data you’ll want the conversion to be based on what the exchange rate was at the time. Hence the fields you need when doing currency conversion are:
1. Date of transaction
2. Start currency (what you’ll be converting from)
3. End currency (what you’ll be converting to)
4. Price (in units of starting currency)

Bryan also makes the smart move by memoizing the data first, as those API calls can get expensive otherwise.
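
To show how the four fields fit together, here is a minimal base R sketch. The transactions and the exchange rates are invented, and the rates table stands in for whatever a rate lookup (for example, cached API results) would return; it does not use priceR itself.

```r
# Transactions with the four fields listed above (values are invented).
transactions <- data.frame(
  date = as.Date(c("2022-06-01", "2022-06-02")),
  from = c("EUR", "EUR"),
  to = c("USD", "USD"),
  price = c(100, 200)
)

# Hypothetical daily rates; in practice these would come from a rate source
# and be cached so each date and currency pair is only fetched once.
rates <- data.frame(
  date = as.Date(c("2022-06-01", "2022-06-02")),
  from = c("EUR", "EUR"),
  to = c("USD", "USD"),
  rate = c(1.07, 1.06)
)

# Match each transaction to the rate in effect on its own date, then convert.
converted <- merge(transactions, rates, by = c("date", "from", "to"), all.x = TRUE)
converted$price_converted <- converted$price * converted$rate
```

Joining on date as well as the currency pair is what makes the conversion historical rather than based on today's rate.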

Converting between Decimal and Binary

Published 2022-06-20 by Kevin Feasel

Tomaz Kastrun has run out of useless functions and has to create useful ones:

How does the conversion from decimal to binary, or from binary to decimal, behave? With another useless function, I have plotted the points (x = decimal number, y = converted binary number) on a scatter plot, just to find out that the graph shows the binomial distribution function.

Read on for the conversion process and a fun analysis.
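
For reference, one way to write the two conversions in base R is sketched below. The function names dec_to_bin() and bin_to_dec() are made up for this example and are not necessarily how the linked post implements them.

```r
# Convert a non-negative whole number to its binary representation (as a string)
# by repeatedly dividing by 2 and collecting the remainders.
dec_to_bin <- function(n) {
  stopifnot(n >= 0, n == floor(n))
  if (n == 0) return("0")
  bits <- integer(0)
  while (n > 0) {
    bits <- c(n %% 2, bits)
    n <- n %/% 2
  }
  paste(bits, collapse = "")
}

# Convert a binary string back to a decimal integer.
bin_to_dec <- function(b) {
  strtoi(b, base = 2L)
}

dec_to_bin(10)      # "1010"
bin_to_dec("1010")  # 10
```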

Building a Q&A Engine in R with httr and Shiny

Published 2022-06-13 by Kevin Feasel

Benjamin Smith builds an oracle but with R, not Delphi:

Knowing how to write API requests and handle their responses is a valuable skill that a developer, data engineer or data analyst/scientist needs to know. In this short blog I share how it’s possible to leverage DuckDuckGo’s instant answer API to create an oracle which can answer (some of) your questions using the httr package and Shiny.

Click through for a simple app which does the job.
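
The request side of such an app might look roughly like the sketch below. The helper names ddg_url() and ask_ddg() are hypothetical, and the query parameters and the AbstractText field are assumptions about the Instant Answer API's response rather than details confirmed by the linked post; the Shiny front end is left out.

```r
library(httr)

# Build the request URL for DuckDuckGo's Instant Answer API.
ddg_url <- function(question) {
  modify_url(
    "https://api.duckduckgo.com/",
    query = list(q = question, format = "json", no_html = 1)
  )
}

# Send the request and return the abstract text, or NA when there is none.
ask_ddg <- function(question) {
  resp <- GET(ddg_url(question))
  stop_for_status(resp)
  # Parse the body as text explicitly rather than relying on the content type.
  body <- jsonlite::fromJSON(content(resp, as = "text", encoding = "UTF-8"))
  answer <- body$AbstractText
  if (is.null(answer) || !nzchar(answer)) NA_character_ else answer
}
```

In a Shiny app, ask_ddg() would be called inside a reactive that depends on a text input, with the result shown in a text output.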

Calculating a Cumulative Sum (Running Total) in R

Published 2022-06-09 by Kevin Feasel

Jim calculates running totals in R:

Cumulative sum calculation in R: using the dplyr package, you can calculate the cumulative sum of a column using the following methods.

Click through for examples of ungrouped and grouped operations. H/T R-bloggers.
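
A minimal sketch of both cases with dplyr follows. The sales data frame is invented for the example and is not the data used in the linked post.

```r
library(dplyr)

sales <- data.frame(
  store = c("A", "A", "B", "B", "B"),
  amount = c(5, 10, 1, 2, 3)
)

# Ungrouped: one running total over the whole column.
ungrouped <- sales |>
  mutate(running_total = cumsum(amount))

# Grouped: the running total restarts for each store.
grouped <- sales |>
  group_by(store) |>
  mutate(running_total = cumsum(amount)) |>
  ungroup()
```

Here the ungrouped totals run 5, 15, 16, 18, 21, while the grouped totals run 5, 15 for store A and 1, 3, 6 for store B.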

Creating Reproducible Examples with CI

Published 2022-06-03 by Kevin Feasel

Colin Gillespie and Jack Walton tackle a common training problem:

As the number of courses we offer increased, so did the maintenance burden of our associated training materials (lecture notes, slides, exercises, and more). To ease this burden, and to assist in ensuring that our training materials build consistently, we developed an R package called {jrNotes2}. Amongst other things, this package ensures that all courses:
– have identical “template files”: .gitlab-ci.yml, .gitignore, Makefiles, index.Rmd, …;
– have the same directory structure, and
– pass a set of quality-assurance checks.

This is smart but read on to see why it’s still a challenge. This is especially true in the R and Python worlds, where breaking changes seem to be so common.
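
To make the "identical template files" check concrete, here is a small base R sketch of the kind of test a CI job could run. The function missing_templates() and the file list are hypothetical and are not part of {jrNotes2}.

```r
# Hypothetical list of files every course repository is expected to contain.
required_files <- c(".gitlab-ci.yml", ".gitignore", "Makefile", "index.Rmd")

# Return the required files that are absent from a course directory.
missing_templates <- function(course_dir, required = required_files) {
  required[!file.exists(file.path(course_dir, required))]
}

# Example: a temporary course directory that only has two of the four files.
course_dir <- file.path(tempdir(), "example-course")
dir.create(course_dir, showWarnings = FALSE)
file.create(file.path(course_dir, c(".gitignore", "index.Rmd")))

missing <- missing_templates(course_dir)

# In a CI job, a non-empty result would fail the build, for example:
# if (length(missing) > 0) stop("Missing: ", paste(missing, collapse = ", "))
```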

OCR and Character Extraction with R

Published 2022-05-27 by Kevin Feasel

Benjamin Smith analyzes a text:

Since the text that I’m using has two columns per page, the text will need to be cropped by columns before OCR is applied. Prior to that, the .pdf files will need to be converted to .png format.

Read on to see the code for the entire process, using the tidyverse, magick, and tesseract packages.
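
The crop-then-OCR step can be sketched as follows. The helper names column_geometry() and ocr_two_columns() are hypothetical, the split assumes two equal-width columns with no margins, and the code is an illustration rather than the pipeline from the linked post. Reading PDFs with magick also requires the pdftools package.

```r
# Geometry string ("WIDTHxHEIGHT+X+Y") covering the left or right half of a page.
column_geometry <- function(width, height, side = c("left", "right")) {
  side <- match.arg(side)
  half <- width %/% 2
  x_offset <- if (side == "left") 0 else half
  sprintf("%dx%d+%d+%d",
          as.integer(half), as.integer(height), as.integer(x_offset), 0L)
}

# Render one PDF page, crop each column, save it as PNG, and run OCR on it.
ocr_two_columns <- function(pdf_path, page = 1) {
  img <- magick::image_read_pdf(pdf_path, pages = page, density = 300)
  info <- magick::image_info(img)
  vapply(c("left", "right"), function(side) {
    column <- magick::image_crop(
      img, column_geometry(info$width, info$height, side)
    )
    png_path <- tempfile(fileext = ".png")
    magick::image_write(column, png_path, format = "png")
    tesseract::ocr(png_path)
  }, character(1))
}
```

Cropping before OCR keeps the engine from reading straight across both columns and interleaving their lines.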

Web Accessibility and Shiny

Published 2022-05-20 by Kevin Feasel

Jamie Owen has a two-parter. First up, why web accessibility standards are important:

An accessible website is more than putting content online. Making a website accessible means ensuring that it can be used by as many people as possible. Accessibility standards such as the Web Content Accessibility Guidelines (WCAG) help to standardise the way in which a website can interact with assistive technologies. Allowing developers to incorporate instructions into their web applications which can be interpreted by technologies such as screen readers helps to maintain a consistent user experience for all.

Second, how Shiny apps tend to stack up:

The great thing about {shiny} is that it allows data practitioners a relatively simple, quick approach to providing an intuitive user interface to their R code via a web application. So effective is {shiny} at this job that it can be done with little to no traditional web development knowledge on the part of the developer. {shiny} and associated packages provide collections of R functions that return HTML, CSS and JavaScript which is then shipped to a browser. The variety of packages giving trivial access to styled front end components and widgets is already large and constantly growing. What this means is that R programmers can achieve a huge amount in the way of building complex, visually attractive web applications without needing to care very much about the underlying generated content that is interpreted by the browser.

As a quick spoiler, not so well. Read on for the full report.
