R – Page 77 – Curated SQL

NFL Kicker Quality

Published 2019-12-11 by Kevin Feasel

Jacob Long has an outstanding pair of posts on evaluating kickers in the NFL. FIrst up is the analysis itself:

Justin Tucker is so great that, quite frankly, it doesn’t matter which metric you use. PAA, FG% – eFG%, or just plain old FG%, he’s unlike anyone else in the past 10 years. Given the well-documented trend of increasing kicker accuracy in the NFL, I think Tucker has a solid claim on being the greatest kicker of all time.
Even with fewer seasons than many of his competitors, his PAA are double all the others who kicked in the past 10 years. He had a slightly more difficult than average set of attempts but made a higher percentage of his attempts than anyone who has had more than 22 tries. Good luck trying to find any defect in Tucker’s record.

Jacob then covers the method in detail:

Pasteur and Cunningham-Rhoads — I’ll refer to them as PC-R for short — gathered more data than most predecessors, particularly in terms of auxiliary environmental info. They have wind, temperature, and presence/absence of precipitation. They show fairly convincingly that while modeling kick distance is the most important thing, these other factors are important as well. PC-R also find the cardinal direction of every NFL stadium (i.e., does it run north-south, east-west, etc.) and use this information along with wind direction data to assess the presence of cross-winds, which are perhaps the trickiest for kickers to deal with. They can’t know about headwinds/tailwinds because as far as they (and I) can tell, nobody bothers to record which end zone teams defend at the game’s coin toss, so we don’t know without looking at video which direction the kick is going. They ultimately combine the total wind and the cross wind, suggesting they have some meaningful measurement error that makes them not accurately capture all the cross-winds. Using their logistic regressions that factor for these several factors, they calculate an eFG% and use it and its derivatives to rank the kickers.

Those wind factors make certain stadiums like New Era Field (where Buffalo plays) tricky: it’s fun to see two flags right next to each other pointing in opposite directions, or the flags on the field goal posts pointing hard right, then switching to hard left, then switching back to hard right over the course of a field goal try. H/T R-Bloggers

Comments closed

Combining SAS + R for ML

Published 2019-12-10 by Kevin Feasel

Sophia Rowland has a demo of working with SAS Cloud Analytic Service from R:

The Scripting Wrapper for Analytics Transfer, also known as SWAT, is a package that allows R users to access the power of the SAS Cloud Analytic Service (CAS) from a familiar R interface. The SWAT package is available to SAS Visual Analytics (VA), SAS Visual Statistics (VS), and SAS Visual Data Mining and Machine Learning (VDMML) users. To begin working with SWAT, download and install the package from the SAS Software GitHub page.

Read on for the demo.

Comments closed

Counting Tidyverse Package Arguments

Published 2019-12-06 by Kevin Feasel

Theo Roe has fun figuring out which tidyverse packages have the greatest number of available arguments in functions:

Before we start anything, I’d like to mention that most of the hard work came from nsaunders and his great blog post Idle thoughts lead to R internals: how to count function arguments.
Let’s get started.
The aim of this blog is to capture the number of arguments present in each function with packages of the tidyverse.

Click through to see the code, as well as some methods of visualizing the results (methods which you can use in other situations).

Comments closed

Realistic-Looking Islands with R

Published 2019-12-04 by Kevin Feasel

Holger K. von Jouanne-Diedrich uses fractal math to create realistic-looking artificial islands:

Here we will turn this principle on its head and use it to actually create realistic-looking landmasses with R. The inspiration for this came from chapter 4 “Infinite Detail” of the book “Math Bytes” by my colleague Professor T. Chartier from Davidson College in North Carolina.
The idea is to start with some very simple form, like a square, and add more detail step-by-step. Concretely, we go through every midpoint of our ever more complex polygon and shift it by a random amount. Because the polygon will be getting more and more intricate we have to adjust the absolute amount by which we shift the respective midpoints.

Click through for the code.

Comments closed

United States Maps in R

Published 2019-12-02 by Kevin Feasel

Laura Ellis shows how to use the usmap package in R:

Today, I’d like to share the package ‘usmap’ which enables incredibly easy and fast creation of US maps in R.
In honor of US Thanksgiving tomorrow, I’m going to make this blog Thanksgiving themed! In this tutorial, we will use the gTrendsR package to pull US Google search results on the keyword “thanksgiving” and plot the popularity by state.

Click through for that demo, as well as links to more demos on map usage.

Comments closed

Fun With Waffle Plots

Published 2019-11-25 by Kevin Feasel

Sebastian Sauer has a two-parter on waffle plots. The first part is an introduction:

A waffle diagram is a variant of (stacked) bar plots or pie plots. They do not have great perceptual properties, I’d suspect, but for some purposes they may be adequate. This is best explored by example. This post draws heavily from the introduction of hrbrmstr to his Waffle package.

The second part uses emojifont to show pictograms as well:

A Pictogram may be defined as a (statistical) diagram using icons or similar “iconic” graphics to illstrate stuff. The waffle plot (see this post) is a nice object where to combine waffle and pictorgrams. Originally, this post was inspired by HRBRMSTR waffle package, see this post, but I could not get it running.
Maybe the easiest way is to work through an example (spoiler: see below for what we’re heading at).

This type of plot doesn’t work for everything, but I can think of a few places where it’d be the right choice.

Comments closed

Debugging Code in R

Published 2019-11-20 by Kevin Feasel

Marina Wyss walks us through debugging techniques in R:

There are many ways to approach these problems when they arise. For example, condition handling using tools like try(), tryCatch(), and withCallingHandlers() can increase your code’s robustness by proactively steering error handling.
R also includes several advanced debugging tools that can be very helpful for quickly and efficiently locating problems, which will be the focus of this article. To illustrate, we’ll use an example adapted from an excellent paper by Roger D. Peng, and show how these tools work along with some updated ways to interact with them via RStudio. In addition to working with errors, the debugging tools can also be used on warnings by converting them to errors via options(warn = 2).

Read on for a survey of what’s available in R. It’s a lot more than writing a bunch of print statements. H/T R-Bloggers

Comments closed

Using pdqr for Statistical Uncertainty

Published 2019-11-14 by Kevin Feasel

Evgeni Chasnovski has a new CRAN package:

I am glad to announce that my latest, long written R package ‘pdqr’ is accepted to CRAN. It provides tools for creating, transforming and summarizing custom random variables with distribution functions (as base R ‘p*()’, ‘d*()’, ‘q*()’, and ‘r*()’ functions). You can read a brief overview in one of my previous posts.

Click through for a description of the package.

Comments closed

Important Assumptions with Linear Models

Published 2019-11-13 by Kevin Feasel

Sebastian Sauer takes us through two of the most important assumptions of linear models:

Additivity and linearity as the second most important assumptions in linear models
We assume that \(y\) is a linear function of the predictors. If y is not a linear function of the predictors, we cannot expect the model to deliver correct insights (predictions, causal coefficients). Let’s check an example.

Read on to understand what this means, as well as the most important assumption.

Comments closed

Updates to AzureR Packages

Published 2019-11-13 by Kevin Feasel

Hong Ooi announces changes to several AzureR packages:

AzureVM 2.1.0
You can now create VM scalesets with attached data disks. In addition, you can specify the disk type (Standard_LRS, StandardSSD_LRS, or Premium_LRS) for the OS disk and, for a Linux Data Science Virtual Machine, the supplied data disk. This enables using VM sizes that don’t support Premium storage.

Click through for the full set of updates.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Category: R