R – Page 62 – Curated SQL

Recently, I have been getting a few comments on my old article on image classification with Keras, saying that they are getting errors with the code. And I have also gotten a few questions about how to use a Keras model to predict on new images (of different size). Instead of replying to them all individually, I decided to write this updated version using recent Keras and TensorFlow versions (all package versions and system information can be found at the bottom of this article, as usual).

Click through for the R code.

Comments closed

R: Avoid apply() with Large Datasets

Published 2020-09-08 by Kevin Feasel

George Pipis walks us through the performance cost of using the apply() function against a large dataset in R:

When we are dealing with large datasets and there is a need to calculate some values like the row/column min/max/rank/mean etc we should avoid the apply function because it takes a lot of time. Instead, we can use the matrixStats package and its corresponding functions. Let’s provide some comparisons.

Click through for a demonstration of how much slower it can be in certain cases. H/T R-Bloggers.

Comments closed

Filtering with dplyr Using Strings as Expressions

Published 2020-09-04 by Kevin Feasel

Kenneth Tay shows how to build arbitrary expressions to execute in dplyr’s filter function:

This took me a while to figure out and so I thought I would post this as future reference. Let’s say I have the mtcars data and I want to filter for just the rows with cyl == 6. I would do something like this:
library(tidyverse)
data(mtcars)
mtcars %>% filter(cyl == 6)
What if I had the filter condition as a string instead?

Read on to see how you can do this. Given the answer, I wonder if there’s any chance that could be turned into an injection vulnerability. H/T R-Bloggers

Comments closed

The State of R Packages Locally

Published 2020-09-04 by Kevin Feasel

Maelle Salmon and Gabor Csardi walk us through some details about installed packages on a system:

Now how do you know where any of your installed packages was installed? You can use find.package() and path.package()!
To check whether a package is installed, it is better to use find.package() than installed.packages() because the latter, as its docs state, can be slow on some systems. In both cases, it does not mean the package is usable, for that you’d need to use library() or require().

Read on for several tips around where packages are located, what their contents look like, and learning a bit more about the actual code in packages.

Comments closed

Error Handling in R with purrr

Published 2020-09-02 by Kevin Feasel

Ariel Muldoon shows off a couple of helpful functions in R’s purrr library:

The possibly() function is a wrapper function. It wraps around an existing function. Other than defining the function to wrap, the main argument of interest is otherwise. In otherwise we define what value to return if we get an error from the function we are wrapping.

Read the whole thing. H/T R-Bloggers.

Comments closed

Using oysteR to Track Security Vulnerabilities in R Packages

Published 2020-09-01 by Kevin Feasel

Colin Gillespie walks us through using the oysteR package:

The {oysteR} package is an R interface to the OSS Index that allows users to scan their installed R packages. A few months ago, I stumbled across a fledgeling version of this package and decided to make a few contributions to help move the package from GitHub to CRAN. A few PRs later, I’m now a co-author and the package is on CRAN.

Click through for a demo.

Comments closed

MLOPS in R with GitHub Actions

Published 2020-08-26 by Kevin Feasel

David Smith explains MLOPS and GitHub actions in a talk:

In the talk, I demonstrate the process in action (the demo starts at the 14:30 mark in the video below). I used Visual Studio Code to edit the app.R file in repository, and then pushed the changes to GitHub. That immediately triggered the action to deploy the updated file via SSH to the Shiny Server, running in a remote VM. Similarly, changes to the data file or to the R script files implementing the logistic regression model would trigger the model to be retrained in the cluster, and re-deploy the endpoint to deliver new predictions from the updated model.

Click through for a quick summary, link to the repo, and embedded video of the talk.

Comments closed

Creating Hand-Drawn Networks in R

Published 2020-08-25 by Kevin Feasel

David Schoch announces a new R package:

This post introduces the R package rougnet, which wraps the java script library rough.js to draw sketchy, hand-drawn-like networks. From all my network packages, this is probably the most useless but most fun one I developed.

Click through to check it out. H/T R-bloggers.

Comments closed

Explaining the ROC Plot

Published 2020-08-25 by Kevin Feasel

Nina Zumel takes us through what each element of a ROC curve means:

In our data science teaching, we present the ROC plot (and the area under the curve of the plot, or AUC) as a useful tool for evaluating score-based classifier models, as well as for comparing multiple such models. The ROC is informative and useful, but it’s also perhaps overly concise for a beginner. This leads to a lot of questions from the students: what does the ROC tell us about a model? Why is a bigger AUC better? What does it all mean?

Read on for the answer.

Comments closed

Fun with Benford’s Law

Published 2020-08-19 by Kevin Feasel

Nagdev Amruthnath covers a topic which brings me joy:

Benford’s Law is one of the most underrated and widely used techniques that are commonly used in various applications. United States IRS neither confirms nor denies their use of Benford’s law to detect any number of manipulations in income tax filing. Across the Atlantic, the EU is very open and proudly claims its use of Benford’s law. Today, this is widely used in accounting to detect any fraud. Nigrini, a professor at the University of Cape Town, also used this law to identify financial discrepancies in Enron’s financial statement. In another case, Jennifer Golbeck, a professor at the University of Maryland, was able to identify bot accounts on twitter using Benford’s law. Xiaoyu Wang from the University of Winnipeg even published a report on how to use Benford’s law on images. In the rest of this article, we will take about Benford’s law and how it can be applied using R.

The applications to images and music were new to me. Very cool. H/T R-Bloggers

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Category: R

Image Classification with Keras and TensorFlow 2 in R

R: Avoid apply() with Large Datasets

Filtering with dplyr Using Strings as Expressions

The State of R Packages Locally

Error Handling in R with purrr

Using oysteR to Track Security Vulnerabilities in R Packages

MLOPS in R with GitHub Actions

Creating Hand-Drawn Networks in R

Explaining the ROC Plot

Fun with Benford’s Law