Press "Enter" to skip to content

Category: R

Creating Reproducible Examples with CI

Colin Gillespie and Jack Walton tackle a common training problem:

As the number of courses we offer increased, so did the maintenance burden of our associated training materials (lecture notes, slides, exercises, and more). To ease this burden, and to assist in ensuring that our training materials build consistently, we developed an R package called {jrNotes2}. Amongst other things, this package ensures that all courses:

– have identical “template files”: .gitlab-ci.yml, .gitignore, Makefiles, index.Rmd, …;

– have the same directory structure, and

– pass a set of quality-assurance checks.

This is smart, but read on to see why maintenance is still a challenge, especially in the R and Python worlds, where breaking changes seem to be so common.
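To make that concrete, here is a minimal sketch of the kind of quality-assurance check such a package could run in CI. This is my own illustration, not the actual {jrNotes2} implementation; check_template_files is a hypothetical helper.

```r
# Hypothetical check (not part of {jrNotes2}): verify that a course repository
# contains the expected template files before the notes are built.
check_template_files <- function(path = ".") {
  required <- c(".gitlab-ci.yml", ".gitignore", "Makefile", "index.Rmd")
  missing <- required[!file.exists(file.path(path, required))]
  if (length(missing) > 0) {
    stop("Missing template files: ", paste(missing, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

check_template_files()  # run from the course repository root
```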


Web Accessibility and Shiny

Jamie Owen has a two-parter. First up, why web accessibility standards are important:

An accessible website is more than putting content online. Making a website accessible means ensuring that it can be used by as many people as possible. Accessibility standards such as the Web Content Accessibility Guidelines (WCAG) help to standardise the way in which a website can interact with assistive technologies. Allowing developers to incorporate instructions into their web applications which can be interpreted by technologies such as screen readers helps to maintain a consistent user experience for all.

Second, how Shiny apps tend to stack up:

The great thing about {shiny} is that it allows data practitioners a relatively simple, quick approach to providing an intuitive user interface to their R code via a web application. So effective is {shiny} at this job that it can be done with little to no traditional web development knowledge on the part of the developer. {shiny} and associated packages provide collections of R functions that return HTML, CSS and JavaScript which is then shipped to a browser. The variety of packages giving trivial access to styled front end components and widgets is already large and constantly growing. What this means is that R programmers can achieve a huge amount in the way of building complex, visually attractive web applications without needing to care very much about the underlying generated content that is interpreted by the browser.

As a quick spoiler, not so well. Read on for the full report.
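For a flavor of where accessibility metadata fits into that generated HTML, here is a small sketch of my own (not an example from Jamie's post): Shiny's tag functions accept named arguments that become HTML attributes, which is where things like alt text and ARIA labels live.

```r
library(shiny)

# Shiny tag functions return HTML; named arguments become attributes,
# so accessibility metadata can be added directly in R.
ui <- fluidPage(
  tags$img(src = "revenue-trend.png",  # placeholder image for illustration
           alt = "Line chart of monthly revenue, trending upward since January"),
  tags$button(id = "refresh", `aria-label` = "Refresh the revenue chart", "Refresh")
)

# Printing a tag shows the HTML that would be shipped to the browser.
print(tags$img(src = "logo.png", alt = "Company logo"))
```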


Importing Data into R

Sebastian Sauer shows off several ways of loading data into R:

Importing data into R can cause headaches for newbies. For some, the concept of relative and absolute paths is new. That’s why I compiled here some recommendations on how to import data into R and on how to ditch the “what’s my path” problem.

Click through for some notes. This post focuses on files rather than databases, though databases are a very common source of data as well.
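One common way to sidestep the “what’s my path” problem is to build paths relative to the project root with the {here} package. This is a sketch of that pattern rather than a summary of Sebastian's recommendations, and the data file name is made up.

```r
library(here)
library(readr)

# here() builds a path relative to the project root (where the .Rproj or
# .here file lives), so the same code works regardless of the working directory.
survey <- read_csv(here("data", "survey.csv"))  # "data/survey.csv" is a placeholder file

head(survey)
```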


Movie Color Swaps in R

Mark White does some coloration switcharoos:

I also love film, and I started thinking about ways I could generate color palettes from films that use color beautifully. There are a number of packages that can generate color palettes from images in R, but I wanted to try writing the code myself.

I also wanted to not just generate a color palette from an image, but then swap it with a different color palette from a different film. This is similar to neural style transfer with TensorFlow, but much simpler. I’m one of those people that likes to joke how OLS is undefeated; I generally praise the use of simpler models over more complex ones. So instead of a neural network, I use k-means clustering to transfer a color palette of one still frame from a film onto another frame from a different movie.

There are some interesting outcomes in the post, including a mashup of 2001: A Space Odyssey’s color scheme onto Arrival, as well as Kill Bill and Dr. Strangelove. The latter reminds me of a still from the credits sequence to a 1970s movie. H/T R-Bloggers.
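As a rough sketch of the k-means idea (my own illustration, not Mark's code), you can treat each pixel as a point in RGB space and take the cluster centers as the palette. The file name below is a placeholder for whatever frame you want to sample.

```r
library(jpeg)

# Read a frame as a height x width x 3 array of RGB values in [0, 1].
img <- readJPEG("still.jpg")

# Flatten to one row per pixel with R, G, B columns.
pixels <- data.frame(
  r = as.vector(img[, , 1]),
  g = as.vector(img[, , 2]),
  b = as.vector(img[, , 3])
)

# Cluster pixels in RGB space; the five cluster centers act as the palette.
km <- kmeans(pixels, centers = 5)
palette <- rgb(km$centers[, "r"], km$centers[, "g"], km$centers[, "b"])
palette
```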


Trying out AutoML in R

JLaw calls a timeout:

In this fourth (and hopefully final) entry in my “Icing the Kicker” series of posts, I’m going to jump back to the first post where I used tidymodels to predict whether or not a kick attempt would be iced. However, this time I see if using the h2o AutoML feature and the SuperLearner package can improve the predictive performance of my initial model.

The results are just about what I would have expected: they provide a good floor but a human with knowledge of the data and skill with techniques can still beat out-of-the-box AutoML processes. Still, knowing what that floor is can help a lot: run some AutoML tool for a few minutes/hours/days and you have an easy way of letting the business side know the expected model quality. If AutoML already exceeds expectations, you’re golden. If AutoML is close to expectations (on either end, just above or just below), you as a skilled human should be able to improve things a bit more, especially once you have a chance to analyze what the AutoML processes did. If AutoML is way below business expectations of quality, perhaps this isn’t the best project to spend time on. H/T R-Bloggers.
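For reference, the basic shape of an h2o AutoML run looks roughly like this. This is a generic sketch using mtcars as stand-in data, not JLaw's actual setup.

```r
library(h2o)

h2o.init()

# Placeholder data: mtcars with transmission type (am) as a binary outcome.
# Swap in your own data frame and response column.
train <- as.h2o(mtcars)
train$am <- as.factor(train$am)  # classification needs a factor response

aml <- h2o.automl(
  y = "am",
  training_frame = train,
  max_runtime_secs = 60,  # a short budget just to establish a baseline "floor"
  seed = 2022
)

aml@leaderboard  # ranks every model AutoML tried
```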


Playing with gganimate

Tomaz Kastrun tries out gganimate:

I firmly believe that animation and transition between different data states can give end-users much better insights and understanding of the data, than a single table with data points or correlation metrics.

With the help of ggplot and gganimate, you can quickly create an animation based on your needs. This is a simple iris dataset example.
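A minimal iris example along those lines (my own sketch, not necessarily Tomaz's exact code):

```r
library(ggplot2)
library(gganimate)

# Animate the sepal measurements, transitioning between the three species.
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, colour = Species)) +
  geom_point(size = 3) +
  transition_states(Species, transition_length = 2, state_length = 1) +
  labs(title = "Species: {closest_state}")

animate(p)  # renders the animation; anim_save("iris.gif") writes it to disk
```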

You can find more at the gganimate website. The real downside is that I don’t think it’s being maintained any longer, as the last commit was a year ago.


Comparing R Package Versions with Diffify

Clarissa Barratt and Parisa Gregg announce an interesting tool:

You know that sinking feeling that you get when you’re months into a big project and you log in one day and nothing works? Turns out something has updated and things have been removed that you needed and now you need to spend hours-days figuring out what’s changed and your masters deadline is getting closer and … ok, apparently this took me back to a very specific event.

But I’m sure most of that sounds familiar to you if you’ve ever programmed something over a longer period of time.

Over the last few months, Jumping Rivers have been working on a tool that will make it easier to see differences between R package versions: Diffify.

Read on to see it in action. It looks quite useful for troubleshooting issues in which a package suddenly changed API functionality, something which tends to happen frequently in the R and Python worlds.


Combining flashlight and plotly in R

Michael Mayer analyzes candidate models:

Since almost all plots in flashlight are constructed with ggplot, it is super easy to turn them into interactive plotly objects: just add a simple ggplotly() to the end of the call.

However… it is not straightforward to show interactive plots in a blog! Thus, we show only screenshots of the resulting plots here and refer to the complete HTML report here: https://mayer79.github.io/flashlight_plotly/flashlight_plotly.html

We will use a sweet dataset with more than 20’000 houses to model house prices by a set of derived features such as the logarithmic living area. The location will be represented by the postal code.

Click through for the blog post or check out the report.
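Here is the pattern from the quote, sketched with a toy linear model. The flashlight calls are based on my reading of the package's documented interface, so treat the details as approximate rather than as Michael's code.

```r
library(flashlight)
library(ggplot2)
library(plotly)

# Toy model: predict miles per gallon from weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Wrap the model in a flashlight, then build a profile plot for one feature.
fl <- flashlight(model = fit, data = mtcars, y = "mpg", label = "lm")
p <- plot(light_profile(fl, v = "wt"))  # a regular ggplot object

# As the post notes: make it interactive by passing it to ggplotly().
ggplotly(p)
```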
