Picking An R Package Name

Kevin Feasel


Naming, R

Marcelo Perlin has fun looking at package names in CRAN:

Looking at package names, one strategy that I commonly observe is to use small words, a verb or noun, and add the letter R to it. A good example is dplyr. Letter d stands for dataframe, ply is just a tool, and R is, well, you know. In a conventional sense, the name of this popular tool is informative and easy to remember. As always, the extremes are never good. A couple of bad examples of package naming are A3, AF, BB and so on. Googling the package name is definitely not helpful. On the other end, packagesamplesizelogisticcasecontrol provides a lot of information but it is plain unattractive!

Another strategy that I also find interesting is developers using names that, on first sight, are completely unrelated to the purpose of the package. But, there is a not so obvious link. One example is package sandwich. At first sight, I challenge anyone to figure out what it does. This is an econometric package that computes robust standard errors in a regression model. These robust estimates are also called sandwich estimators because the formula looks like a sandwich. But, you only know that if you studied a bit of econometric theory. This strategy works because it is easier to remember things that surprise us. Another great example is package janitor. I’m sure you already suspect that it has something do to with data cleaning. And you are right! The message of the name is effortless and it works! The author even got the privilege of using letter R in the name.

Marcelo uses word and character analysis to come up with his conclusions, making this a good way of seeing how to graph and slice data. h/t R-bloggers

Related Posts


John Mount explains the vtreat package that he and Nina Zumel have put together: When attempting predictive modeling with real-world data you quicklyrun into difficulties beyond what is typically emphasized in machine learning coursework: Missing, invalid, or out of range values. Categorical variables with large sets of possible levels. Novel categorical levels discovered during test, cross-validation, or […]

Read More

R 3.4.4 Now Available

David Smith notes that R 3.4.4 is now generally available: R 3.4.4 has been released, and binaries for Windows, Mac, Linux and now available for download on CRAN. This update (codenamed “Someone to Lean On” — likely a Peanuts reference, though I couldn’t find which one with a quick search) is a minor bugfix release, and shouldn’t cause […]

Read More