Category: R

NYC Open Data R Package

Antoine Soetewey announces a package:

I am pleased to announce the release of nycOpenData, an R package providing convenient, tidy access to dozens of datasets from the New York City Open Data platform.

The package is designed as part of an open-science and reproducible-research effort, with the goal of lowering the friction between public data and statistical analysis—especially for teaching, exploratory research, and applied civic work.

It is available on CRAN, so it should be easy to grab. H/T R-Bloggers.

Operating on Distributions in R with distionary

Vincenzo Coia announces a new R package:

After passing through rOpenSci peer review, the distionary package is now available on CRAN. It allows you to make probability distributions quickly – either from a few inputs or from its built-in library – and then probe them in detail.

These distributions form the building blocks that piece together advanced statistical models with the wider probaverse ecosystem, which is built to release modelers from low-level coding so production pipelines stay human-friendly. Right now, the other probaverse packages are distplyr, allowing you to morph distributions into new forms, and famish, allowing you to tune distributions to data. Developed with risk analysis use cases like climate and insurance in mind, the same tools translate smoothly to simulations, teaching, and other applied settings.

Click through for an overview of the package.

Randomly Moving the Mouse Cursor in R

Tomaz Kastrun has been so busy, his screensaver never comes on, even when he’s out at lunch:

A new R package called LazyMouse, with a single function for randomly moving the mouse cursor in your favorite R IDE.

For every R developer, R data scientist, and everyday R user who also needs a break and does not want the computer to go into sleep mode.

Read on to see how it works. And jokes aside, there have been times in which I’ve wanted something like this to keep the screen from locking up or drives going to sleep when running heavy work overnight on a device I can physically control (i.e., not a workstation I’m leaving on at the office).

Draw Economist-Style Graphs in R

Ozancan Ozdemir replicates a style:

I think everyone agrees that the Economist magazine produces very well-designed graphics, sometimes the best in the world. The success of their graphs lies in their ability to explain complex matters in a simpler way by employing traditional data visualization techniques such as line graphs or bar plots. They put emphasis on the message they want to convey rather than the aesthetics of the graph itself. They also have a clear hierarchy in their plots and use colors, fonts, and lines which represent the brand identity of the magazine.

In this tutorial, we are going to create an Economist-style graph in R by using the ggplot2, ggthemes, showtext, ggtext, and grid packages. I am going to use a dataset that I have been collecting since 2014 about the poverty line and minimum wage in Turkey, but you can adapt this code to any dataset you want to visualize.

Click through to learn how.

Exploring Associations in R with AssociationExplorer

Antoine Soetewey announces a new tool:

I am pleased to announce the publication of our paper “AssociationExplorer: A user-friendly Shiny application for exploring associations and visual patterns” in the journal SoftwareX, together with the official release of the AssociationExplorer2 R package on CRAN.

Both the paper and the software are part of an open-science effort aimed at making exploratory data analysis more accessible to non-technical users.

Read on to learn more about the tool and how you can get it. H/T R-Bloggers.

Using Haskell for Data Science

Jonathan Carroll has my attention:

I’ve been learning Haskell for a few years now and I am really liking a lot of the features, not least the strong typing and functional approach. I thought it was lacking some of the things I missed from R until I found the dataHaskell (www.datahaskell.org) project.

There have been several attempts recently to enhance R with some strong types, e.g. vapour (vapour.run), typr (github.com), using {rlang}’s checks (josiahparry.com), and even discussions about implementations at the core level, e.g. in September 2025 (stat.ethz.ch), continued in November 2025 (stat.ethz.ch). While these try to bend R towards types, perhaps an all-in solution makes more sense.

In this post I’ll demonstrate some of the features and explain why I think it makes for a good (great?) data science language.

I’ve been a big fan of F# for data science work as well for similar reasons, so it was interesting to read this article on Haskell. H/T R-Bloggers.

Getting ML Services Running on SQL Server 2025

Greg Low takes a look at ML Services:

This is an update of a post that I wrote for SQL Server 2022. Unfortunately, those instructions needed to be updated, not because anything notable has changed in SQL Server 2025, but because the recent distribution of Python has changed. Thanks to Peter Bishop for reporting what was now missing.

I hope that the versions Greg mentions—R 4.2 and Python 3.10—aren’t the latest that SQL Server supports, because those are both woefully out of date. Python 3.10 came out almost 4 years ago and R 4.2 is almost 3 years old at this point.

Accessing Home Assistant’s InfluxDB Instance from R

Martin Stingl looks for some data:

I’m running a HomeAssistant instance at home. I’ve configured it to log data into an InfluxDB database, so I can retrieve historical data for analysis later on. In default mode HomeAssistant would aggregate historical data for storage reasons.

So now I want to access the InfluxDB database from R to perform custom analyses. HomeAssistant is still using InfluxDB version 1. To connect to InfluxDB from R, I thought I could use the influxdbr package. But I got some errors because this package seems to be outdated.

Read on for the error message and how Martin was able to get around this. H/T R-Bloggers.

Finding Substrings in Pi

Tomaz Kastrun has some fun with Pi:

We will do this in the following steps (for the word “eggs”):

  1. Encode the word EGGS to numbers. E = 5, G = 7, G = 7 and S = 19. Together concatenated we get the string of 57719.
  2. We store a veeery long string of PI number.
  3. Start looking in PI number for substring of “57719”.

Click through for the R code.
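
The three steps above can be sketched in a few lines. The linked post uses R; what follows is an illustrative Python sketch, not Tomaz's code. The function names (`encode_word`, `pi_decimals`, `find_in_pi`) are hypothetical, and rather than storing a long string of digits, the sketch assumes we generate them with Machin's formula using integer arithmetic.

```python
def encode_word(word):
    """Step 1: map each letter to its alphabet position (a=1 ... z=26) and concatenate."""
    return "".join(
        str(ord(ch) - ord("a") + 1)
        for ch in word.lower()
        if "a" <= ch <= "z"  # ignore anything that is not a plain letter
    )


def _arctan_inverse(x, unity):
    """Return arctan(1/x) scaled by `unity`, via the alternating Taylor series."""
    term = unity // x
    total = term
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def pi_decimals(n_digits, guard=10):
    """Step 2: return the first n_digits decimals of pi as a string.

    Uses Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239).
    The `guard` digits absorb truncation error from integer division.
    """
    unity = 10 ** (n_digits + guard)
    scaled_pi = 4 * (4 * _arctan_inverse(5, unity) - _arctan_inverse(239, unity))
    digits = str(scaled_pi)  # "3" followed by the decimal digits
    return digits[1 : n_digits + 1]


def find_in_pi(word, n_digits=10_000):
    """Step 3: 0-based offset of the encoded word in pi's decimals, or -1 if absent."""
    return pi_decimals(n_digits).find(encode_word(word))


if __name__ == "__main__":
    print(encode_word("eggs"))  # the encoded string from step 1
    print(find_in_pi("eggs"))   # offset within the first 10,000 decimals, or -1
```

A result of -1 only means the encoded string does not occur in the digits generated so far; raising `n_digits` widens the search, at the cost of more computation.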

Creating Your Own ggplot2 Geom

Isabella Velasquez is feeling creative:

If you use ggplot2, you are probably used to creating plots with geom_line() and geom_point(). You may also have ventured into the broader ggplot2 ecosystem to use geoms like geom_density_ridges() from ggridges or geom_signif() from ggsignif. But have you ever wondered how these extensions were created? Where did the authors figure out how to create a new geom? And, if the plot of your dreams doesn’t exist, how would you make your own?

Enter the exciting world of creating your own ggplot2 extensions.

The post looks a lot like a series of slides, and it takes you through the process of creating a new geom. H/T R-Bloggers.
