Press "Enter" to skip to content


David Kun introduces the R Database Layer:

It is important to note that the SQL statements generated in the background are not executed unless explicitly requested by the command Hence, you can merge, filter and aggregate your dataset on the database side and load only the result set into memory for R.

In general the design principle behind RDBL is to keep the models as close as possible to the usual data.frame logic, including (as shown later in detail) commands like aggregate, referencing columns by the \($\) operator and features like logical indexing using the \([]\) operator.

Check it out.  I’m not particularly excited about this for one simple reason:  SQL is a better data retrieval and connection DSL than an R-based mapper.  I get the value of sticking to one language as much as possible.  I also get that not all queries need to be well-optimized—for example, you might be running queries on a local machine or against a slice of data which is not connected to an operational production environment.  But I’m a big fan of using the right tool for the job, and the right tool for working with relational databases (and the “relational” part there is perhaps optional) is SQL.