rquery: Relational Algebra In R

Kevin Feasel



John Mount announces rquery:

rquery is Win-Vector LLC‘s currently in development big data query tool for R.

rquery supplies set of operators inspired by Edgar F. Codd‘s relational algebra (updated to reflect lessons learned from working with RSQL, and dplyr at big data scale in production).

As an example: rquery operators allow us to write our earlier “treatment and control” example as follows.

dQ <- d %.>% extend_se(., if_else_block( testexpr = "rand()>=0.5", thenexprs = qae( a_1 := 'treatment', a_2 := 'control'), elseexprs = qae( a_1 := 'control', a_2 := 'treatment'))) %.>% select_columns(., c("rowNum", "a_1", "a_2"))

It’s an interesting idea.

Related Posts

Beware Multi-Assignment dplyr::mutate() Statements

John Mount hits on an issue when using dplyr backed by a database in R: Notice the above gives an incorrect result: all of the x_i columns are identical, and all of the y_i columns are identical. I am not saying the above code is in any way desirable (though something like it does arise naturally in certain test […]

Read More

The Theory Behind cdata

John Mount has a video explaining the concepts behind cdata: We also have two really nifty articles on the theory and methods: Fluid data reshaping with cdata Coordinatized Data: A Fluid Data Specification Please give it a try! Click through for the video, which I found very helpful in tying together a number of data […]

Read More


December 2017
« Nov Jan »