WenSui Liu has a script to join tables together in SparkR:
# INNER JOIN
showDF(merge(sum1, sum2, by.x = "month1", by.y = "month2", all = FALSE))
showDF(join(sum1, sum2, sum1$month1 == sum2$month2, "inner"))
#+------+-------+------+-------+
#|month1|min_dep|month2|max_dep|
#+------+-------+------+-------+
#|     3|    -25|     3|    911|
#|     2|    -33|     2|    853|
#+------+-------+------+-------+
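The same two functions can express other join types. As a minimal sketch, assuming a local Spark installation, here is a left join written both ways. The `sum1` and `sum2` frames below are hypothetical stand-ins with made-up values, not the data from the original script.

```r
library(SparkR)
sparkR.session()  # assumes Spark is installed and reachable locally

# Hypothetical stand-ins for sum1 and sum2; values are illustrative only.
sum1 <- createDataFrame(data.frame(month1 = c(1, 2, 3), min_dep = c(-30, -33, -25)))
sum2 <- createDataFrame(data.frame(month2 = c(2, 3, 4), max_dep = c(853, 911, 960)))

# LEFT JOIN: keep every row of sum1; unmatched rows get NULLs for sum2's columns.
showDF(merge(sum1, sum2, by.x = "month1", by.y = "month2", all.x = TRUE))
showDF(join(sum1, sum2, sum1$month1 == sum2$month2, "left_outer"))

sparkR.session.stop()
```

`merge` follows base R conventions (`all`, `all.x`, `all.y`), while `join` takes a column expression and a join-type string, so the choice is mostly a matter of which style reads better in the surrounding code.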
There’s no commentary, so it’s all script all the time. H/T R-bloggers