My goal is to reproduce some of what I did in my Touching on Advanced Topics post. Originally, I wanted to replicate that analysis in its entirety using Zeppelin, but this proved to be pretty difficult, for reasons that I mention below. As a result, I was able to do only part of the work I had planned. I think a more seasoned R / SparkR practitioner could do what I wanted, but that’s not me, at least not today.
With that in mind, let’s start messing around.
SparkR is a bit of a mindset change from traditional R.