SparkR + Zeppelin

I take a look at using SparkR and Zeppelin:

My goal is to do some of the things that I did in my Touching on Advanced Topics post.  Originally, I wanted to replicate that analysis in its entirety using Zeppelin, but this proved to be pretty difficult, for reasons that I mention below.  As a result, I was only able to do some—but not all—of the anticipated work.  I think a more seasoned R / SparkR practitioner could do what I wanted, but that’s not me, at least not today.

With that in mind, let’s start messing around.

SparkR is a bit of a mindset change from traditional R.

Related Posts

From Excel to R: Three Examples

Abdul Majed Raja has a few examples of things which are easy to do in Excel and how you can do them in R: Create a difference variable between the current value and the next valueThis is also known as lead and lag – especially in a time series dataset this varaible becomes very important in feature engineering. In […]

Read More

Calculating AUC in R

Andrew Treadway shows how you can calculate Area Under the Curve in R: AUC is an important metric in machine learning for classification. It is often used as a measure of a model’s performance. In effect, AUC is a measure between 0 and 1 of a model’s performance that rank-orders predictions from a model. For […]

Read More

Categories

July 2016
MTWTFSS
« Jun Aug »
 123
45678910
11121314151617
18192021222324
25262728293031