Setting Up SparklyR In Azure

David Smith shows how you can spin up a Spark cluster in Azure and install SparklyR on top of it:

The SparklyR package from RStudio provides a high-level interface to Spark from R. This means you can create R objects that point to data frames stored in the Spark cluster and apply some familiar R paradigms (like dplyr) to the data, all the while leveraging Spark’s distributed architecture without having to worry about memory limitations in R. You can also access the distributed machine-learning algorithms included in Spark directly from R functions.

If you don’t happen to have a cluster of Spark-enabled machines set up in a nearby well-ventilated closet, you can easily set one up in your favorite cloud service. For Azure, one option is to launch a Spark cluster in HDInsight, which also includes the extensions of Microsoft ML Server. While this service recently had a significant price reduction, it’s still more expensive than running a “vanilla” Spark-and-R cluster. If you’d like to take the vanilla route, a new guide details how to set up Spark cluster on Azure for use with SparklyR.

Read on for more details.

Related Posts

Interactive ggplot Plots with plotly

Laura Ellis takes us through ggplotly: As someone very interested in storytelling, ggplot2 is easily my data visualization tool of choice. It is like the Swiss army knife for data visualization. One of my favorite features is the ability to pack a graph chock-full of dimensions. This ability is incredibly handy during the data exploration […]

Read More

Goodbye, gather and spread; Hello pivot_long and pivot_wide

John Mount covers a change in tidyr which mimics Mount and Nina Zumel’s pivot_to_rowrecs and unpivot_to_blocks functions in the cdata package: If you want to work in the above way we suggest giving our cdatapackage a try. We named the functions pivot_to_rowrecs and unpivot_to_blocks. The idea was: by emphasizing the record structure one might eventually internalize what the transforms […]

Read More

Categories

January 2018
MTWTFSS
« Dec Feb »
1234567
891011121314
15161718192021
22232425262728
293031