Launching A Sparklyr Cluster

David Smith shows how to launch a sparklyr cluster in Azure:

When you’re finished, shut down your cluster using the aztk spark cluster delete command. (While you can delete the nodes from the Pools view in the Azure portal, the command does some additional cleanup for you.) You’ll be charged for each node in the cluster at the usual VM rates for as long as the cluster is provisioned. (One cost-saving option is to use low-priority VMs for the nodes, for savings of up to 90% compared to the usual rates.)

That’s it! Once you get used to it, it’s all quick and easy; the longest part is waiting for the cluster to spin up in Step 5. This is just a summary, but for the full details, see the guide SparklyR on Azure with AZTK.

It’ll take a bit more than five minutes to get started, but it is a good sight easier than building the servers yourself.
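
To put the billing note in the quote into numbers, here is a rough sketch of the arithmetic. The node count, hourly rate, and run time are made-up figures for illustration (not actual Azure prices), and the 90% discount is the upper bound mentioned in the quote, so treat the result as a best case rather than a promise.

```python
def cluster_cost(nodes, hourly_rate, hours, discount=0.0):
    """Estimate the charge for a cluster billed per node for each hour it is provisioned.

    discount is the fraction taken off the usual VM rate (0.90 means 90% off).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return nodes * hourly_rate * hours * (1.0 - discount)


# Hypothetical cluster: 4 nodes at $0.20 per node-hour, left provisioned for 10 hours.
regular = cluster_cost(nodes=4, hourly_rate=0.20, hours=10)

# Same cluster on low-priority VMs, assuming the full "up to 90%" savings applies.
low_priority = cluster_cost(nodes=4, hourly_rate=0.20, hours=10, discount=0.90)

print(f"Regular VMs:      ${regular:.2f}")
print(f"Low-priority VMs: ${low_priority:.2f}")
```

The meter runs for as long as the cluster is provisioned, not just while jobs are running, which is why deleting the cluster when you are done matters as much as the VM type you pick.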
