Launching A Sparklyr Cluster

David Smith shows how to launch a sparklyr cluster in Azure:

When you’re finished, shut down your cluster using the aztk spark cluster delete command. (While you can delete the nodes from the Pools view in the Azure portal, the command does some additional cleanup for you.) You’ll be charged for each node in the cluster at the usual VM rates for as long as the cluster is provisioned. (One cost-saving option is to use low-priority VMs for the nodes, for savings of up to 90% compared to the usual rates.)

That’s it! Once you get used to it, it’s all quick and easy — the longest part is waiting for the cluster to spin up in Step 5. This is just a summary, but the full details see the guide SparklyR on Azure with AZTK.

It’ll take a bit more than five minutes to get started, but it is a good sight easier than building the servers yourself.

Related Posts

Databricks Runtime 5.2 Released

Nakul Jamadagni announces Databricks Runtime 5.2: Delta Time TravelTime Travel, released as an Experimental feature, adds the ability to query a snapshot of a table using a timestamp string or a version, using SQL syntax as well as DataFrameReader options for timestamp expressions.Sample codeSELECT count() FROM events TIMESTAMP AS OF timestamp_expressionSELECT count() FROM events VERSION AS OF version Time travel looks a bit like temporal tables in SQL Server.

Read More

Kafka And The Differing Aims Of Data Professionals

Kai Waehner argues that there is an impedence mismatch between data engineers, data scientists, and ML production engineers: Data scientists love Python, period. Therefore, the majority of machine learning/deep learning frameworks focus on Python APIs. Both the stablest and most cutting edge APIs, as well as the majority of examples and tutorials use Python APIs. […]

Read More

Categories

February 2018
MTWTFSS
« Jan Mar »
 1234
567891011
12131415161718
19202122232425
262728