Microsoft R Server On Spark

Kevin Feasel

2016-08-10

Cloud, R, Spark

Max Kaznady, et al, discuss using Microsoft R Server on Spark to perform rapid prototyping against the NYC Taxi dataset:

Once the cluster is created, you can connect to the edge node where MRS is already pre-installed by SSHing to r-server.YOURCLUSTERNAME-ssh.azurehdinsight.net with the credentials which you supplied during the cluster creation process. In order to do this in MobaXterm, you can go to Sessions, then New Sessions and then SSH.

The default installation of HDI Spark on Linux cluster does not come with RStudio Server installed on the edge node. RStudio Server is a popular open source integrated development environment (IDE) available for R that provides a browser-based IDE for use by remote clients. This tool allows you to benefit from all the power of R, Spark and Microsoft HDInsight cluster through your browser. In order to install RStudio you can follow the steps detailed in the guide, which reduces to running a script on the edge node.

If you’ve been meaning to get further into Spark & R, this is a great article to follow along with on your own.

Related Posts

Sizing Azure SQL Database

Arun Sirpal takes us through finding the right size for Azure SQL Database: Do you want to identify the correct Service Tier and Compute Size ( was once known as performance level) for your Azure SQL Database? How would you go about it? Would you use the DTU (Database Transaction Unit) calculator? What about the […]

Read More

Cleaning Up After Yourself in Azure Data Factory

Rayis Imayev shows how you can automatically delete old files in Azure Data Factory: File management may not be at the top of my list of priorities during data integration projects. I assume that once I learn enough about sourcing data systems and target destination platform, I’m ready to design and build a data integration […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031