Multiple R Studio Users On HDInsight

Kevin Feasel

2017-05-18

Hadoop, R

Xiaoyong Zhu shows how to set up additional R Studio users in an HDInsight cluster:

Basically speaking, the “http user” will be used to authenticate through the HDInsight gateway, which is used to protect the HDInsight clusters you created. This user is used to access the Ambari UI, YARN UI, as well as many other UI components.

The “ssh user” will be used to access the cluster through secure shell. This user is actually a user in the Linux system in all the head nodes, worker nodes, edge nodes, etc., so you can use secure shell to access the remote clusters.

For Microsoft R Server on HDInsight type cluster, it’s a bit more complex, because we put R Studio Server Community version in HDInsight, which only accepts Linux user name and password as login mechanisms (it does not support passing tokens), so if you have created a new cluster and want to use R Studio, you need to first login using the http user’s credential and login through the HDInsight Gateway, and then use the ssh user’s credential to login to RStudio.

It’s a good read and also includes a sample Spark-R job.

Related Posts

Timing R Function Calls

Colin Gillespie shows off an R package for benchmarking: Of course, it’s more likely that you’ll want to compare more than two things. You can compare as many function calls as you want with mark(), as we’ll demonstrate in the following example. It’s probably more likely that you’ll want to compare these function calls against more […]

Read More

From pandas to Spark with koalas

Achilleus tries out Koalas: Python is widely used programming language when it comes to Data science workloads and Python has way too many different libraries to back this fact. Most of the data scientists are familiar with Python and pandas mostly. But the main issue with Pandas is it works great for small and medium […]

Read More

Categories