Jupyter On ElasticMapReduce

Tom Zeng shows howt o install Jupyter Notebooks on Amazon’s ElasticMapReduce:

By default (with no --password and --port arguments), Jupyter will run on port 8888 with no password protection; JupyterHub will run on port 8000.  The --port and --jupyterhub-port arguments can be used to override the default ports to avoid conflicts with other applications.

The --r option installs the IRKernel for R. It also installs SparkR and sparklyr for R, so make sure Spark is one of the selected EMR applications to be installed. You’ll need the Spark application if you use the --toree argument.

If you used --jupyterhub, use Linux users to sign in to JupyterHub. (Be sure to create passwords for the Linux users first.)  hadoop, the default admin user for JupyterHub, can be used to set up other users. The –password option sets the password for Jupyter and for the hadoop user for JupyterHub.

Installation is fairly straightforward, and they include a series of samples you can get to try out Jupyter.

Related Posts

Using Azure Blob Storage Archive Tier For Archival Data

Bob Pusateri shows us how to configure Azure Blob Storage Archive Tier: Two of the products I use extensively for this purpose are Amazon Glacier and, more recently, Microsoft Azure Blob Storage Archive Tier. As happy as I’ve been with Amazon Glacier since its introduction in 2012, I always hoped Microsoft would offer a similar service. My wish […]

Read More

Replicating Data In HDFS Between Clusters

Murali Ramasami and Niru Anisetti have an article showing how to use the Hortonworks Data Lifecycle Manager to set up replication between two Hadoop clusters: Data Lifecycle Manager (DLM) delivers on the promise of location-agnostic, secure replication by encapsulating and copying data seamlessly across physical private storage and public cloud environments. This empowers businesses to […]

Read More


December 2016
« Nov Jan »