Comparing Hadoop On EC2 To EMR

Mark Litwintschik looks at EC2 versus EMR in terms of performance, specifically targeting a solution at less than $3 per hour:

The $3.00 price point was driven by the first method: running a single-node Hadoop installation. I wanted to make sure the dataset used in this benchmark could easily fit into memory.

The price set the limit for the second method: AWS EMR. This is Amazon’s Hadoop Platform offering. It has a huge feature set but the key one is that it lets you setup Hadoop clusters with very little instruction. The $3.00 price limit includes the service fee for EMR.

Note I’ll be running everything in AWS’ Irish region and the prices mentioned are region- and in the case of spot prices, time-specific.

I was a bit surprised at which service won.

Related Posts

Building TensorFlow Neural Networks On Spark With Keras

Jules Damji has an example of using the PyCharm IDE to use Keras to build TensorFlow neural network models on the Databricks MLflow library: Our example in the video is a simple Keras network, modified from Keras Model Examples, that creates a simple multi-layer binary classification model with a couple of hidden and dropout layers and […]

Read More

Creating SQL Server Images In Azure Container Registry

Andrew Pruski shows us how to save Docker container images to the Azure Container Registry using Powershell: Awesome! Our custom image is in our ACR! But has it worked? Has it really? Oh ye of little faith… I guess the only way to find out is to run a container! So let’s run a Azure […]

Read More

Categories

March 2018
MTWTFSS
« Feb Apr »
 1234
567891011
12131415161718
19202122232425
262728293031