The $3.00 price point was driven by the first method: running a single-node Hadoop installation. I wanted to make sure the dataset used in this benchmark could easily fit into memory.
The price set the limit for the second method: AWS EMR. This is Amazon’s Hadoop Platform offering. It has a huge feature set but the key one is that it lets you setup Hadoop clusters with very little instruction. The $3.00 price limit includes the service fee for EMR.
Note I’ll be running everything in AWS’ Irish region and the prices mentioned are region- and in the case of spot prices, time-specific.
I was a bit surprised at which service won.