Spark Clusters On Spot Pricing

Sameer Farooqui explains spot pricing with respect to AWS servers:

The idea behind Spot instances is to allow you to bid on spare Amazon EC2 compute capacity. You choose the max price you’re willing to pay per EC2 instance hour. If your bid meets or exceeds the Spot market price, you win the Spot instances. However, unlike traditional bidding, when your Spot instances start running, you pay the live Spot market price (not your bid amount). Spot prices fluctuate based on the supply and demand of available EC2 compute capacity and are specific to different regions and availability zones.

So, although you may have bid 0.55 cents per hour for a r3.2xlarge instance, you’ll end up paying only 0.10 cents an hour if that’s what the going rate is for the region and availability zone.

Databricks uses spot pricing for Community Edition clusters to control costs.  Click through for a very interesting discussion of spot pricing and how they take advantage of it.

Related Posts

Recommendations For Storage On Azure SQL DB Managed Instances

Dimitri Furman has some thoughts on database storage architecture for Azure SQL Database Managed Instances: MI GP uses Azure Premium Storage to store database files for all databases, except for the tempdb database. From the perspective of the database engine, this storage type is remote, i.e. it is accessed over the network, using Azure network infrastructure. To […]

Read More

Building TensorFlow Neural Networks On Spark With Keras

Jules Damji has an example of using the PyCharm IDE to use Keras to build TensorFlow neural network models on the Databricks MLflow library: Our example in the video is a simple Keras network, modified from Keras Model Examples, that creates a simple multi-layer binary classification model with a couple of hidden and dropout layers and […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930