Building A Hadoop Cluster

I have a post on building a five-node Hadoop cluster using Docker containers:

Notice how 3bd shows up for pretty much all of these services.  This is not what you’d want to do in a real production environment, but because we want to use Docker and easily pass ports through, it’s the simplest way for me to set this up.  If you knew beforehand which node would host which service, you could modify the run.sh batch script that we discussed earlier and open those specific ports.

After assigning masters, we next have to define which nodes are clients in which clusters.

Click through for a screenshot-laden walkthrough.

Related Posts

The Banality of Containers

Grant Fritchey points out that containers hosting SQL Server are, in and of themselves, banal: I’m only half joking when I say containers are boring and dull. They’re not. The technology is amazing. However, that technology doesn’t fundamentally change what you’re dealing with. It’s SQL Server. How do you capture detailed performance metrics in a […]

Read More

When Not to Use Spark

Ramandeep Kaur gives us several cases when it makes sense not to use Apache Spark: There can be use cases where Spark would be the inevitable choice. Spark considered being an excellent tool for use cases like ETL of a large amount of a dataset, analyzing a large set of data files, Machine learning, and […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930