Building A Hadoop Cluster

I have a post on building a five-node Hadoop cluster using Docker containers:

Notice how 3bd shows up for pretty much all of these services. This is not what you’d want to do in a real production environment, but because we want to use Docker and easily pass ports through to the host, it’s the simplest way for me to set this up. If you knew beforehand which node would host which service, you could modify the run.sh script that we discussed earlier and open those specific ports.
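As a rough sketch of what that change could look like (the script name matches the post, but the image name, hostnames, network name, and port choices below are all assumptions for illustration), run.sh might publish only the ports for the services each container will actually host:

```sh
#!/bin/bash
# Hypothetical variant of run.sh: publish only the ports each node's assigned
# services need, instead of routing everything through a single container.
# Image name, hostnames, network name, and port numbers are illustrative.

IMAGE=hadoop-base   # assumed base image built earlier in the walkthrough

# Shared network so the containers can resolve each other by hostname.
docker network create hadoop-net 2>/dev/null || true

# Node hosting the web UIs: expose Ambari (8080) and the NameNode UI (50070).
docker run -d --name hadoop-node1 --hostname hadoop-node1 \
    --network hadoop-net -p 8080:8080 -p 50070:50070 "$IMAGE"

# Remaining nodes: no host ports published; they only need cluster-internal traffic.
for i in 2 3 4 5; do
    docker run -d --name "hadoop-node$i" --hostname "hadoop-node$i" \
        --network hadoop-net "$IMAGE"
done
```

The point is simply that port publishing is fixed at docker run time, which is why knowing the service layout in advance matters.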

After assigning masters, we next have to define which nodes will act as clients for which services.
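The linked walkthrough handles this step through the install wizard’s checkboxes; as a purely illustrative alternative (assuming an Ambari-managed HDP cluster, which the assign-masters language suggests), the same host-to-component layout could be written down as an Ambari blueprint. The blueprint name, stack version, and component choices below are assumptions, not taken from the post:

```sh
#!/bin/bash
# Hypothetical sketch: express the master/worker/client layout as an Ambari
# blueprint rather than clicking through the wizard.
cat > five-node-blueprint.json <<'EOF'
{
  "Blueprints": { "blueprint_name": "five-node", "stack_name": "HDP", "stack_version": "2.5" },
  "host_groups": [
    {
      "name": "master",
      "cardinality": "1",
      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCEMANAGER" } ]
    },
    {
      "name": "workers",
      "cardinality": "4",
      "components": [
        { "name": "DATANODE" },
        { "name": "NODEMANAGER" },
        { "name": "HDFS_CLIENT" },
        { "name": "YARN_CLIENT" }
      ]
    }
  ]
}
EOF
# The file would then be registered with Ambari's blueprint API before creating
# the cluster; the wizard in the walkthrough accomplishes the same thing interactively.
```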

Click through for a screenshot-laden walkthrough.
