Single-Node Hadoop 3 Installation

Kevin Feasel



Mark Litwintschik has a fairly simple guide for installing Hadoop 3 on a single node for testing:

This post is meant to help people explore Hadoop 3 without feeling the need they should be using 50+ machines to do so. I’ll be using a fresh installation of Ubuntu 16.04.2 LTS on a single computer. The machine has an Intel Core i5-7300HQ CPU clocked at 2.50GHz, 8 GB of RAM and a 200 GB mechanical disk drive. I intentionally picked a low end machine to demonstrate not much is needed to try out Hadoop in a learning exercise.

Please do be mindful these instructions are aimed at building a test environment that is cut off from the outside world. Beyond the fact this is a single machine installation for software which is meant to run on multiple machines there would need to be significant content changes to turn these instructions into production installation notes.

It’s a useful guide if you’re not interested in going with one of the platform vendors like Hortonworks or Cloudera.


March 2018