HDFS Cheat Sheet

Tim Spann has produced a guide for HDFS shell commands:

There are number of commands that you may need to use for administrating your cluster if you are one of the administrators for your cluster.   If you are running your own personal cluster or Sandbox, these are also good to know and try. Do Not Try These In Production if you are not the owner and fully understand the dire consequences of these actions.   These commands will be affecting the entire Hadoop cluster distributed file system.  You can shutdown data nodes, add quotas to directories for various users and other administrative features.

Many of the commands in this list should be familiar if you know much about Linux or Unix administration, but there are some Hadoop-specific commands as well, like moveFromLocal and moveToLocal.

Related Posts

Kafka 2.3 and Kafka Connect Improvements

Robin Moffatt goes over improvements in Kafka Connect with the release of Apache Kafka 2.3: A Kafka Connect cluster is made up of one or more worker processes, and the cluster distributes the work of connectors as tasks. When a connector or worker is added or removed, Kafka Connect will attempt to rebalance these tasks. Before version 2.3 of Kafka, […]

Read More

The Databricks File System

Brad Llewellyn takes us through the Azure Databricks File System: Today, we’re going to talk about the Databricks File System (DBFS) in Azure Databricks.  If you haven’t read the previous posts in this series, Introduction, Cluster Creation and Notebooks, they may provide some useful context.  You can find the files from this post in our GitHub Repository.  Let’s move on […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930