Starting, Interacting and Examples of HDFS

This is a follow-up to the previous post:

Starting Hadoop Distributed File System

Now we will format the file system we just configured. Important: this step should only be performed once.

bin/hadoop namenode -format

Now we can start HDFS:

bin/start-dfs.sh

This will start the NameNode server on the master machine and also start the DataNodes on the slave machines. Note: in a single-machine instance, the NameNode and DataNode run on the same machine. In a clustered HDFS you will have to ssh into each slave and start its DataNode.
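Starting each slave's DataNode by hand amounts to looping over the slave list. Here is a minimal sketch; the slaves file location and the exact daemon-start command vary by Hadoop version, so this fabricates a throwaway list and only echoes what it would do:

```shell
# Sketch: iterate over a slaves list and show the per-host start step.
# A real cluster would read conf/slaves; /tmp/slaves.example is a stand-in.
printf 'slave1\nslave2\n' > /tmp/slaves.example

started=""
while read -r host; do
  # On a real cluster, replace the echo with something like:
  #   ssh "$host" "cd /path/to/hadoop && bin/hadoop-daemon.sh start datanode"
  echo "start DataNode on $host"
  started="$started $host"
done < /tmp/slaves.example
```

Because the redirected `while read` loop runs in the current shell, this also works as a template for collecting per-host results.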

Interacting with Hadoop Distributed File System

The bulk of the commands that communicate with the cluster are performed by a monolithic script named bin/hadoop. This script loads the Hadoop system into the Java virtual machine and executes a user command. The commands are specified in the following form:

bin/hadoop moduleName -cmd args...

moduleName : the subset of Hadoop functionality to use (e.g., dfs for the distributed file system)

cmd : the command to execute within that module
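To make the form concrete, the pieces of a typical invocation can be assembled explicitly. A sketch (it only echoes the assembled command line, since Hadoop may not be on this machine's PATH):

```shell
# Decompose a bin/hadoop invocation into its parts.
MODULE="dfs"   # moduleName: the Hadoop subsystem to address (the file system)
CMD="-ls"      # cmd: the command within that module
ARGS="/"       # args: arguments to the command (here the HDFS root)

# Assemble and show the full command line; on a machine with Hadoop
# installed you would run it directly instead of echoing it.
FULL="bin/hadoop $MODULE $CMD $ARGS"
echo "$FULL"
```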

Examples of Hadoop Distributed File System

Listing files:

 bin/hadoop dfs -ls /

Insert Data into the Cluster (3 steps):
Step 1: Create a user directory in HDFS

bin/hadoop dfs -mkdir /user/username

Step 2: Put the file into the cluster

bin/hadoop dfs -put /home/username/DataSet.txt /user/username/

Step 3: Verify the file is in HDFS

bin/hadoop dfs -ls /user/username
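The three steps above can be chained into one small script. This sketch only echoes each command through a run helper, since a live HDFS may not be available; on a configured cluster, change run to actually execute. USERNAME and the dataset path are placeholders:

```shell
# Dry-run sketch of the three-step upload. To execute for real, change
# run() to invoke "$@" instead of echoing it.
USERNAME="username"                  # placeholder user name
DATASET="/home/$USERNAME/DataSet.txt"

run() { echo "$@"; }                 # dry-run helper: print the command

run bin/hadoop dfs -mkdir "/user/$USERNAME"           # step 1: create the directory
run bin/hadoop dfs -put "$DATASET" "/user/$USERNAME/" # step 2: upload the file
run bin/hadoop dfs -ls "/user/$USERNAME"              # step 3: verify
```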

Uploading multiple files at once (specify a directory to upload):

bin/hadoop dfs -put /myfiles /user/username

Note: -copyFromLocal is a synonym for -put.

Display Files in HDFS:

bin/hadoop dfs -cat file

(this prints the file's contents to the console)

Copy File from HDFS:

bin/hadoop dfs -get file localfile

