Apache Hadoop HDFS Commands

8 June 2017

This post describes some of the basic Apache Hadoop HDFS commands one would need when working in a Hadoop Cluster.

1.Create a directory in HDFS at given path(s)

$hadoop fs -mkdir <paths>  //Syntax
$hadoop fs -mkdir /home/hduser/dir1 /home/hduser/dir2
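
In a script it is often handy to create a directory only when it is missing. Below is a minimal sketch, assuming a Hadoop 2.x or later client where -mkdir accepts -p (create missing parent directories). The function name ensure_hdfs_dir is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# ensure_hdfs_dir PATH...
# Hypothetical helper: create each HDFS directory (and any missing parents)
# unless it already exists.
ensure_hdfs_dir() {
    for dir in "$@"; do
        if hadoop fs -test -d "$dir"; then
            echo "exists: $dir"
        else
            # -p is assumed to be available (Hadoop 2.x and later)
            hadoop fs -mkdir -p "$dir" && echo "created: $dir"
        fi
    done
}
```

It could be called as: ensure_hdfs_dir /home/hduser/dir1 /home/hduser/dir2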

2.List the contents of a directory in HDFS.

$hadoop fs -ls /home/hduser
//List a directory recursively, including all of its subdirectories (-lsr is deprecated in Hadoop 2.x and later in favor of -ls -R).

~$hadoop fs -ls -R /home/hduser

3.Upload a file to HDFS from the local file system.

//Copy a single source file, or multiple source files, from the local file system to the Hadoop Distributed File System

$hadoop fs -put <local file system source> ... <HDFS_dest_Path> //Syntax

$hadoop fs -put /home/hduser/HadoopJob/input/74-0.txt /user/hduser/input

//With a relative path (use "./" for a relative path)
$hadoop fs -put ./HadoopJob/input/accesslogs.log /user/hduser/input
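
The upload above can be wrapped so that several local files are copied in one call and the destination is listed afterwards. This is only a sketch, assuming a Hadoop 2.x or later client where -put accepts -f (overwrite an existing destination file). The name upload_to_hdfs is hypothetical.

```shell
#!/bin/sh
# upload_to_hdfs DEST SRC...
# Hypothetical helper: copy one or more local files into the HDFS directory
# DEST, overwriting existing copies, then list DEST to confirm the upload.
upload_to_hdfs() {
    dest=$1
    shift
    # -f is assumed to be available (Hadoop 2.x and later)
    hadoop fs -put -f "$@" "$dest" && hadoop fs -ls "$dest"
}
```

It could be called as: upload_to_hdfs /user/hduser/input ./HadoopJob/input/74-0.txt ./HadoopJob/input/accesslogs.log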

4.Download a file from HDFS to the local file system

$hadoop fs -get <hdfs_source> <localdestination>  //Syntax
$hadoop fs -get /home/hduser/dir3/file1.txt /home/

5.Display the contents of a file

$hadoop fs -cat /home/hduser/dir1/abc.txt

6.Copy a file from the local file system to HDFS

//Similar to -put, except that the source is restricted to a local file reference. Multiple sources (files or directories) are allowed, in which case the destination must be a directory.

$hadoop fs -copyFromLocal <localsrc> URI   //Syntax
$hadoop fs -copyFromLocal /home/hduser/abc.txt  /home/hduser/abc.txt

7.Move file from source to destination.

$hadoop fs -mv <src> <dest>   //Syntax
$hadoop fs -mv /home/hduser/dir2/abc.txt /home/hduser/dir1

8.a.Remove a file or directory in HDFS.

Removes the files specified as arguments. A non-empty directory is not removed by this form; use the recursive version below.

Usage :
$hadoop fs -rm <argument>
$hadoop fs -rm /home/hduser/dir1/abc.txt

8.b.Recursive version of delete.

Usage :
$hadoop fs -rm -r <arg>   //-rmr is deprecated in Hadoop 2.x and later; -rm -r is the equivalent
$hadoop fs -rm -r /home/hduser/dir2

9.Display last few lines of a file.

Displays the last kilobyte of the file, similar to the tail command in Unix.

$hadoop fs -tail /home/hduser/dir1/abc.txt

10.Display the aggregate length or disk usage of a file or HDFS path

$hadoop fs -du /home/hduser/dir1/abc.txt

//Disk usage of an HDFS directory
~$hadoop fs -du <Directory Path>
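
To find what takes up the most space, the -du output can be sorted on its first column, which is the size in bytes. A small sketch follows. The helper name largest_hdfs_entries is hypothetical, and it assumes paths without embedded newlines.

```shell
#!/bin/sh
# largest_hdfs_entries DIR [N]
# Hypothetical helper: print the N largest entries directly under DIR
# (default 5), largest first, based on the first column of -du.
largest_hdfs_entries() {
    dir=$1
    n=${2:-5}
    hadoop fs -du "$dir" | sort -k1,1nr | head -n "$n"
}
```

It could be called as: largest_hdfs_entries /home/hduser 10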

11.Count the number of directories, files and bytes under a path

~$hadoop fs -count <Filepath>
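
The output of -count has four columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The sketch below labels them with awk. The function name hdfs_count_summary is hypothetical, and the parsing assumes the path contains no spaces.

```shell
#!/bin/sh
# hdfs_count_summary PATH
# Hypothetical helper: print the -count columns with labels.
hdfs_count_summary() {
    hadoop fs -count "$1" |
        awk '{ printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}
```

It could be called as: hdfs_count_summary /home/hduser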

12.Empty the Trash

~$hadoop fs -expunge

13.Concatenate the files in an HDFS source directory into a single file on the local file system

~$hadoop fs -getmerge <HDFS source path> <Local file system Destination path>
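
A common use of -getmerge is to collect the part files of a job's output directory into one local file. The sketch below does that and then reports how many lines were merged. The name merge_job_output is hypothetical.

```shell
#!/bin/sh
# merge_job_output HDFS_DIR LOCAL_FILE
# Hypothetical helper: merge every file under HDFS_DIR into LOCAL_FILE,
# then print the number of lines in the merged file.
merge_job_output() {
    src=$1
    dest=$2
    hadoop fs -getmerge "$src" "$dest" && wc -l < "$dest"
}
```

It could be called as: merge_job_output /user/hduser/output ./result.txt (both paths here are only placeholders).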

14.Output a source file in text format.

 ~$hadoop fs -text <Source Path>
 The allowed formats are zip and TextRecordInputStream.

15.Create a file of zero length

~$hadoop fs -touchz <path>

16.Check whether a file, path or directory exists

~$hadoop fs -test -[ezd] <pathname>  //Syntax
  hadoop fs -test -e <pathname>
  hadoop fs -test -z <pathname>
  hadoop fs -test -d <pathname>
  -e: checks whether the path exists; returns 0 if true
  -z: checks whether the file is zero length; returns 0 if true
  -d: checks whether the path is a directory; returns 0 if true
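
Because -test reports its result only through the exit code, it is mostly useful inside a shell condition. A minimal sketch that combines the three flags is shown below. The function name describe_hdfs_path is hypothetical.

```shell
#!/bin/sh
# describe_hdfs_path PATH
# Hypothetical helper: classify PATH using the exit codes of
# hadoop fs -test (0 means the check is true).
describe_hdfs_path() {
    path=$1
    if ! hadoop fs -test -e "$path"; then
        echo "missing"
    elif hadoop fs -test -d "$path"; then
        echo "directory"
    elif hadoop fs -test -z "$path"; then
        echo "empty file"
    else
        echo "file"
    fi
}
```

It could be called as: describe_hdfs_path /home/hduser/dir1/abc.txt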

17.Return the stat information for a path

$hadoop fs -stat <local or HDFS path name>
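
The -stat command also accepts an optional format string before the path. The sketch below uses %n (name), %b (size), %r (replication) and %y (modification date). Which format specifiers are supported depends on the Hadoop version, so treat this selection as an assumption to check against your client. The function name hdfs_stat_line is hypothetical.

```shell
#!/bin/sh
# hdfs_stat_line PATH
# Hypothetical helper: print selected stat fields for PATH on one line.
# The format specifiers are assumed to be supported by the installed client.
hdfs_stat_line() {
    hadoop fs -stat "name=%n size=%b replication=%r modified=%y" "$1"
}
```

It could be called as: hdfs_stat_line /home/hduser/dir1/abc.txt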

18.Display file system capacity and free space in bytes

~$hadoop fs -df <Directory Path>

19.Check application logs using the application ID

//Check all logs
~$yarn logs -applicationId <Application ID>

To view a specific log type for a running application, use the command below:

~$yarn logs -applicationId <Application ID> -log_files <log_file_type>

To view only the stderr error logs

~$yarn logs -applicationId <Application ID> -log_files stderr

The -log_files option also supports Java regular expressions, so the following command returns all types of log files (the pattern is quoted so that the shell does not expand it):

~$yarn logs -applicationId <Application ID> -log_files ".*"
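
Application logs can be long, so it often helps to save them to a file and search that file. The sketch below saves the stderr log and counts the lines that mention an exception. The name save_app_errors and the default output file name are hypothetical.

```shell
#!/bin/sh
# save_app_errors APP_ID [OUT_FILE]
# Hypothetical helper: write the stderr log of a YARN application to
# OUT_FILE and print how many lines mention "exception" (case-insensitive).
save_app_errors() {
    app_id=$1
    out_file=${2:-$app_id.stderr.log}
    yarn logs -applicationId "$app_id" -log_files stderr > "$out_file" || return 1
    grep -c -i 'exception' "$out_file"
}
```

It could be called as: save_app_errors <Application ID> ./app.stderr.log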

Disable NameNode safe mode

The command below takes the NameNode out of safe mode. It can be executed only by a Hadoop admin or the Hadoop operations team.

sudo su hdfs -l -c 'hdfs dfsadmin -safemode leave'
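
Before forcing the NameNode out of safe mode, it is worth checking whether it is in safe mode at all. The sketch below uses hdfs dfsadmin -safemode get for that. It assumes the status text contains "Safe mode is ON" when safe mode is active, and that it is run as a user with HDFS admin rights. The function name leave_safemode_if_on is hypothetical.

```shell
#!/bin/sh
# leave_safemode_if_on
# Hypothetical helper: leave safe mode only when the NameNode reports
# that it is currently on. The matched status text is an assumption.
leave_safemode_if_on() {
    state=$(hdfs dfsadmin -safemode get)
    case "$state" in
        *"Safe mode is ON"*)
            hdfs dfsadmin -safemode leave ;;
        *)
            echo "nothing to do: $state" ;;
    esac
}
```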


