hadoop dfs is the command used to execute dfs commands. The full syntax is:

hadoop dfs [-fs [local | <namenode:port>]] [-conf <configuration file>] [-D <property=value>] [-ls <path>] [-lsr <path>] [-du <path>] [-dus <path>] [-mv <src> <dst>] [-cp <src> <dst>] [-rm <src>] [-rmr <src>] [-put <localsrc> <dst>] [-copyFromLocal <localsrc> <dst>] [-moveFromLocal <localsrc> <dst>] [-get <src> <localdst>] [-getmerge <src> <localdst> [addnl]] [-cat <src>] [-copyToLocal <src> <localdst>] [-moveToLocal <src> <localdst>] [-mkdir <path>] [-setrep [-R] <rep> <path/file>] [-help [cmd]]

-fs [local | <namenode:port>]: If not specified, the current configuration is used. It is read from the following files, in increasing order of precedence:

  • hadoop-default.xml inside the hadoop jar file
  • hadoop-default.xml in $HADOOP_CONF_DIR
  • hadoop-site.xml in $HADOOP_CONF_DIR

local means use the local file system as your DFS; <namenode:port> specifies a particular name node to contact. This argument is optional, but if used it must appear first on the command line.
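The precedence rule above can be pictured as a layered merge in which each later file overrides the earlier ones. The following is only a sketch of that idea, not Hadoop's actual configuration loader; the function name, property names and values are placeholders chosen for illustration.

```python
def effective_config(*layers):
    """Merge configuration layers; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical contents of the three files, in increasing precedence.
jar_defaults = {"fs.default.name": "local", "dfs.replication": "3"}
conf_dir_defaults = {"dfs.replication": "2"}
site_overrides = {"fs.default.name": "namenode.example.com:9000"}

config = effective_config(jar_defaults, conf_dir_defaults, site_overrides)
print(config)
```

In this model a value given with -fs would simply be applied last, since it replaces whatever file system the configuration files name.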

In addition to -fs, exactly one command argument (such as -ls or -put) must be specified.
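When hadoop dfs is driven from a script, the ordering rule is easy to get wrong. The helper below is a hypothetical sketch (build_dfs_argv is not part of Hadoop) that assembles an argument list with -fs placed first, followed by exactly one command and its operands. It only builds the list and does not run anything.

```python
def build_dfs_argv(command, *operands, fs=None):
    """Build an argument list for `hadoop dfs`, with -fs (if any) first."""
    if not command.startswith("-"):
        raise ValueError("command must be an option such as -ls or -put")
    argv = ["hadoop", "dfs"]
    if fs is not None:
        # -fs is optional, but when present it must precede the command.
        argv += ["-fs", fs]
    argv.append(command)
    argv.extend(operands)
    return argv


print(build_dfs_argv("-ls", "/user/alice", fs="local"))
# prints ['hadoop', 'dfs', '-fs', 'local', '-ls', '/user/alice']
```

The resulting list could be handed to subprocess.run on a machine where the hadoop script is on the PATH.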

A word about paths: a path may be relative or absolute. An absolute path starts with a '/'; a relative path does not, and is always resolved against /user/<currentUser>. There is no notion of a current working directory.
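The path rule can be written down in a few lines. This is a minimal sketch of the rule as stated here, not the code Hadoop uses; resolve_dfs_path is a hypothetical helper, and treating an empty path as the user's directory is an assumption.

```python
import posixpath


def resolve_dfs_path(path, current_user):
    """Resolve a DFS path: absolute paths are kept, relative ones go under /user/<currentUser>."""
    home = posixpath.join("/user", current_user)
    if not path:
        # Assumption: no path at all means the user's own directory.
        return home
    if path.startswith("/"):
        return path
    return posixpath.join(home, path)


print(resolve_dfs_path("data/in.txt", "alice"))  # prints /user/alice/data/in.txt
print(resolve_dfs_path("/tmp/in.txt", "alice"))  # prints /tmp/in.txt
```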

-ls <path>: List the contents that match the specified file pattern. If path is not specified, the contents of /user/<currentUser> will be listed. The output contains one line of the form

  • Found n items

followed by one line per directory and one line per file. Directory entries are of the form

  • dirName (full path)   <dir>

and file entries are of the form

  • fileName (full path)   <r n>   size

where n is the number of replicas specified for the file and size is the size of the file, in bytes.
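Scripts that consume this listing usually need to split it back into fields. The parser below is a sketch built only from the line shapes described above; the exact whitespace between fields is an assumption, and parse_ls and the sample listing are hypothetical.

```python
import re

# Line shapes taken from the description above; spacing is assumed.
FILE_LINE = re.compile(r"^(?P<path>.+?)\s+<r (?P<replicas>\d+)>\s+(?P<size>\d+)$")
DIR_LINE = re.compile(r"^(?P<path>.+?)\s+<dir>$")


def parse_ls(output):
    """Return (declared item count, list of entry dicts) for an -ls listing."""
    lines = [line.strip() for line in output.strip().splitlines()]
    header = re.match(r"^Found (\d+) items$", lines[0]) if lines else None
    if header is None:
        raise ValueError("missing 'Found n items' header")
    entries = []
    for line in lines[1:]:
        match = FILE_LINE.match(line)
        if match:
            entries.append({
                "path": match.group("path"),
                "is_dir": False,
                "replicas": int(match.group("replicas")),
                "size": int(match.group("size")),
            })
            continue
        match = DIR_LINE.match(line)
        if match:
            entries.append({"path": match.group("path"), "is_dir": True})
            continue
        raise ValueError("unrecognised line: " + line)
    return int(header.group(1)), entries


sample = """Found 2 items
/user/alice/logs   <dir>
/user/alice/data.txt   <r 3>   1024
"""
count, entries = parse_ls(sample)
print(count, entries)
```

The same two patterns would also cover -lsr output if the header check were skipped, since that listing omits the first line.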

-lsr <path>: Recursively list the contents that match the specified file pattern. Behaves very similarly to hadoop dfs -ls, except that the first line (Found n items) is omitted, and data is shown for all the entries in the subtree.

-du <path>: Show the amount of space, in bytes, used by the files that match the specified file pattern. Equivalent to the Unix command du -sb <path>/* in case of a directory, and to du -b <path> in case of a file. The output is in the form

  • name (full path)  size (in bytes)

-dus <path>: Show the amount of space, in bytes, used by the files that match the specified file pattern. Equivalent to the Unix command du -sb <path>. The output is in the form

  • name (full path) size (in bytes)
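The difference between -du and -dus is easiest to see on a small example. The sketch below models a file system as a dictionary from file path to size and follows the Unix equivalences given above: du reports one total per immediate child of a directory, dus reports a single total. The function names and the sample data are hypothetical, and only file bytes are counted.

```python
def du(files, path):
    """One total per immediate child of `path`, like `du -sb <path>/*`."""
    if path in files:
        # `path` names a single file, like `du -b <path>`.
        return {path: files[path]}
    prefix = path.rstrip("/") + "/"
    totals = {}
    for name, size in files.items():
        if name.startswith(prefix):
            child = prefix + name[len(prefix):].split("/", 1)[0]
            totals[child] = totals.get(child, 0) + size
    return totals


def dus(files, path):
    """A single total for `path`, like `du -sb <path>`."""
    if path in files:
        return {path: files[path]}
    prefix = path.rstrip("/") + "/"
    total = sum(size for name, size in files.items() if name.startswith(prefix))
    return {path: total}


files = {
    "/user/alice/a.txt": 100,
    "/user/alice/logs/x.log": 40,
    "/user/alice/logs/y.log": 60,
}
print(du(files, "/user/alice"))   # prints {'/user/alice/a.txt': 100, '/user/alice/logs': 100}
print(dus(files, "/user/alice"))  # prints {'/user/alice': 200}
```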

-mv <src> <dst>: Move the files that match the specified file pattern <src> to a destination <dst>. When moving multiple files, the destination must be a directory.

-cp <src> <dst>: Copy files that match the file pattern <src> to a destination. When copying multiple files, the destination must be a directory. Equivalent to the Unix command cp -rf <src> <dst>.
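The rule that several sources require a directory destination applies to both -mv and -cp. The sketch below expresses it as a check that maps each matched source to its target. plan_copy is a hypothetical helper, and placing each file under the destination directory by its base name mirrors Unix cp rather than anything stated here, so treat that part as an assumption.

```python
import posixpath


def plan_copy(sources, dst, dst_is_dir):
    """Map each matched source to its target path, enforcing the directory rule."""
    if len(sources) > 1 and not dst_is_dir:
        raise ValueError("destination must be a directory when copying multiple files")
    if dst_is_dir:
        # Assumption: files keep their base name inside the destination directory.
        return [(src, posixpath.join(dst, posixpath.basename(src))) for src in sources]
    return [(src, dst) for src in sources]


print(plan_copy(["/a/x.txt", "/a/y.txt"], "/backup", dst_is_dir=True))
# prints [('/a/x.txt', '/backup/x.txt'), ('/a/y.txt', '/backup/y.txt')]
```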

-rm <src>: Remove all files that match the specified file pattern. Equivalent to the Unix command rm <src>.

-rmr <src>: Remove all directories that match the specified file pattern. Equivalent to the Unix command rm -rf <src>.

-put <localsrc> <dst>: Copy a single file from the local file system into dfs.

-copyFromLocal <localsrc> <dst>: Identical to the -put command.

-moveFromLocal <localsrc> <dst>: Same as -put, except that the source is deleted after it is copied.

-get <src> <localdst>: Copy files that match the file pattern <src> to the local destination. <src> is kept. When copying multiple files, the destination must be a directory.

-cat <src>: Fetch all files that match the file pattern <src> and display their content on stdout.

-copyToLocal <src> <localdst>: Identical to the -get command.

-moveToLocal <src> <localdst>: Not implemented yet.

-mkdir <path>: Create a directory in the specified location.

-setrep [-R] <rep> <path/file>: Set the replication level of a file. The -R flag requests a recursive change of replication level for an entire directory tree.

-help [cmd]: Display help for the given command, or for all commands if no command is specified.

hadoop-0.1-dev/bin/hadoop_dfs (last edited 2009-09-20 23:54:10 by localhost)