Programmatically Reading from DFS

Occasionally it becomes desirable to read a large amount of data from DFS to configure a mapper or reducer. This can be done by reading a SequenceFile from the DFS during the mapper's or reducer's configure() call.

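import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Inside configure(JobConf conf). These calls throw IOException, which
// configure() cannot declare, so wrap them in a try/catch.
FileSystem fs = FileSystem.get(conf);
Path path = new Path("/user/jhebert/out/part-00000");
SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);

Text key = new Text();
Text value = new Text();
while (reader.next(key, value)) {
    // Do something useful with the data.
}
reader.close();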
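In practice the read loop often just fills an in-memory lookup table that map() or reduce() consults for each record. The following is a minimal sketch of that pattern using only the Java standard library, so it runs without a cluster. PairReader, ListPairReader and SideDataSketch are hypothetical names invented for this illustration, not Hadoop classes; PairReader merely imitates the shape of reader.next(key, value).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SequenceFile.Reader: fills the key/value holders
// and returns false at end of data, mirroring reader.next(key, value).
interface PairReader {
    boolean next(StringBuilder key, StringBuilder value);
}

// Hypothetical in-memory reader, used here in place of a file on DFS.
class ListPairReader implements PairReader {
    private final Iterator<String[]> it;

    ListPairReader(List<String[]> pairs) {
        this.it = pairs.iterator();
    }

    public boolean next(StringBuilder key, StringBuilder value) {
        if (!it.hasNext()) {
            return false;
        }
        String[] pair = it.next();
        key.setLength(0);
        key.append(pair[0]);
        value.setLength(0);
        value.append(pair[1]);
        return true;
    }
}

class SideDataSketch {
    // Drains the reader once (the configure() step) into a lookup table.
    static Map<String, String> load(PairReader reader) {
        Map<String, String> table = new HashMap<String, String>();
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        while (reader.next(key, value)) {
            // Copy out of the reused holders before the next call overwrites them.
            table.put(key.toString(), value.toString());
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<String[]>();
        pairs.add(new String[] {"us", "United States"});
        pairs.add(new String[] {"fr", "France"});
        Map<String, String> table = load(new ListPairReader(pairs));
        // The map()/reduce() step: a cheap lookup per record.
        System.out.println(table.get("fr"));
    }
}
```

The copy inside the loop matters with the real API as well: the reader refills the same Text objects on every call to next(), so storing key and value directly would leave every map entry pointing at the last record read.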


HadoopMapReduceSecondaryData (last edited 2009-09-20 23:55:09 by localhost)