Differences between revisions 1 and 2
Revision 1 as of 2006-03-06 23:36:02
Size: 830
Editor: JeffRitchie
Comment: page created
Revision 2 as of 2009-09-20 23:09:48
Size: 830
Editor: localhost
Comment: converted to 1.6 markup

"merge" is an alias for "org.apache.nutch.indexer.IndexMerger"

Merges several segment indexes into a single index.

Usage

  • nutch-0.8-dev/bin/nutch org.apache.nutch.indexer.IndexMerger [-workingdir <workingdir>] <outputIndex> <indexesDir> ...

    • [-workingdir <workingdir>]: Specifies a working directory for the merger located at <workingdir>.
      <outputIndex>: Path to a directory where the merged index will be created.
      <indexesDir>: Path to a directory containing indexes to merge. More than one directory may be specified.
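
The usage line above can be sketched as a small dry-run script. This is only an illustration: the directory names (/tmp/merge-work, crawl/index-merged, crawl/indexes-a, crawl/indexes-b) and the helper merge_cmd are hypothetical, and it uses the "merge" alias in place of the full class name. It prints the command instead of running it, so the arguments can be checked first.

```shell
#!/bin/sh
# Hypothetical paths for illustration only; adjust them to your installation.
NUTCH_BIN="nutch-0.8-dev/bin/nutch"   # launcher path as shown in the usage line
WORKDIR="/tmp/merge-work"             # assumed working directory for the merger
OUTPUT="crawl/index-merged"           # assumed directory for the merged index

# merge_cmd (hypothetical helper): print the merge command for the given
# index directories. One or more <indexesDir> arguments may be passed.
merge_cmd() {
  echo "$NUTCH_BIN merge -workingdir $WORKDIR $OUTPUT $*"
}

# Dry run: show the command that would merge two index directories.
merge_cmd crawl/indexes-a crawl/indexes-b

# To actually run the merge, uncomment the next line:
# "$NUTCH_BIN" merge -workingdir "$WORKDIR" "$OUTPUT" crawl/indexes-a crawl/indexes-b
```

Printing the command first is a convenience of this sketch and not part of the tool itself.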

Configuration Files

  • hadoop-default.xml
    hadoop-site.xml
    nutch-default.xml
    nutch-site.xml

Other Files

  • None.

Caveats and Notes

  • The index.done file is not created.

DevelopmentCommandLineOptions

nutch-0.8-dev/bin/nutch_merge (last edited 2009-09-20 23:09:48 by localhost)