Differences between revisions 3 and 4
Revision 3 as of 2006-10-10 22:04:20
Size: 710
Editor: JacobBrunson
Comment: Changed invertlinks page to reflect true commandline options
Revision 4 as of 2009-09-20 23:10:03
Size: 711
Editor: localhost
Comment: converted to 1.6 markup
Line 8: Line 8:
-  '''<linkdb>:''' Path to the link database.[[BR]]
-  '''<segment>:''' Path to the segment that has been fetched. A directory or more than one segment may be specified.[[BR]]
+  '''<linkdb>:''' Path to the link database.<<BR>>
+  '''<segment>:''' Path to the segment that has been fetched. A directory or more than one segment may be specified.<<BR>>
Line 12: Line 12:
- hadoop-default.xml[[BR]]
- hadoop-site.xml[[BR]]
- nutch-default.xml[[BR]]
- nutch-site.xml[[BR]]
+ hadoop-default.xml<<BR>>
+ hadoop-site.xml<<BR>>
+ nutch-default.xml<<BR>>
+ nutch-site.xml<<BR>>
Line 23: Line 23:
- [:08CommandLineOptions:Commandline] options for version 0.8
+ [[08CommandLineOptions|Commandline]] options for version 0.8

"invertlinks" is an alias for "org.apache.nutch.crawl.LinkDb"

Usage

  • nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.LinkDb <linkdb> (-dir segmentsDir | segment1 segment2 ...)

    • <linkdb>: Path to the link database.
      <segment>: Path to a segment that has been fetched. Either give a directory of segments with -dir segmentsDir, or list one or more segment paths (segment1 segment2 ...).
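
The two argument forms in the usage line above can be sketched as a small shell helper. This is only an illustration: the function name invertlinks_cmd, the NUTCH_BIN variable, and the crawl/... paths are hypothetical and not part of Nutch. The helper prints the command line rather than running it, so it can be tried without a Nutch installation. It uses the "invertlinks" alias instead of the full class name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): prints, but does not run,
# an invertlinks command line. NUTCH_BIN and all paths are assumptions.
NUTCH_BIN="${NUTCH_BIN:-bin/nutch}"

invertlinks_cmd() {
    # Need a linkdb plus at least one segment argument.
    if [ "$#" -lt 2 ]; then
        echo "usage: invertlinks_cmd <linkdb> (-dir segmentsDir | segment1 segment2 ...)" >&2
        return 1
    fi
    linkdb="$1"
    shift
    # The -dir form takes exactly one directory after the flag.
    if [ "$1" = "-dir" ] && [ "$#" -ne 2 ]; then
        echo "error: -dir expects exactly one segments directory" >&2
        return 1
    fi
    echo "$NUTCH_BIN invertlinks $linkdb $*"
}

# Form 1: a directory of segments.
invertlinks_cmd crawl/linkdb -dir crawl/segments
# Form 2: one or more individual segments.
invertlinks_cmd crawl/linkdb crawl/segments/seg1 crawl/segments/seg2
```

To run the command for real, replace the final echo in the function with a direct invocation of "$NUTCH_BIN" using the same arguments.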

Configuration Files

  • hadoop-default.xml
    hadoop-site.xml
    nutch-default.xml
    nutch-site.xml

Other Files

  • None.

Caveats and Notes

  • None.

Commandline options for version 0.8

nutch-0.8-dev/bin/nutch_invertlinks (last edited 2009-09-20 23:10:03 by localhost)