Differences between revisions 1 and 2
Revision 1 as of 2006-03-04 21:47:46
Size: 1371
Editor: 24-241-218-184
Comment: New Page - Removed Options no longer in nutch script.
Revision 2 as of 2006-05-10 16:29:30
Size: 1792
Editor: wg-ro-lvlc
Comment:
Line 6: Line 6:
||["nutch-0.8-dev/bin/nutch crawl"]||One-step crawler for intranets||
||["nutch-0.8-dev/bin/nutch readdb"]||Read / dump crawldb||
||["nutch-0.8-dev/bin/nutch readlinkdb"]||Read / dump linkdb||
||["nutch-0.8-dev/bin/nutch inject"]||Inject new urls into the crawldb||
||["nutch-0.8-dev/bin/nutch generate"]||Generate new segments to fetch||
||["nutch-0.8-dev/bin/nutch fetch"]||Fetch a segment's pages||
||["nutch-0.8-dev/bin/nutch parse"]||Parse contents in one segment||
||["nutch-0.8-dev/bin/nutch segread"]||Read data in an existing segment||
||["nutch-0.8-dev/bin/nutch updatedb"]||Updates the crawldb from a segment||
||["nutch-0.8-dev/bin/nutch invertlinks"]||Create or update a linkdb from a segment or segments||
||["nutch-0.8-dev/bin/nutch index"]||Run the indexer on a segment's fetcher output||
||["nutch-0.8-dev/bin/nutch merge"]||Merge several segment indexes||
||["nutch-0.8-dev/bin/nutch dedup"]||Deletes duplicate documents in a set of segment indexes||
||["nutch-0.8-dev/bin/nutch plugin"]||Load a plugin and run one of its classes main()||
||["nutch-0.8-dev/bin/nutch server"]||Run a search server||
||["nutch-0.8-dev/bin/nutch crawl"]||One-step crawler for intranets.||
||["nutch-0.8-dev/bin/nutch readdb"]||Read / dump crawldb.||
||["nutch-0.8-dev/bin/nutch readlinkdb"]||Read / dump linkdb.||
||["nutch-0.8-dev/bin/nutch inject"]||Inject new urls into the crawldb.||
||["nutch-0.8-dev/bin/nutch generate"]||Generate new segments to fetch.||
||["nutch-0.8-dev/bin/nutch fetch"]||Fetch a segment's pages.||
||["nutch-0.8-dev/bin/nutch parse"]||Parse contents in one segment.||
||["nutch-0.8-dev/bin/nutch segread"]||Read data in an existing segment.||
||["nutch-0.8-dev/bin/nutch updatedb"]||Updates the crawldb from a segment.||
||["nutch-0.8-dev/bin/nutch invertlinks"]||Create or update a linkdb from a segment or segments.||
||["nutch-0.8-dev/bin/nutch index"]||Run the indexer on a segment's fetcher output.||
||["nutch-0.8-dev/bin/nutch merge"]||Merge several segment indexes.||
||["nutch-0.8-dev/bin/nutch mergedb"]||Merge several crawldb-s together. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch mergelinkdb"]||Merge several linkdb-s together. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch mergesegs"]||Merge several input segments into one or more output segments. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch dedup"]||Deletes duplicate documents in a set of segment indexes.||
||["nutch-0.8-dev/bin/nutch plugin"]||Load a plugin and run one of its classes main().||
||["nutch-0.8-dev/bin/nutch server"]||Run a search server.||

Command Line Options of nutch-0.8-dev/bin/nutch

See each entry for details of the command arguments and options.

||command||function||
||["nutch-0.8-dev/bin/nutch crawl"]||One-step crawler for intranets.||
||["nutch-0.8-dev/bin/nutch readdb"]||Read / dump crawldb.||
||["nutch-0.8-dev/bin/nutch readlinkdb"]||Read / dump linkdb.||
||["nutch-0.8-dev/bin/nutch inject"]||Inject new urls into the crawldb.||
||["nutch-0.8-dev/bin/nutch generate"]||Generate new segments to fetch.||
||["nutch-0.8-dev/bin/nutch fetch"]||Fetch a segment's pages.||
||["nutch-0.8-dev/bin/nutch parse"]||Parse contents in one segment.||
||["nutch-0.8-dev/bin/nutch segread"]||Read data in an existing segment.||
||["nutch-0.8-dev/bin/nutch updatedb"]||Updates the crawldb from a segment.||
||["nutch-0.8-dev/bin/nutch invertlinks"]||Create or update a linkdb from a segment or segments.||
||["nutch-0.8-dev/bin/nutch index"]||Run the indexer on a segment's fetcher output.||
||["nutch-0.8-dev/bin/nutch merge"]||Merge several segment indexes.||
||["nutch-0.8-dev/bin/nutch mergedb"]||Merge several crawldb-s together. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch mergelinkdb"]||Merge several linkdb-s together. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch mergesegs"]||Merge several input segments into one or more output segments. Can be used for filtering out specific content.||
||["nutch-0.8-dev/bin/nutch dedup"]||Deletes duplicate documents in a set of segment indexes.||
||["nutch-0.8-dev/bin/nutch plugin"]||Load a plugin and run one of its classes main().||
||["nutch-0.8-dev/bin/nutch server"]||Run a search server.||
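
For anyone scripting these commands, the following is a minimal sketch (not part of Nutch) of a helper that checks a command name against the commands listed above and builds the argument vector for one invocation. The names `NUTCH_SCRIPT`, `KNOWN_COMMANDS` and `nutch_argv` are hypothetical, and the script path is simply the one written on this page. No command arguments are assumed here; they are documented in each command's own entry.

```python
# Path as written on this page; adjust it to your own installation (assumption).
NUTCH_SCRIPT = "nutch-0.8-dev/bin/nutch"

# Command names copied from the list on this page, in the same order.
KNOWN_COMMANDS = (
    "crawl", "readdb", "readlinkdb", "inject", "generate", "fetch",
    "parse", "segread", "updatedb", "invertlinks", "index", "merge",
    "mergedb", "mergelinkdb", "mergesegs", "dedup", "plugin", "server",
)


def nutch_argv(command, *args):
    """Build the argument vector for one nutch call; nothing is executed.

    `args` are passed through untouched, so the caller is responsible for
    supplying the arguments described in that command's entry.
    """
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown nutch command: {command!r}")
    return [NUTCH_SCRIPT, command, *args]
```

The returned list can be handed to `subprocess.run` from the Python standard library, or joined with spaces to print the command line for review before running it.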

08CommandLineOptions (last edited 2011-07-18 14:56:29 by LewisJohnMcgibbney)