Differences between revisions 1 and 2
Revision 1 as of 2009-07-29 15:08:17
Size: 1645
Editor: AlexMc
Comment: historical version of the generic command line page
Revision 2 as of 2009-09-20 23:09:32
Size: 1645
Editor: localhost
Comment: converted to 1.6 markup
In the listing below, the first block of rows is revision 1 (deleted lines) and the second block is revision 2 (added lines).
Line 6: Line 6:
||["bin/nutch admin"]||Web page and link database administration, including creation||
||["bin/nutch analyze"]||Adjust database link-analysis scoring||
||["bin/nutch crawl"]||Perform complete crawling and indexing of a set of root urls||
||["bin/nutch datanode"]||NDFS data node||
||["bin/nutch dedup"]||Deletes duplicate documents in a set of segment indexes||
||["bin/nutch fetch"]||Fetch a segment's pages||
||["bin/nutch fetchlist"]||Print the fetchlist of a segment||
||["bin/nutch generate"]||Generate new segments to fetch||
||["bin/nutch index"]||Run the indexer on a segment's fetcher output||
||["bin/nutch inject"]||Inject new urls into the web page and link database||
||["bin/nutch merge"]||Merge several segment indexes||
||["bin/nutch mergesegs"]||Merges multiple segments & removes duplicates||
||["bin/nutch namenode"]||NDFS name node||
||["bin/nutch ndfs"]||NDFS administrative access||
||["bin/nutch parse"]||Parse contents in one segment||
||["bin/nutch prune"]||Prunes existing Nutch indexes of unwanted content||
||["bin/nutch readdb"]||Read data from the web page and link db||
||["bin/nutch segread"]||Read data in an existing segment||
||["bin/nutch segslice"]||Divide data from one segment into several segments||
||["bin/nutch server"]||Run a search server of IPC connections||
||["bin/nutch updatedb"]||Updates the web page and link db from the segment fetcher output||
||[[bin/nutch_admin]]||Web page and link database administration, including creation||
||[[bin/nutch_analyze]]||Adjust database link-analysis scoring||
||[[bin/nutch_crawl]]||Perform complete crawling and indexing of a set of root urls||
||[[bin/nutch_datanode]]||NDFS data node||
||[[bin/nutch_dedup]]||Deletes duplicate documents in a set of segment indexes||
||[[bin/nutch_fetch]]||Fetch a segment's pages||
||[[bin/nutch_fetchlist]]||Print the fetchlist of a segment||
||[[bin/nutch_generate]]||Generate new segments to fetch||
||[[bin/nutch_index]]||Run the indexer on a segment's fetcher output||
||[[bin/nutch_inject]]||Inject new urls into the web page and link database||
||[[bin/nutch_merge]]||Merge several segment indexes||
||[[bin/nutch_mergesegs]]||Merges multiple segments & removes duplicates||
||[[bin/nutch_namenode]]||NDFS name node||
||[[bin/nutch_ndfs]]||NDFS administrative access||
||[[bin/nutch_parse]]||Parse contents in one segment||
||[[bin/nutch_prune]]||Prunes existing Nutch indexes of unwanted content||
||[[bin/nutch_readdb]]||Read data from the web page and link db||
||[[bin/nutch_segread]]||Read data in an existing segment||
||[[bin/nutch_segslice]]||Divide data from one segment into several segments||
||[[bin/nutch_server]]||Run a search server of IPC connections||
||[[bin/nutch_updatedb]]||Updates the web page and link db from the segment fetcher output||

Command Line Options of bin/nutch

See each entry for details of the command arguments and options.

||command||function||
||[[bin/nutch_admin]]||Web page and link database administration, including creation||
||[[bin/nutch_analyze]]||Adjust database link-analysis scoring||
||[[bin/nutch_crawl]]||Perform complete crawling and indexing of a set of root URLs||
||[[bin/nutch_datanode]]||NDFS data node||
||[[bin/nutch_dedup]]||Delete duplicate documents in a set of segment indexes||
||[[bin/nutch_fetch]]||Fetch a segment's pages||
||[[bin/nutch_fetchlist]]||Print the fetchlist of a segment||
||[[bin/nutch_generate]]||Generate new segments to fetch||
||[[bin/nutch_index]]||Run the indexer on a segment's fetcher output||
||[[bin/nutch_inject]]||Inject new URLs into the web page and link database||
||[[bin/nutch_merge]]||Merge several segment indexes||
||[[bin/nutch_mergesegs]]||Merge multiple segments and remove duplicates||
||[[bin/nutch_namenode]]||NDFS name node||
||[[bin/nutch_ndfs]]||NDFS administrative access||
||[[bin/nutch_parse]]||Parse contents in one segment||
||[[bin/nutch_prune]]||Prune existing Nutch indexes of unwanted content||
||[[bin/nutch_readdb]]||Read data from the web page and link database||
||[[bin/nutch_segread]]||Read data in an existing segment||
||[[bin/nutch_segslice]]||Divide data from one segment into several segments||
||[[bin/nutch_server]]||Run a search server of IPC connections||
||[[bin/nutch_updatedb]]||Update the web page and link database from the segment fetcher output||
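
The commands listed above can also be chained by hand instead of using the all-in-one `bin/nutch crawl`. The dry run below is a minimal sketch of one plausible ordering. The ordering is an assumption inferred from the command descriptions, not something this page states. The script only prints command names and runs nothing; arguments are left out on purpose, since they are covered on each command's own page. The `NUTCH` variable and the `plan` function are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints an ASSUMED order of bin/nutch steps; it executes nothing.

# Hypothetical override for the launcher path; defaults to bin/nutch.
NUTCH="${NUTCH:-bin/nutch}"

plan() {
    # Assumed order: create the database, inject URLs, generate a segment,
    # fetch it, update the database, index the segment, remove duplicates.
    # Arguments are intentionally omitted; see each command's page.
    for cmd in admin inject generate fetch updatedb index dedup; do
        echo "$NUTCH $cmd"
    done
}

plan
```

Printing the plan first makes it easy to review the sequence before filling in the real arguments for each step.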

07CommandLineOptions (last edited 2009-09-20 23:09:32 by localhost)