The page you were looking for has a similar name to the following pages:
-
bin/nutch dedup
(NUTCH)
The dedup command is an alias for the class org.apache.nutch.crawl.DeduplicationJob and has been available since Nutch 1.8 (not yet in Nutch 2.x as of May 2014)....
-
bin/nutch fetch
(NUTCH)
Fetch is an alias for org.apache.nutch.fetcher.Fetcher. This fetcher uses a well-known model of one producer (a QueueFeeder, https://wiki.apache.org/nutch/QueueFeeder) and many consumers (FetcherThreads, https://wiki.apache.org/nutch/FetcherThread)....
-
bin/nutch index
(NUTCH)
Pluggable Indexing: The index command (running org.apache.nutch.indexer.IndexingJob) takes the content from one or more segments and passes it to all enabled IndexWriter https://wiki.apache....
-
bin/nutch junit
(NUTCH)
JUnit is an alias for junit.textui.TestRunner (https://wiki.apache.org/nutch/TestRunner). This class takes one JUnit test class and runs it from the command line. Usage: bin/nutch junit <testClass> <testClass>: This is the test class you wish to execute, e.g. org....
-
bin/nutch parse
(NUTCH)
Parse is an alias for org.apache.nutch.parse.ParseSegment (https://wiki.apache.org/nutch/ParseSegment). Nutch 1.x: The class parses the contents of one segment. It assumes, under the given segment, the existence of ./fetcher_output/,...