Differences between revisions 1 and 2
Revision 1 as of 2011-10-24 11:13:37
Size: 383
Comment:
Revision 2 as of 2014-05-29 10:30:31
Size: 718
Editor: JulienNioche
Comment:
Line 1:
Revision 1: Parsechecker is an alias for org.apache.nutch.indexer.IndexingFiltersChecker
Revision 2: indexchecker is an alias for running the class org.apache.nutch.indexer.!IndexingFiltersChecker
Line 3:
Revision 1: This tool reads and parses an URL and then runs the indexers on it. Once complete, it displays the fields obtained and the first 100 characters of their value.
Revision 2: This tool fetches and parses an URL and then runs the indexing filters on it. Once complete, it displays the fields obtained and the first 100 characters of their value.
Line 13:
Revision 2 (added): The parameter ''-D doIndex=true'' can be specified either on the command line or in nutch-site.xml in order to send the document to the indexing backends. Those must be configured accordingly as described in the documentation for the [[https://wiki.apache.org/nutch/bin/nutch%20index|index command]].

indexchecker is an alias for running the class org.apache.nutch.indexer.IndexingFiltersChecker.

This tool fetches and parses a URL and then runs the indexing filters on it. Once complete, it displays the fields obtained and the first 100 characters of each field's value.

Usage:

bin/nutch indexchecker <url>

<url>: The URL you wish to run the indexing filters on.
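
As a minimal sketch, an invocation through the indexchecker alias might look like the following. The URL and the NUTCH_HOME default are placeholders rather than values taken from this page. The script runs the tool only when an executable bin/nutch is found, and otherwise just prints the command it would have run.

```shell
#!/bin/sh
# Placeholder values (assumptions, not from this page): adjust to your setup.
NUTCH_HOME="${NUTCH_HOME:-.}"
URL="http://example.com/"

# The command: the indexchecker alias followed by the URL to check.
CMD="bin/nutch indexchecker $URL"

if [ -x "$NUTCH_HOME/bin/nutch" ]; then
  # Run from the Nutch directory so the relative bin/nutch path resolves.
  (cd "$NUTCH_HOME" && $CMD)
else
  # No Nutch installation found here: only show what would be run.
  echo "$CMD"
fi
```

The tool then prints the indexed fields for that URL, each value cut to its first 100 characters as described above.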

The parameter -D doIndex=true can be specified either on the command line or in nutch-site.xml in order to send the document to the indexing backends. The backends must be configured as described in the documentation for the index command.
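
The two ways of setting doIndex could be sketched as follows. This assumes that the -D option is placed before the URL, in the style of Hadoop generic options; the page itself only says the parameter can be given on the command line. The URL is a placeholder. The script only prints the command and the property block, so it can be run safely without a Nutch installation.

```shell
#!/bin/sh
URL="http://example.com/"   # placeholder URL (assumption)

# Option 1: pass the parameter on the command line.
# Assumption: -D doIndex=true goes before the URL.
CMD="bin/nutch indexchecker -D doIndex=true $URL"
echo "$CMD"

# Option 2: set the same parameter in nutch-site.xml.
# This is the standard Hadoop-style property element; it belongs
# inside the <configuration> element of conf/nutch-site.xml.
PROPERTY=$(cat <<'EOF'
<property>
  <name>doIndex</name>
  <value>true</value>
</property>
EOF
)
echo "$PROPERTY"
```

Setting the value in nutch-site.xml makes it the default for every run, while the command-line form applies to a single invocation only.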


bin/nutch indexchecker (last edited 2014-05-29 10:30:31 by JulienNioche)