Pluggable Indexing

The index command (which runs org.apache.nutch.indexer.IndexingJob) takes the content of one or more segments and passes it to all enabled IndexWriter plugins, which send the documents to Solr, Elasticsearch, or other index back-ends.

Nutch 1.x

Usage: bin/nutch index <crawldb> [-linkdb <linkdb>] [-params k1=v1&k2=v2...] (<segment> ... | -dir <segments>) [-noCommit] [-deleteGone] [-filter] [-normalize] [-addBinaryContent] [-base64]
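
As an illustration of the usage line above, the following sketch indexes all segments in a directory. The paths crawl/crawldb, crawl/linkdb and crawl/segments are assumptions chosen for the example and are not prescribed by this page; adjust them to your own crawl layout. The command is printed first and only executed if bin/nutch exists in the current directory.

```shell
#!/bin/sh
# Hypothetical crawl layout: these three paths are assumptions for this example.
CRAWLDB="crawl/crawldb"
LINKDB="crawl/linkdb"
SEGMENTS="crawl/segments"

# Argument order follows the usage line: crawldb first, then the optional
# linkdb, then the segments directory, then optional flags.
CMD="bin/nutch index $CRAWLDB -linkdb $LINKDB -dir $SEGMENTS -filter -normalize"

# Show the command; run it only when a Nutch runtime is actually present.
echo "$CMD"
if [ -x bin/nutch ]; then
  $CMD
fi
```

To index individual segments instead of a whole directory, replace -dir $SEGMENTS with one or more segment paths, as the usage line allows.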

IndexWriter plugins must be enabled through the plugin.includes property. See IndexWriter for how to configure these plugins.

bin/nutch index (last edited 2018-07-26 12:25:09 by SebastianNagel)