bin/nutch inject

inject is an alias for org.apache.nutch.db.WebDBInjector

This class takes a flat file of URLs and adds them as entries to the web page and link database. It is useful for bootstrapping the system.

Usage: bin/nutch org.apache.nutch.db.WebDBInjector (-local | -ndfs <namenode:port>) <db_dir> (-urlfile <url_file> | -dmozfile <dmoz_file>) [-subset <subsetDenominator>] [-includeAdultMaterial] [-skew skew] [-noDmozDesc] [-topicFile <topic list file>] [-topic <topic> [-topic <topic> [...]]]
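As a rough illustration of the usage line above, the following sketch writes a small flat file of seed URLs and assembles an inject command for the local filesystem. The file name seeds.txt, the directory name db, the example URLs, and the one-URL-per-line layout are assumptions made for this example; they are not stated on this page. The command is only executed if bin/nutch exists in the current directory.

```shell
#!/bin/sh
# Sketch: bootstrap the database from a flat file of seed URLs.
# Assumptions (not from this page): the names seeds.txt and db,
# the example URLs, and one URL per line as the flat-file layout.

URL_FILE="seeds.txt"   # hypothetical <url_file>
DB_DIR="db"            # hypothetical <db_dir>

# Write the flat file, one URL per line.
printf '%s\n' \
  'http://www.apache.org/' \
  'http://example.org/' > "$URL_FILE"

# Assemble the invocation following the usage line:
# local filesystem (-local) and a plain URL file (-urlfile).
INJECT_CMD="bin/nutch inject -local $DB_DIR -urlfile $URL_FILE"
echo "$INJECT_CMD"

# Run the command only when a Nutch checkout is present here.
if [ -x bin/nutch ]; then
  $INJECT_CMD
fi
```

To target NDFS instead, the usage line indicates replacing -local with -ndfs <namenode:port>; to load a DMOZ dump, replace -urlfile <url_file> with -dmozfile <dmoz_file> and add any of the optional DMOZ-related flags.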


last edited 2006-01-09 22:42:54 by JerryRussell