bin/nutch updatedb

updatedb is an alias for org.apache.nutch.tools.UpdateDatabaseTool

This class takes the output of the fetcher and updates the page and link DBs accordingly. Eventually, as the database scales, this will be broken into several phases, each consuming and emitting batch files, but for now we're doing it all here.

Usage: bin/nutch org.apache.nutch.tools.UpdateDatabaseTool (-local | -ndfs <namenode:port>) [-max N] [-noAdditions] <db> <seg_dir> [ <seg_dir> ... ]
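As a rough illustration of the usage line above, the following sketch assembles an invocation against the local filesystem with a single segment. The paths "db" and "segments/20060109225122" are hypothetical placeholders, not names taken from this page. The script only prints the command rather than running it, so it can be checked before use.

```shell
#!/bin/sh
# Hypothetical paths (assumptions): substitute your own database and segment directories.
DB_DIR="db"
SEGMENT_DIR="segments/20060109225122"

# Follow the usage line: -local selects the local filesystem,
# followed by the database directory and one segment directory.
CMD="bin/nutch org.apache.nutch.tools.UpdateDatabaseTool -local $DB_DIR $SEGMENT_DIR"

# Dry run: print the assembled command instead of executing it.
echo "$CMD"
```

Since updatedb is described above as an alias for the same class, "bin/nutch updatedb" followed by the same arguments should be equivalent. Additional segment directories can be appended after the first one, as the usage line allows.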

CommandLineOptions

last edited 2006-01-09 22:51:22 by JerryRussell