invertlinks is an alias for org.apache.nutch.crawl.LinkDb.
This class maintains an inverted link map, listing the incoming links for each URL. Its declaration is:
public class LinkDb extends Configured implements Tool, Mapper<Text, ParseData, Text, Inlinks>
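To make the idea of an inverted link map concrete, here is a minimal sketch in plain Java. It is not Nutch's implementation (LinkDb runs as a Hadoop job over segment data). The class name InvertLinksSketch, the method invert, and the example URLs are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is not part of Nutch.
public class InvertLinksSketch {

    // Turn a map of "page -> outgoing links" into a map of "page -> incoming links".
    public static Map<String, List<String>> invert(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            String fromUrl = entry.getKey();
            for (String toUrl : entry.getValue()) {
                // Record fromUrl as one of the pages that link to toUrl.
                inlinks.computeIfAbsent(toUrl, k -> new ArrayList<>()).add(fromUrl);
            }
        }
        return inlinks;
    }

    public static void main(String[] args) {
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", Arrays.asList("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/", Arrays.asList("http://c.example/"));
        System.out.println(invert(outlinks));
    }
}
```

In this toy input, http://c.example/ ends up with two incoming links (from a and b), http://b.example/ with one, and http://a.example/ with none, so it does not appear in the result at all.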
Usage:
bin/nutch invertlinks <linkdb> (-dir <segmentsDir> | <seg1> <seg2> ...) [-force] [-noNormalize] [-noFilter]
<linkdb>: This should be the path to the output linkdb to create or update.
-dir <segmentsDir>: This corresponds to the parent directory containing several segments, OR
<seg1> <seg2> ...: A list of segment directories to create an inverted linkdb from.
[-force]: This argument forces an update even if the linkdb appears to be locked (CAUTION advised).
[-noNormalize]: Pass this option to skip normalization of link URLs. This gives an unaltered representation of the incoming links within the linkdb.
[-noFilter]: This parameter skips the currently configured URLFilters, so none of them are applied to link URLs.
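The effect of the last two options can be pictured with a simplified sketch in plain Java. It assumes that a link URL is normalized first and filtered second, and it uses a made-up normalizer (strip the fragment) and a made-up filter (reject .jpg links). The class LinkUrlPipelineSketch, its methods, and both rules are hypothetical. Real normalizers and URLFilters are plugins selected by the Nutch configuration.

```java
import java.util.Optional;

// Hypothetical illustration only; this is not part of Nutch.
public class LinkUrlPipelineSketch {

    // Made-up normalizer: drops the fragment part of the URL.
    static String normalize(String url) {
        int hash = url.indexOf('#');
        return hash >= 0 ? url.substring(0, hash) : url;
    }

    // Made-up filter: rejects links to JPEG images.
    static boolean accept(String url) {
        return !url.endsWith(".jpg");
    }

    // Returns the URL as it would be recorded, or empty if the filter rejected it.
    public static Optional<String> process(String url, boolean doNormalize, boolean doFilter) {
        String result = doNormalize ? normalize(url) : url;
        if (doFilter && !accept(result)) {
            return Optional.empty();
        }
        return Optional.of(result);
    }

    public static void main(String[] args) {
        String link = "http://example.com/page#top";
        System.out.println(process(link, true, true));   // both steps applied
        System.out.println(process(link, false, true));  // comparable to -noNormalize
        System.out.println(process("http://example.com/a.jpg", true, false)); // comparable to -noFilter
    }
}
```

With both steps switched on, the fragment is stripped and the .jpg link would be dropped. Switching a step off leaves the URL as it was found, or keeps a link that a filter would otherwise have removed.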