This diagram depicts most of the dependencies among the Nutch modules (packages in Java parlance). Modules in the same box are interdependent (for example, fetcher and parse each depend on the other) and dependent on modules in boxes below them (for example, fs depends on ipc, io, and util, but not net or plugin).

[Diagram: module boxes, flattened from the original table, listed top to bottom]

tools

JSP UI

indexer, searcher, analysis

Lucene

fetcher, parse

fs

ipc

io

util

(This text and diagram are up-to-date as of Nutch 0.5.)

A few things are omitted from this diagram. The html, pagedb, and linkdb packages are left out entirely as uninteresting. Two dependencies that do not fit the layering are also not shown. First, MAX_OUTLINKS_PER_PAGE is defined in tools.UpdateDatabaseTool but is used by several modules further down the stack. Second, util.ScoreStats depends on db, and therefore transitively on net and io (?), although nothing depends on it; it should probably be in tools instead of util.

Also, plugins provide much of Nutch's functionality, and they are completely omitted from this diagram.
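
The layering rule stated at the top of this page (a module may depend on modules in its own box or in boxes below it) can be sketched as a small check. This is only an illustration, not part of Nutch: the class name LayerCheck and the method isAllowed are hypothetical, and the box order is an assumption read from the flattened diagram, which may have lost side-by-side placement. The check says only whether the rule permits a dependency, not whether that dependency actually exists in the code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LayerCheck {
    // Boxes from top to bottom. ASSUMPTION: this order is read from the
    // flattened diagram; modules listed together are treated as one box.
    static final List<List<String>> BOXES = List.of(
        List.of("tools"),
        List.of("JSP UI"),
        List.of("indexer", "searcher", "analysis"),
        List.of("Lucene"),
        List.of("fetcher", "parse"),
        List.of("fs"),
        List.of("ipc"),
        List.of("io"),
        List.of("util"));

    // Maps each module name to the index of its box (0 = topmost).
    static final Map<String, Integer> DEPTH = new HashMap<>();
    static {
        for (int i = 0; i < BOXES.size(); i++) {
            for (String module : BOXES.get(i)) {
                DEPTH.put(module, i);
            }
        }
    }

    /** Returns true if the rule permits 'from' to depend on 'to':
     *  both in the same box, or 'to' in a box below 'from'. */
    static boolean isAllowed(String from, String to) {
        Integer f = DEPTH.get(from);
        Integer t = DEPTH.get(to);
        if (f == null || t == null) {
            throw new IllegalArgumentException(
                "unknown module: " + (f == null ? from : to));
        }
        return t >= f;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("fs", "ipc"));        // downward: permitted
        System.out.println(isAllowed("fetcher", "parse")); // same box: permitted
        System.out.println(isAllowed("parse", "fetcher")); // same box: permitted
        System.out.println(isAllowed("util", "tools"));    // upward: not permitted
    }
}
```

An upward reference such as a lower module using a constant defined in tools is the kind of dependency this check would reject, which is why such cases are called out separately above rather than drawn in the diagram.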
