Differences between revisions 251 and 252
Revision 251 as of 2012-08-13 08:31:36
Size: 5307
Editor: 89
Comment: fix previous added link
Revision 252 as of 2012-08-27 13:09:23
Size: 5506
Comment:
Line 4: Line 4:
Please contribute your knowledge about Nutch here! <<TableOfContents(3)>> Please contribute your knowledge about Nutch here! <<TableOfContents(4)>>
Line 13: Line 13:

==== Nutch 1.X tutorial(s) ====
Line 14: Line 16:

==== Nutch 2.X tutorial(s) ====
Line 15: Line 19:
 * [[http://nlp.solutions.asia/?p=180|Setting up Nutch 2.0 with MySQL to handle UTF-8]] - A step-by-step tutorial
 * [[http://www.covert.io/post/18414889381/accumulo-nutch-and-gora]] - Accumulo, Nutch, and Gora

==== Other Tutorial(s) ====
Line 69: Line 77:
 * [[http://nlp.solutions.asia/?p=180|Setting up Nutch 2.0 with MySQL to handle UTF-8]] - A step-by-step tutorial

Welcome to the Apache Nutch Wiki


Please contribute your knowledge about Nutch here!

Nutch Version Administration

Tutorials

Nutch 1.X tutorial(s)

  • NutchTutorial - How to configure Nutch to crawl in local mode and post to Apache Solr for search/index.

Nutch 2.X tutorial(s)

Other Tutorial(s)

  • Hadoop Tutorial - Because Nutch is based on Hadoop, it helps to have a good understanding of Hadoop.

  • Nutch Hadoop Tutorial - How to set up and run Nutch in deploy mode over a Hadoop cluster.

  • RunNutchInEclipse - How to configure, build, crawl and debug Nutch within Eclipse.

  • Intranet Document Search - Index and search Microsoft Office, PDF, and other documents in a file system hierarchy with a Solr backend.

Configuration

General Information

Nutch Development

Nutch 2.0

Pre Nutch 1.3 and Archive

How to edit this Wiki

This Wiki is a collaborative site; anyone can contribute and share:

  • Create an account by clicking the "Login" link at the top of any page, and picking a username and password.
  • Edit any page by pressing "Edit" at the top or the bottom of the page.

There are some conventions used on the Nutch wiki:

  • /!\ :TODO: /!\ is used to denote sections that definitely need to be cleaned up.

Some general info on using this Wiki Software:

FrontPage (last edited 2018-09-27 15:44:39 by RoannelFernandez)