Differences between revisions 7 and 8
Revision 7 as of 2006-04-21 19:43:54
Size: 387
Editor: DanielNaber
Comment: typos, update FAQ link
Revision 8 as of 2009-09-20 21:47:58
Size: 397
Editor: localhost
Comment: converted to 1.6 markup

How to index a web site

Lucene does not crawl web sites itself. To index a site you need a spider such as regain (http://regain.sourceforge.net), SearchBlox (http://www.searchblox.com) or Nutch (http://www.nutch.org) to fetch the pages.

HTTrack (http://www.httrack.com) is a useful, free spider with many features. Also see the Lucene FAQ (http://wiki.apache.org/jakarta-lucene/LuceneFAQ).
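
One possible workflow, which is an assumption and not something this page prescribes, is to let a spider such as HTTrack mirror the site to a local directory and then extract plain text from the saved HTML files before handing it to Lucene. The sketch below covers only that extraction step, using the Java standard library. The class MirrorTextExtractor and its methods are hypothetical helpers, not part of Lucene or HTTrack, and the regex-based tag stripping is deliberately crude; a real HTML parser would be more robust.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Lucene or any spider.
final class MirrorTextExtractor {

    // Crude HTML-to-text conversion (assumption: good enough for a sketch).
    static String stripHtml(String html) {
        return html
            // Drop script and style elements together with their bodies.
            .replaceAll("(?is)<(script|style)\\b.*?</\\1\\s*>", " ")
            // Drop HTML comments.
            .replaceAll("(?s)<!--.*?-->", " ")
            // Replace every remaining tag with a space.
            .replaceAll("(?s)<[^>]*>", " ")
            // Collapse runs of whitespace.
            .replaceAll("\\s+", " ")
            .trim();
    }

    // Walks a locally mirrored site and returns relative path -> extracted text
    // for every .html or .htm file found under mirrorRoot.
    static Map<String, String> extract(Path mirrorRoot) throws IOException {
        Map<String, String> docs = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(mirrorRoot)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                String name = p.getFileName().toString().toLowerCase();
                if (name.endsWith(".html") || name.endsWith(".htm")) {
                    // Assumption: pages are UTF-8; malformed bytes are replaced, not rejected.
                    String html = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                    String key = mirrorRoot.relativize(p).toString().replace('\\', '/');
                    docs.put(key, stripHtml(html));
                }
            }
        }
        return docs;
    }
}
```

Each map entry could then become one Lucene document, for example with the relative path stored as an identifier field and the text as an analyzed field, added through an IndexWriter. The exact field classes depend on the Lucene version in use, so they are left out here.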

IndexingWebPages (last edited 2009-09-20 21:47:58 by localhost)