Differences between revisions 9 and 10
Revision 9 as of 2006-03-22 00:04:29
Size: 3415
Comment:
Revision 10 as of 2009-09-20 23:09:51
Size: 3431
Editor: localhost
Comment: converted to 1.6 markup
Line 6:
[[MailTo(jerome.charron AT PASDEPOURRIELS gmail DOT com)]]
<<MailTo(jerome.charron AT PASDEPOURRIELS gmail DOT com)>>
Line 21:
   * '''TODO''': Replace the current XML descriptor by the [http://freedesktop.org/wiki/Standards_2fshared_2dmime_2dinfo_2dspec#head-0efc2e6be4c23b9a513d7ce0dcff8ed80e8912e7 Freedesktop shared-mime-info-spec] one
   * '''TODO''': Replace the current XML descriptor by the [[http://freedesktop.org/wiki/Standards_2fshared_2dmime_2dinfo_2dspec#head-0efc2e6be4c23b9a513d7ce0dcff8ed80e8912e7|Freedesktop shared-mime-info-spec]] one
Line 28:
     * See also Andrzej [http://www.nabble.com/Re%3A-lang-identifier-and-nutch-analyzer-in-trunk-p2533535.html comments] :
     * See also Andrzej [[http://www.nabble.com/Re%3A-lang-identifier-and-nutch-analyzer-in-trunk-p2533535.html|comments]] :
Line 46:
 * [http://microformats.org/ Microformats] HtmlParseFilter:
   * [http://microformats.org/wiki/rel-tag rel-tag] (see microformats-reltag plugin)
   * '''TODO''' [http://microformats.org/wiki/hreview hreview]
 * [[http://microformats.org/|Microformats]] HtmlParseFilter:
   * [[http://microformats.org/wiki/rel-tag|rel-tag]] (see microformats-reltag plugin)
   * '''TODO''' [[http://microformats.org/wiki/hreview|hreview]]
Line 50:
 * Nutch [http://fr.wikipedia.org/wiki/Nutch article] on french wikipedia.
 * Nutch [[http://fr.wikipedia.org/wiki/Nutch|article]] on french wikipedia.
Line 52:
   * Add a ''mini framework'' plugin for regular expression based URL Filters ([http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/plugin/lib-regex-filter/ lib-regex-filter])
   * Add a regex url filter implementation based on [http://www.brics.dk/automaton/ dk.brics.automaton] Finite-State Automata for Java.
   * Add a ''mini framework'' plugin for regular expression based URL Filters ([[http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/plugin/lib-regex-filter/|lib-regex-filter]])
   * Add a regex url filter implementation based on [[http://www.brics.dk/automaton/|dk.brics.automaton]] Finite-State Automata for Java.

Jerome Charron

<jerome.charron AT PASDEPOURRIELS gmail DOT com>

Activities

Nutch contributions


CategoryHomepage

JeromeCharron (last edited 2009-09-20 23:09:51 by localhost)