Differences between revisions 10 and 11
Revision 10 as of 2004-12-22 08:00:51
Size: 3720
Editor: NicolaKen
Comment: Mentors are created equal :-)
Revision 11 as of 2009-09-20 23:05:27
Size: 3730
Editor: localhost
Comment: converted to 1.6 markup

Proposal for new project Nutch

Doug Cutting -- cutting at apache dot org


(0) rationale

Nutch is web search software. It builds on the Apache Lucene search library, adding a crawler, web database (including full link graph), plugins for various document formats, user interface, etc. It is currently used by sites such as http://search.creativecommons.org/, http://library.cornell.edu/, and the Internet Archive.

Nutch is a two-year-old open source project, currently hosted at Sourceforge and backed by its own non-profit organization. The non-profit was founded in order to assign copyright, so that we could retain the right to change the license. We have now determined that the Apache license is the appropriate license for Nutch and no longer require the overhead of an independent non-profit organization. Nutch's board of directors and its developers have both been polled and support a move to the Apache Software Foundation.

We anticipate that Nutch will join the recently proposed search.apache.org top-level project, with Lucene and its various ports.

(0.1) criteria

Meritocracy:

Nutch's developers are already comfortable operating as a meritocracy. Nutch's current developer policies are a bit more informal than those of Apache, but, then, there have never been any notable conflicts to resolve.

Community:

Nutch has an established and active developer community.

Core Developers:

Nutch has four active committers who are experienced open source developers.

Alignment:

Nutch currently uses the following Apache projects: Ant, Lucene, Xerces, POI, Commons.

(0.2) warning signs

Orphaned products:

Nutch is not an orphan. It has the same corporate sponsors that it has always had.

Inexperience with open source:

Nutch's committers are experienced with open source.

Homogeneous developers:

Nutch's committers do not all share an employer or nation. All decisions are made openly on public mailing lists.

Reliance on salaried developers:

Nutch has no salaried developers.

No ties to other Apache products:

Nutch has strong ties to Lucene.

A fascination with the Apache brand:

Nutch has a strong brand already. Although the Apache brand will enhance it, the brand is not a primary motivation for Nutch to join Apache.

(1) scope of the subprojects

All code is currently licensed under a variant of the Apache License 1.0. The developers have approved a move to the Apache 2.0 license and a re-assignment of copyright to the Apache Software Foundation. We have signed Contributor License Agreements on file for all developers.

(3) identify the ASF resources to be created

(3.1) mailing list(s)

  • nutch-dev
  • nutch-commits
  • nutch-user
  • nutch-agent

(3.2) Subversion or CVS repositories

(3.3) Jira

  • Nutch (NUTCH)

(4) identify the initial set of committers

  • Doug Cutting (Lucene committer)
  • Michael Cafarella (current Nutch committer at Sourceforge)
  • Andrzej Bialecki (current Nutch committer at Sourceforge)
  • John Xing (current Nutch committer at Sourceforge)
  • Sami Siren (current Nutch committer at Sourceforge)

(5) identify apache sponsoring individual

NutchProposal (last edited 2009-09-20 23:05:27 by localhost)