Goal: Move Nutch 2.x from the existing Hadoop 1.x codebase to Hadoop 2.x.

This page is a proposal for GSoC 2015 related to issue NUTCH-1936.

Introduction



Methodology

Phase 1 (Learning & Experimenting):

Phase 2 (Coding):


2.3) Summary of the configuration changes I have observed

Phase 3 (Documentation):

  1. Documentation describing in detail the migration of the Hadoop framework in Apache Nutch.
  2. Detailed guide for setting up Hadoop 2.x with Nutch 2.x.

Timeline

Phase 1:

27 March - 20 April: I will familiarize myself thoroughly with the Nutch documentation and the Hadoop framework. Experimenting with Nutch through simple web crawls will give me hands-on exposure to Apache Nutch.




Phase 2:

1st June - 25th June: Coding work should be about half complete. Submission of the mid-term evaluation report. Discussion with my mentor regarding improvements.



Phase 3:



References: