Title :

GSoC 2017 Weekly Reports

 

Issue :

 

NUTCH-2369 - Graph Generator Tool for Nutch

Student :

Omkar Reddy - omkarr [at] apache dot org

Mentor :

Lewis John McGibbney

 


Week 1: Community Bonding Period (1st May – 7th May)

Previous Action Items : None

Weekly Activity :

  • Started upgrading the Nutch code base.
  • Partially updated crawldb.
  • Most of the time went into studying the required changes, since this was the beginning of the upgrade.

Next Week’s agenda : Complete upgrading crawldb

Mentor’s comments :


Week 2: Community Bonding Period (8th May – 14th May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

  • Blockers in the code prevented any commits this week.
  • Discussed the blockers with the mentor and resolved them.

Next Week’s agenda : Complete upgrading crawldb.

Mentor’s comments :


Week 3: Community Bonding Period (15th May – 21st May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

  • Continued upgrading crawldb.

Next Week’s agenda : Complete upgrading crawldb.

Mentor’s comments :


Week 4: Community Bonding Period (22nd May – 28th May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

  • Completed upgrading crawldb.

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 5: Coding Phase I (29th May – 4th June)

Previous Action Items : Upgrading the rest of the services in the Nutch code base.

Weekly Activity :

  • Completed upgrading hostdb, indexer and parser.
  • Completed updating webgraph and segment.

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 6: Coding Phase I (5th June – 11th June)

Previous Action Items : Upgrade the rest of the services in the Nutch code base.

Weekly Activity :

  • Completed upgrading Fetcher.

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 7: Coding Phase I (12th June – 18th June)

Previous Action Items : Upgrade the rest of the services in the Nutch code base.

Weekly Activity :

  • Completed updating tools.
  • Removed all wildcard imports of mapreduce from the Nutch code base.
  • Updated the NutchJob class to use Hadoop's Job.getInstance.

Next Week’s agenda : Fix all the errors introduced by the upgrade and start building the graph generator tool.

Mentor’s comments :


Week 8: Coding Phase I (19th June – 25th June)

Previous Action Items : Removing errors in the runtime build.

Weekly Activity :

  • Started working on errors occurring during the runtime build.
  • Added exception handling around running a Hadoop job.
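
The report does not show the exception handling that was added. As a rough illustration only, the sketch below assumes the usual pattern around Hadoop's Job.waitForCompletion(boolean), which returns false when a job fails and declares IOException, InterruptedException and ClassNotFoundException. CompletableJob, FixedJob and runJob are hypothetical stand-ins so the example runs without Hadoop on the classpath; they are not Nutch or Hadoop classes.

```java
import java.io.IOException;

class JobRunnerSketch {

  /** Hypothetical stand-in for the small part of Hadoop's Job used here. */
  interface CompletableJob {
    String getJobName();

    // Mirrors the checked exceptions declared by Hadoop's Job.waitForCompletion(boolean).
    boolean waitForCompletion(boolean verbose)
        throws IOException, InterruptedException, ClassNotFoundException;
  }

  /** Hypothetical test double that finishes immediately with a fixed result. */
  static final class FixedJob implements CompletableJob {
    private final String name;
    private final boolean succeeds;

    FixedJob(String name, boolean succeeds) {
      this.name = name;
      this.succeeds = succeeds;
    }

    @Override
    public String getJobName() {
      return name;
    }

    @Override
    public boolean waitForCompletion(boolean verbose) {
      return succeeds;
    }
  }

  /**
   * Runs the job and reports every failure mode as an IOException, so the
   * caller cannot silently continue after an unsuccessful job.
   */
  static void runJob(CompletableJob job) throws IOException {
    try {
      if (!job.waitForCompletion(true)) {
        throw new IOException("Job did not succeed: " + job.getJobName());
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag before translating the exception.
      Thread.currentThread().interrupt();
      throw new IOException("Job was interrupted: " + job.getJobName(), e);
    } catch (ClassNotFoundException e) {
      throw new IOException("Job classes could not be loaded: " + job.getJobName(), e);
    }
  }

  public static void main(String[] args) {
    try {
      runJob(new FixedJob("example-ok", true));
      runJob(new FixedJob("example-failing", false));
    } catch (IOException e) {
      System.err.println(e.getMessage());
    }
  }
}
```

Wrapping everything in one exception type is a design choice made for this sketch; rethrowing the original exceptions after logging them would work equally well.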

Next Week’s agenda : Remove all the errors in the runtime build and start updating tests and plugins.

Mentor’s comments :


Week 9: Coding Phase II (26th June – 2nd July)

Previous Action Items : Updating tests and plugins.

Weekly Activity :

  • Updated tests and plugins to use mapreduce.

Next Week’s agenda : Remove all the errors occurring during the test build.

Mentor’s comments :


Week 10: Coding Phase II (3rd July – 9th July)

Previous Action Items : Remove all the errors occurring during the test build.

Weekly Activity :

  • Removed all the errors occurring during the test build.
  • Started fixing failing tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 11: Coding Phase II (10th July – 16th July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Started fixing failing tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 12: Coding Phase II (17th July – 23rd July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.
  • Could not commit anything because I was still working out how to fix the tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 13: Coding Phase II (17th July – 23rd July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.
  • The code broke because of the test fixes and a merge issue, so worked on stabilising the runtime build.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 14: Coding Phase II (24th July – 30th July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 15: Coding Phase III (1st August – 7th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 16: Coding Phase III (8th August – 14th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.
  • Could not commit anything because I was stuck on TestGenerator; raised this with the mentor and got help with the issues.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 17: Coding Phase III (15th August – 21st August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.
  • Made some changes to Fetcher.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 18: Coding Phase III (22nd August – 28th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

  • Worked on fixing failing tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Conclusion

  • GSoC may be ending in a couple of days, but this project will continue for as long as it takes to complete.
  • We overestimated how much the milestones could cover but made good progress anyway. This upgrade is a good addition to the project.
  • The Graph Generator tool will be built on the foundation of this upgrade in the coming days.
  • I would like to thank my mentor and the Nutch dev community for their support with the issues I faced.