Title :

GSoC 2017 Weekly Reports

Issue :

NUTCH-2369 - Graph Generator Tool for Nutch

Student :

Omkar Reddy - omkarr [at] apache dot org

Mentor :

Lewis John McGibbney


Week 1: Community Bonding Period (1st May – 7th May)

Previous Action Items : None

Weekly Activity :

- Started upgrading the Nutch code base.

- Partially updated crawldb.

- Since this was the start of the upgrade, much of the time went into studying the required changes.

- Commits : Updating crawldb : part01

Next Week’s agenda : Complete upgrading crawldb

Mentor’s comments :


Week 2: Community Bonding Period (8th May – 14th May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

- Blockers in the code prevented any commits this week.

- Discussed the blockers with the mentor and resolved them.

Next Week’s agenda : Complete upgrading crawldb.

Mentor’s comments :


Week 3: Community Bonding Period (15th May – 21st May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

- Continued upgrading crawldb.

- Commits : Crawldb update : part02

Next Week’s agenda : Complete upgrading crawldb.

Mentor’s comments :


Week 4: Community Bonding Period (22nd May – 28th May)

Previous Action Items : Upgrading crawldb.

Weekly Activity :

- Completed upgrading crawldb.

- Commits : Updating Crawldb : part03

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 5: Coding Phase I (29th May – 4th June)

Previous Action Items : Upgrading the rest of the services in the Nutch code base.

Weekly Activity :

- Completed upgrading hostdb, indexer and parser.

- Completed updating webgraph and segment.

- Commits : Updating hostdb/ indexer/ and parse/ , Updating scoring/webgraph/ and segment/

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 6: Coding Phase I (5th June – 11th June)

Previous Action Items : Upgrade the rest of the services in the Nutch code base.

Weekly Activity :

- Completed upgrading Fetcher.

- Commits : Updating /Fetcher/

Next Week’s agenda : Upgrade the rest of the services in the Nutch code base.

Mentor’s comments :


Week 7: Coding Phase I (12th June – 18th June)

Previous Action Items : Upgrade the rest of the services in the Nutch code base.

Weekly Activity :

- Completed updating tools.

- Removed all wildcard imports of mapreduce classes from the Nutch code base.

- Updated the NutchJob class to use Hadoop's Job.getInstance().

- Commits : Updating tools/ , Removing wildcard imports from mapreduce imports , Updating NutchJob to use getInstance()

Next Week’s agenda : Remove all the errors introduced by the upgrade and start building the graph generator tool.

Mentor’s comments :


Week 8: Coding Phase I (19th June – 25th June)

Previous Action Items : Removing errors in the runtime build.

Weekly Activity :

- Started working on errors occurring during the runtime build.

- Added exception handling for running Hadoop jobs.

- Commits : Addressing the errors and Adding exception handling.

Next Week’s agenda : Remove all the errors in the runtime build and start updating tests and plugins.

Mentor’s comments :


Week 9: Coding Phase II (26th June – 2nd July)

Previous Action Items : Updating tests and plugins.

Weekly Activity :

- Updated tests and plugins to use mapreduce.

- Commits : Updating plugins and test files.

Next Week’s agenda : Remove all the errors occurring during the test build.

Mentor’s comments :


Week 10: Coding Phase II (3rd July – 9th July)

Previous Action Items : Remove all the errors occurring during the test build.

Weekly Activity :

- Removed all the errors occurring during the test build.

- Started fixing failing tests.

- Commits : Fixing errors in the tests. , Addressing the errors and Fixing failing tests : Update01

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 11: Coding Phase II (10th July – 16th July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Continued fixing failing tests.

- Fixed CrawlDbFilter.

- Commits : Fixing failing tests : Update02 , Stabilizing TestCrawlDbFilter.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 12: Coding Phase II (17th July – 23rd July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Could not commit anything because I was still working out how to fix the tests.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 13: Coding Phase II (17th July – 23rd July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- The code broke while fixing the tests and because of a merge issue, so I worked on stabilising the runtime build.

- Commits : Stabilizing the runtime build.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 14: Coding Phase II (24th July – 30th July)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Fixed the test cases : TestPluginSystem, TestCrawlDbMerger, TestLinkDbMerger.

- Commits : Fixing the tests TestPluginSystem TestCrawlDbMerger TestLinkDbMerger

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 15: Coding Phase III (1st August – 7th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Fixed the test case : TestIndexerMapReduce.

- Made some changes in SegmentMerger.

- Commits : Fixing the test TestIndexerMapReduce, Making changes to SegmentMerger.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 16: Coding Phase III (8th August – 14th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Could not commit anything as I was stuck on TestGenerator; I reported this to the mentor and got help with the issues.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 17: Coding Phase III (15th August – 21st August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Fixed the test case : TestGenerator.

- Made some changes to Fetcher.

- Commits : Fixing the testcase TestGenerator., Minor code enhancements.

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Week 18: Coding Phase III (22nd August – 28th August)

Previous Action Items : Fix the failing tests.

Weekly Activity :

- Worked on fixing failing tests.

- Fixed the test case : TestFetcher.

- Commits : Fixing TestFetcher

Next Week’s agenda : Stabilising the tests.

Mentor’s comments :


Conclusion

- GSoC may be coming to an end in a couple of days, but this project will continue for as long as it takes to complete it.

- We overestimated the milestones for the project, but made good progress anyway. This upgrade is a good addition to the project.

- The Graph Generator tool will be built on the foundation of this upgrade in the coming days.

- I would like to thank my mentor and the Nutch dev list for their support with the issues I faced.

GoogleSummerOfCode/GraphGeneratorTool/WeeklyReports (last edited 2017-08-27 06:17:02 by OmkarReddy)