Upgrade From Nutch 0.7 To Nutch 0.8
See the Tutorial for the 0.8 workflow. The main differences from 0.7 are:
- Put your root URLs in a file inside a directory (urls/whatever_name) instead of in a flat file named urls.
- Make sure you set the http.agent.name property (typically in conf/nutch-site.xml).
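As a quick sanity check for the second point, the following is a minimal sketch that looks for a non-empty http.agent.name in a property-style configuration file. It assumes the file consists of property elements with name and value children; the helper names (get_property, agent_name_is_set), the sample agent value, and the default path are hypothetical and not part of Nutch.

```python
import sys
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the stripped value of the named property, or None if absent.

    Assumes <property><name>..</name><value>..</value></property> entries;
    the root element name is deliberately not checked.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def agent_name_is_set(xml_text):
    """True if http.agent.name is present and not empty."""
    return bool(get_property(xml_text, "http.agent.name"))


# Hypothetical sample configuration, used only for illustration.
SAMPLE = """<configuration>
  <property>
    <name>http.agent.name</name>
    <value>my-test-crawler</value>
  </property>
</configuration>"""


if __name__ == "__main__":
    # The default path is an assumption; pass your own file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "conf/nutch-site.xml"
    with open(path, encoding="utf-8") as f:
        ok = agent_name_is_set(f.read())
    print("http.agent.name is set" if ok else "http.agent.name is missing or empty")
```

Running the script against your own configuration file before the first crawl should catch a missing or empty agent name early.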
Unfortunately, data is not portable between these versions. The only way to preserve your webdb is to dump it to a text file and then inject the URLs from that dump into a 0.8 crawldb. Segments cannot be carried over; you will have to refetch them.
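The dump-and-inject step can be sketched roughly as follows. This is only an illustration: it assumes the 0.7 text dump contains one line per page of the form "URL: <address>", which you should verify against your own dump, and the file names and function names are hypothetical. The script collects the unique URLs and writes them to a seed file that can then be placed in the URL directory given to the 0.8 inject command.

```python
import sys


def extract_urls(dump_lines, prefix="URL: "):
    """Collect unique URLs from dump lines, keeping first-seen order.

    Assumes each page record has a line starting with `prefix`;
    adjust the prefix if your dump is formatted differently.
    """
    seen = set()
    urls = []
    for line in dump_lines:
        line = line.strip()
        if line.startswith(prefix):
            url = line[len(prefix):].strip()
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


def write_seed_file(urls, path):
    """Write one URL per line to a plain-text seed file."""
    with open(path, "w", encoding="utf-8") as f:
        for url in urls:
            f.write(url + "\n")


if __name__ == "__main__":
    # Hypothetical usage: python dump_to_seeds.py webdb_dump.txt urls/from_webdb
    dump_path, seed_path = sys.argv[1], sys.argv[2]
    with open(dump_path, encoding="utf-8", errors="replace") as f:
        found = extract_urls(f)
    write_seed_file(found, seed_path)
    print(f"wrote {len(found)} URLs to {seed_path}")
```

Only the URLs survive this route; fetch times, scores, and link data stored in the old webdb are not transferred.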