===Problem (exception) / Solution pairs===


P: Not a known field name:DEFAULT

S: Add plugin

in nutch-default.xml


P: java.lang.NullPointerException at java.io.Reader.<init>(Reader.java:61) ... at org.apache.nutch.analysis.CommonGrams.init(CommonGrams.java:152) ...

S: The file common-terms.utf8 needs to be in the right directory, one that is on the classpath (lib | classes?)


P: Bad mapred.job.tracker: local

S: If you want to run the crawl without HDFS, you can skip start-all.sh.
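
A minimal sketch of what a local (no HDFS) setup might look like. The file it goes in (e.g. hadoop-site.xml) and the fs.default.name value are assumptions and are not taken from this page; check the defaults shipped with your Hadoop version.

```xml
<configuration>
  <!-- Assumption: "local" runs the jobs in-process, so no JobTracker daemon is started -->
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
  <!-- Assumption: use the local file system instead of HDFS -->
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```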


P: ... getlocalpath NullPointerException

S: Check mapred.local.dir and the other tmp dirs in nutch-default.xml / hadoop-default.xml; they must point to directories that exist and are writable.
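
As a rough sketch, an override could look like this. The paths are hypothetical placeholders, and putting the override in a site file rather than editing the default file is an assumption.

```xml
<configuration>
  <!-- Hypothetical path: base directory for Hadoop's temporary files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-nutch</value>
  </property>
  <!-- Hypothetical path: local scratch space for MapReduce intermediate data -->
  <property>
    <name>mapred.local.dir</name>
    <value>/tmp/hadoop-nutch/mapred/local</value>
  </property>
</configuration>
```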


P: extension point: org.apache.nutch.net.URLNormalizer does not exist

S: Check your plugins and the plugin.includes setting, and add urlnormalizer-regex or urlnormalizer-(pass|regex|basic)


P: java.net.UnknownHostException "hostname"

S: Add the line 127.0.0.1 "hostname" to the /etc/hosts file, using the host name from the exception.


P: ...[null] MalformedURLException

S: Add common-terms.utf8 to the Nutch directory.


P: java.lang.ClassCastException: org.apache.hadoop.io.Text

S: Wrong Hadoop version. Use the matching version or the patch at http://files.pannous.de/org.rar


P: java.lang.NoSuchMethodError: org.apache.hadoop.io.MapFile$Writer.

S: Wrong Hadoop version. Use the matching version or the patch at http://files.pannous.de/org.rar


P: NullPointerException when crawling :

S: add to nutch-site.xml: <property>

</property>


P: java.io.IOException: config()

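S: Ignore it! ;) Hadoop's Configuration constructor writes this stack trace to the debug log only to show where the object was created; it is not a real error.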


P: nutch crawl ... Job Failed!

S: The causes are manifold. Raise the debug level in log4j.properties to see the real error:

log4j.rootLogger=ALL, stdout

log4j.appender.stdout=org.apache.log4j.ConsoleAppender


P: No scoring plugins - at least one scoring plugin is required!

S: Add "scoring-opic" to the value of the <name>plugin.includes</name> property.
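
A sketch of such a property, assuming it is overridden in nutch-site.xml. Only scoring-opic and the urlnormalizer entries come from this page; the other plugin names are illustrative assumptions, so start from the value your nutch-default.xml already has.

```xml
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Entries are separated by "|". Everything except scoring-opic and
         urlnormalizer-(pass|regex|basic) is an assumed, illustrative list. -->
    <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```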


P: ... java.net.SocketTimeoutException: Accept timed out

S: Try using Nutch without HDFS / check the ports in the Hadoop config file / RPC problems: start the crawl without start-all.sh?


P: java.lang.NoClassDefFoundError xyz on Windows

S: Get rid of spaces in your classpath and path variables!


solved_problems (last edited 2009-09-20 23:09:34 by localhost)