Some problems encountered in Hadoop and ways to go about solving them. See also NameNodeFailover and ConnectionRefused.
The NameNode fails to start, logging an exception like:

    ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
        at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
        at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:433)
        at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:759)
        at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:639)
        at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:222)
        at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
        at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:254)
        at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:235)
        at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:131)
        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:176)
        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:162)
        at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:846)
        at org.apache.hadoop.dfs.NameNode.main(NameNode.java:855)
This is sometimes encountered when the edits file in the transaction log is corrupt. Try using a hex editor or equivalent to open up the edits file and remove the last record; typically the last record is incomplete, and that is what stops the NameNode from starting. Once you have updated the edits file, start the NameNode and run

    hadoop fsck /

to see whether you have any corrupt files, and fix or get rid of them.
Take a backup of dfs.name.dir before updating the edits file and playing around with it.
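A minimal sketch of the recovery steps as shell commands. The /var/hadoop/dfs/name path is a placeholder for whatever dfs.name.dir points at on your cluster; the edits file typically lives under its current/ subdirectory, and the byte offset at which to truncate depends on what the hex dump shows.

    # Stop the NameNode first, then back up the whole metadata directory.
    cp -r /var/hadoop/dfs/name /var/hadoop/dfs/name.bak

    # Inspect the tail of the edits log to find the incomplete last record.
    xxd /var/hadoop/dfs/name/current/edits | tail -32

    # Remove the bad record with a hex editor, or with truncate(1) once you
    # know the offset where the last complete record ends, e.g.:
    #   truncate -s <offset> /var/hadoop/dfs/name/current/edits

    # Restart and check filesystem health.
    hadoop-daemon.sh start namenode
    hadoop fsck /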
If a client cannot talk to the filesystem, there are a number of possible causes.
Your logs contain something like
    INFO hdfs.DFSClient: Could not obtain block blk_-4157273618194597760_1160 from any node: java.io.IOException: No live nodes contain current block
This means no live DataNode holds a copy of the block of the file you are looking for. Bring up any nodes that are down, or skip that block.
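A few commands that help diagnose this. Run plain fsck before anything else, since the -move and -delete options below are destructive:

    # List live and dead DataNodes as the NameNode sees them.
    hadoop dfsadmin -report

    # Show which files have missing or under-replicated blocks, and where
    # the remaining replicas live.
    hadoop fsck / -files -blocks -locations

    # If the data is truly gone, either move the damaged files to /lost+found
    # or delete them outright.
    hadoop fsck / -move
    hadoop fsck / -delete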
The problem can also be a DNS issue. Two problems which have been encountered in practice are:

- Machines with multiple network interfaces: set dfs.datanode.dns.interface (in hdfs-site.xml) and mapred.datanode.dns.interface (in mapred-site.xml) to the name of the interface Hadoop should use, e.g. eth0 under Linux; a config sketch follows this list.
- Badly formatted or inconsistent host-resolution files (/etc/hosts and /etc/resolv.conf under Linux).
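For the multiple-interface case, a sketch of the hdfs-site.xml entry, assuming eth0 is the interface the cluster traffic should use; mapred.datanode.dns.interface takes the same form in mapred-site.xml:

    <!-- hdfs-site.xml: resolve the DataNode's hostname via one specific interface -->
    <property>
      <name>dfs.datanode.dns.interface</name>
      <value>eth0</value>
    </property>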
Error message "Could only be replicated to 0 nodes, instead of 1" (or any similar number, such as "2 nodes instead of 3"): see CouldOnlyBeReplicatedTo.
If clients report that a server is "not available yet", see ServerNotAvailable.
For "Too many open files" errors, see TooManyOpenFiles.
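The usual remedy is raising the file-descriptor limit for the user running the Hadoop daemons; a sketch for Linux, assuming that user is named hadoop (TooManyOpenFiles has the full discussion):

    # Check the current per-process file-descriptor limit.
    ulimit -n

    # Raise it persistently: add entries to /etc/security/limits.conf (as root),
    # then log in again and restart the Hadoop daemons.
    echo 'hadoop soft nofile 16384' >> /etc/security/limits.conf
    echo 'hadoop hard nofile 16384' >> /etc/security/limits.conf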