Unknown Host

You get an Unknown Host error, often wrapped in a Java IOException, when one machine on the network cannot determine the IP address of a host that it is trying to connect to by hostname. This can happen during a file upload (in which case the client machine has the hostname problem), or inside the Hadoop cluster.
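Because the error is often wrapped, the underlying java.net.UnknownHostException may only show up as a cause further down the stack trace. The following is a minimal sketch of walking the cause chain to find it; the class name, the method name and the example hostname are all hypothetical, and none of this is part of Hadoop.

```java
import java.io.IOException;
import java.net.UnknownHostException;

class UnknownHostFinder {
    /** Walks the cause chain and returns the first UnknownHostException, or null if there is none. */
    static UnknownHostException find(Throwable t) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof UnknownHostException) {
                return (UnknownHostException) t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical failure: an IOException wrapping the resolver error.
        IOException wrapped = new IOException("upload failed",
                new UnknownHostException("worker1.example.invalid"));
        UnknownHostException cause = find(wrapped);
        System.out.println(cause == null
                ? "no hostname problem found"
                : "cannot resolve: " + cause.getMessage());
    }
}
```

The message of an UnknownHostException raised by the JVM normally names the host that failed to resolve, which tells you which hostname to investigate.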

Some possible causes (not an exhaustive list):

These are all network configuration/router issues. As it is your network, only you can track down the problem. That said, any tooling to help Hadoop track down such problems in a cluster would be welcome, as would extra diagnostics. If you have to extend Hadoop to track down these issues, submit your patches!

Some tactics to help solve the problem:

  1. Look for configuration problems first (Hadoop XML files, hostnames, host tables), as these are easiest to fix and quite common.
  2. Try to identify which client machine is playing up. If it is out-of-cluster, try the fully qualified domain name (FQDN) instead of the short hostname, and consider that it may not have access to the worker node.
  3. If the client that does not work is one of the machines in the cluster, SSH to that machine and make sure it can resolve the hostname.
  4. As well as nslookup, the dig command is invaluable for tracking down DNS problems, though it does assume you understand DNS records. Now is a good time to learn.

  5. Restart the JVMs to see if that makes it go away.
  6. Restart the servers to see if that makes it go away.
  7. Reboot the network switches.
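The resolution check in step 3 can also be made from inside a JVM, which shows what Java itself sees on that machine rather than what nslookup or dig report. This is a rough sketch using only java.net.InetAddress; the class name HostReport and its output format are invented for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class HostReport {
    /** Returns one line per resolved address, or a single FAIL line if the name is unknown. */
    static List<String> check(String hostname) {
        List<String> lines = new ArrayList<>();
        try {
            for (InetAddress addr : InetAddress.getAllByName(hostname)) {
                // getCanonicalHostName() attempts a reverse lookup and falls back
                // to the textual IP address if that lookup fails.
                lines.add("OK   " + hostname + " -> " + addr.getHostAddress()
                        + " (canonical: " + addr.getCanonicalHostName() + ")");
            }
        } catch (UnknownHostException e) {
            lines.add("FAIL " + hostname + " (" + e.getMessage() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        try {
            // A machine that cannot resolve its own hostname fails here.
            System.out.println("local host: " + InetAddress.getLocalHost());
        } catch (UnknownHostException e) {
            System.out.println("FAIL local hostname does not resolve: " + e.getMessage());
        }
        // Hostnames to test are passed on the command line.
        for (String name : args) {
            check(name).forEach(System.out::println);
        }
    }
}
```

Compile it with javac and run it on the suspect machine, passing the hostnames from the failing stack trace as arguments. A FAIL line on one machine for a name that resolves elsewhere points at that machine's resolver configuration or host table.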

Remember, unless the root cause has been identified, the problem may return.

UnknownHost (last edited 2014-01-03 11:51:21 by SteveLoughran)