Hadoop Java Versions

Hadoop requires Java 1.6+. It is built and tested on Oracle (nee Sun) Java, which is the only "supported" JVM.

If you have a problem with Hadoop and you are running anything other than the official Sun/Oracle JDK, you are likely to be asked to try running Hadoop on the Oracle JDK/JVM to see if the problem goes away. That said, patches that help Hadoop run on other JVMs, and that do not affect the stability or performance of Hadoop on the Oracle JVM, are encouraged.

Sun JDK

Hadoop is built and tested on Oracle JDKs. Here are the JDKs that are known to be in use or to have been tested, and their status:

1 - Hadoop works well with update 16; however, there is a bug in JDK versions before update 19 that has been seen on HBase. See HBASE-4367 for details.

2 - If the grid is running in secure mode with MIT Kerberos 1.8 and higher, the Java version should be 1.6.0_27 or higher in order to avoid Java bug 6979329.
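
As a rough illustration of the version requirement in note 2, the sketch below parses a legacy-style version string (such as the value of the java.version system property) and reports whether it is 1.6.0_27 or later. The class and method names are hypothetical, and treating every release after 1.6 as meeting the requirement is an assumption based on the "or higher" wording above.

```java
// Hypothetical helper, not part of Hadoop: checks a version string such as
// "1.6.0_27" against the 1.6.0_27 minimum mentioned in note 2.
class KerberosJvmCheck {

    // Returns true if the version is 1.6.0_27 or later.
    // Assumption: any release newer than 1.6 (e.g. "1.7.0_05") also qualifies.
    static boolean isAtLeast16Update27(String javaVersion) {
        // "1.6.0_27-b07" splits into ["1", "6", "0", "27", "b07"].
        String[] parts = javaVersion.split("[._-]");
        int major = leadingInt(parts[0]);
        if (major > 1) {
            return true;   // newer numbering scheme, e.g. "11.0.2"
        }
        if (major < 1 || parts.length < 2) {
            return false;  // unparseable input
        }
        int minor = leadingInt(parts[1]);
        if (minor != 6) {
            return minor > 6;
        }
        // A missing or non-numeric update field (e.g. "1.6.0") counts as update 0 or lower.
        int update = parts.length > 3 ? leadingInt(parts[3]) : 0;
        return update >= 27;
    }

    // Parses the leading run of digits in s, or returns -1 if there is none.
    static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end == 0 ? -1 : Integer.parseInt(s.substring(0, end));
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version + " meets the 1.6.0_27 minimum: "
                + isAtLeast16Update27(version));
    }
}
```

Running the class on each node of a secure cluster gives a quick indication of whether that node's JVM predates the fix.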

The Sun JVM has 32-bit and 64-bit modes. In a large cluster the NameNode and JobTracker need to run in 64-bit mode to keep all their data structures in memory. The workers can be set up for either 32-bit or 64-bit operation, depending upon preferences and how much memory the individual tasks need.
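
To confirm which mode a given daemon's JVM is actually running in, one option is to read the sun.arch.data.model system property, which Sun/Oracle and OpenJDK HotSpot JVMs set to "32" or "64". The sketch below is a minimal example with a hypothetical class name; other JVMs may not define the property, in which case it reports "unknown".

```java
// Hypothetical helper, not part of Hadoop: reports whether the current JVM
// is running in 32-bit or 64-bit mode.
class JvmDataModel {

    // Maps the value of the sun.arch.data.model property to a readable label.
    // The property may be null or "unknown" on JVMs that do not set it.
    static String describe(String dataModel) {
        if ("64".equals(dataModel)) {
            return "64-bit";
        }
        if ("32".equals(dataModel)) {
            return "32-bit";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model");
        System.out.println("JVM data model: " + describe(dataModel));
    }
}
```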

Using the Compressed Object References JVM feature (-XX:+UseCompressedOops) reduces memory consumption and increases performance on 64-bit Sun JVMs. This feature was first introduced in 1.6.0_14, but problems have been reported with its use on versions prior to 1.6.0_20. Several users have reported success using it on 1.6.0_21 and above. It is the default in 1.6.0_24 and above on 64-bit JVMs.
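
Whether compressed object references are in effect for a running JVM can be checked from Java code. The sketch below, with a hypothetical class name, queries the HotSpot diagnostic MXBean (com.sun.management.HotSpotDiagnosticMXBean) for the UseCompressedOops option. It assumes a HotSpot-based JVM; on JVMs that lack the bean or the option (a 32-bit JVM, for example) it reports "unavailable" instead.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Hadoop: reports the current value of the
// UseCompressedOops VM option on a HotSpot JVM.
class CompressedOopsCheck {

    // Returns "true" or "false" as reported by the JVM, or "unavailable"
    // if the diagnostic bean or the option does not exist on this JVM.
    static String compressedOopsSetting() {
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (IOException e) {
            return "unavailable";
        } catch (RuntimeException e) {
            // For example, IllegalArgumentException when the option is unknown.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops: " + compressedOopsSetting());
    }
}
```

Starting the class once with and once without -XX:+UseCompressedOops on the command line shows whether the flag changes anything on a given JVM build.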

Useful tips for discovering and inspecting Sun JVM configuration flags are in the following blog post: inspecting-hotspot-jvm-options

OpenJDK

Hadoop does build and run on OpenJDK (OpenJDK is based on the Sun JDK).

OpenJDK is handy to have on a development system as it has more source for you to step into when debugging something. OpenJDK and Sun JDK mainly differ in (native?) rendering/AWT/Swing code, which is not relevant for any MapReduce Jobs that aren't creating images as part of their work.

Note: OpenJDK6 has some open bugs in its handling of generics (https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/611284, https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/716959), so it cannot be used to compile Hadoop MapReduce code in branch-0.23 and beyond. Please use another JDK for that.

Oracle JRockit

Oracle's JRockit JVM is not the same as the Sun JVM: it has very different heap and memory management behavior. Hadoop has been used on JRockit, though not at "production" scale. Issues reported so far:

  1. Problems spawning jobs

  2. One of the tests doesn't like JRockit

  3. Log4J configuration issues

IBM JDK

Hadoop 0.20.2 has been tested comprehensively and works with IBM Java 6 SR 8. IBM Java can be downloaded here.

Hadoop 0.20.20x, 0.21.0, and trunk use a few Sun-specific APIs which IBM Java does not provide. HADOOP-6941 and HADOOP-7211 have been filed to support non-Sun/Oracle Java.

A request for help from JVM/JDK developers

We would strongly encourage anyone who produces a JVM/JDK to test compiling and running Hadoop with it. It makes for a fantastic performance and stress test. As Hadoop is becoming a key back-end datacenter application, good Hadoop support matters.

HadoopJavaVersions (last edited 2012-04-26 21:57:44 by OwenOMalley)