Working with Hadoop under Eclipse

Here are instructions for setting up a development environment for Hadoop under the Eclipse IDE. Please feel free to make additions or modifications to this page.

This document (currently) assumes you already have Eclipse downloaded, installed, and configured to your liking.

Screencast from Cloudera: Step-by-step walk-through complete with techno background music.

Download and install the Subversive plug-in

Subversive helps you manage an SVN checkout in Eclipse. It's not strictly necessary, but the integration is handy.

The easiest way to download and install it is through Eclipse's Update Manager. That process is well described on the Subversive site. Remember to add both update sites: "Subversive plug-in" and "Subversive SVN Connectors plug-in".

Specifically, you'll want to add the following "update sites" to

You'll need to install:

Associate the Hadoop Trunk Repository

Create a Project

From the SVN Repositories perspective:

Using Subversive with already checked-out projects

Refer to the Subversive FAQ.

Note: Using Subversive is optional. You can point Eclipse to an existing source checkout by selecting "Create project from existing source" in the "New Java Project" wizard. Set up the project the same way you would for a fresh checkout (see above).

Configuring Eclipse to build Hadoop

As of 28 March 2008 there is an Ant target for generating the requisite Eclipse files (see HADOOP-1228). Follow these instructions to configure Eclipse:

  1. Set up an ANT_HOME Classpath Variable in Eclipse Preferences. This is a global Eclipse setting so you only need to do this once.

  2. Check out Hadoop.
  3. Run the eclipse-files and compile-core-test Ant targets (right-click build.xml, choose Run As > Ant Build..., click "sort targets", and check the eclipse-files and compile-core-test targets). If you can't click the Run button because of an error saying that your .class file version is incompatible with 1.6, click the JRE tab and pick a 1.6-compatible JRE.

  4. Refresh the Eclipse project. (Hit F5 or right-click on the project, and choose "Refresh")
  5. If your default Eclipse JRE is not 1.6, go to Project > Properties > Java Build Path > Libraries, select the JRE System Library, click Edit, and select a 1.6 or later JRE that's installed on your system. You may also want to set the Java compiler's JDK compliance level: go to Project > Properties > Java Compiler, check "Enable project specific settings", and select 6.0 for the Compiler compliance level.

  6. Ensure that the Java version used by the Ant builder matches the version of the project (currently 1.6). To set it for the project, go to Project > Properties > Builders > Hadoop_Ant_Builders, open the JRE tab, and select an appropriate Java version.

  7. Select Project > Build Project.
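Steps 3, 5, and 6 all depend on knowing which Java version a given JRE provides. As a quick check, the following minimal sketch prints the version of the JVM that runs it and reports whether it is at least Java 6 (class file version 50.0). The class name JreVersionCheck is hypothetical and not part of Hadoop; run it with the JRE you intend to select in Eclipse.

```java
// Hypothetical helper, not part of Hadoop: reports whether the JVM running
// this class is at least Java 6.
class JreVersionCheck {
    // Class file version 50.0 is the format introduced with Java 6.
    static final double JAVA_6_CLASS_VERSION = 50.0;

    // Returns true if the given "java.class.version" value is 50.0 or higher.
    static boolean supportsJava6(String classVersion) {
        return Double.parseDouble(classVersion) >= JAVA_6_CLASS_VERSION;
    }

    public static void main(String[] args) {
        String javaVersion = System.getProperty("java.version");
        String classVersion = System.getProperty("java.class.version");
        System.out.println("java.version       = " + javaVersion);
        System.out.println("java.class.version = " + classVersion);
        if (supportsJava6(classVersion)) {
            System.out.println("This JRE can run classes compiled for 1.6.");
        } else {
            System.out.println("This JRE is older than 1.6; select a newer one in Eclipse.");
        }
    }
}
```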

Behind the scenes

There is nothing magical about the "eclipse-files" target. It simply copies the files from .eclipse.templates into your project directory. .classpath is the file that needs tweaking most often. In the UI, you edit it under Project > Properties > Java Build Path. Similarly, the builder (which invokes Ant) can be configured in the UI under the "Builders" property of the project.
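To make the copy step concrete, here is a rough Java sketch of what such a target amounts to: walking a template directory and copying every file to the same relative path in the project directory. This is an illustration only. The real target is defined in Hadoop's build.xml and run by Ant; the class name CopyEclipseTemplates, the choice to overwrite existing files, and the use of the Java 7+ java.nio.file API are all assumptions of this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration, not Hadoop code: mirrors a template directory
// into a project directory, roughly what "eclipse-files" is described as doing.
class CopyEclipseTemplates {

    // Copies every regular file under 'templates' to the same relative path
    // under 'project' and returns the number of files copied.
    static int copyTree(Path templates, Path project) throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(templates)) {
            // Files.walk lists a directory before its contents.
            sources = walk.collect(Collectors.toList());
        }
        int copied = 0;
        for (Path source : sources) {
            Path target = project.resolve(templates.relativize(source));
            if (Files.isDirectory(source)) {
                Files.createDirectories(target);
            } else {
                // Overwriting existing files is an assumption of this sketch.
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // First argument: the Hadoop checkout; defaults to the current directory.
        Path project = Paths.get(args.length > 0 ? args[0] : ".").toAbsolutePath().normalize();
        Path templates = project.resolve(".eclipse.templates");
        if (!Files.isDirectory(templates)) {
            System.err.println("No .eclipse.templates directory under " + project);
            return;
        }
        System.out.println("Copied " + copyTree(templates, project) + " file(s) into " + project);
    }
}
```

Because the result is just a set of ordinary files, you can edit the copied .classpath afterwards, or rerun the target to restore the template versions.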

Troubleshooting

You forgot to set up the ANT_HOME Classpath Variable in Eclipse Preferences. (/usr/share/ant would be a typical setting here.)
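If you are unsure which directory to use for ANT_HOME, one way to sanity-check a candidate is to confirm that it contains lib/ant.jar, as a standard Ant installation does. The sketch below does only that; the class name AntHomeCheck is hypothetical, and it assumes the classpath variable should point at the root of an Ant installation.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop or Eclipse: checks whether a
// directory looks like the root of an Ant installation.
class AntHomeCheck {

    // A standard Ant installation ships its core library as lib/ant.jar.
    static boolean looksLikeAntHome(File dir) {
        return new File(new File(dir, "lib"), "ant.jar").isFile();
    }

    public static void main(String[] args) {
        // /usr/share/ant is the typical location mentioned above;
        // pass a different path as the first argument to check it instead.
        File candidate = new File(args.length > 0 ? args[0] : "/usr/share/ant");
        if (looksLikeAntHome(candidate)) {
            System.out.println(candidate + " looks like a usable ANT_HOME.");
        } else {
            System.out.println(candidate + " does not contain lib/ant.jar.");
        }
    }
}
```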

The Eclipse plugin is not compatible with Eclipse 3.4. Because the external builder is running ant directly (as opposed to calling out to a process), eclipse.home is set, and the eclipse-plugin/build.xml is activated. If you need to hack around it, either re-configure the external builder to use an external process or modify the line <target name="check-contrib" unless="eclipse.home"> to reference, say, eclipse.home.foo.

Manual Settings

If you want to build all of Hadoop in Eclipse, some DDL files used by the tests need to be compiled first. One strategy is to configure Eclipse to call part of the Ant script to build these, and to keep two build directories: one for the Ant script and one for Eclipse. The separation is needed because the classes built by Ant must be on the Eclipse library path, and circular references are forbidden.

In Eclipse, select Project -> Properties -> Java Build Path -> Source

Then ensure the following source directories are on the Java build path:

hadoop/src/examples
hadoop/src/java
hadoop/src/test

Then if you want to use the contrib directories as well:

hadoop/src/contrib/test
hadoop/src/contrib/abacus/examples
hadoop/src/contrib/abacus/src/java
hadoop/src/contrib/data_join/src/join
hadoop/src/contrib/hbase/src/java
hadoop/src/contrib/hbase/src/test
hadoop/src/contrib/streaming/src/java
hadoop/src/contrib/streaming/src/test

and set the output folder to

hadoop/eclipse-build

Then select Project -> Properties -> Java Build Path -> Libraries

Add all the libraries (.jar files) in

hadoop/lib
hadoop/lib/jetty-ext

If you are using contrib also add all the libraries (.jar files) in

hadoop/src/contrib/hbase/lib

Then add the classes in

hadoop/build/test/classes

Then select Project -> Properties -> Builders

Add a new Ant builder. Select the top-level build.xml as the build file. Next, select the "Targets" tab. For "After a Clean", specify

compile-core-classes, compile-core-tests

then for "Manual Build", specify

compile-core-classes, compile-core-tests, compile

Apply these changes. Hadoop should now build in Eclipse without errors.
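If the build still reports missing sources, it may help to confirm that the core source folders listed above actually exist in your checkout, since the layout changes between Hadoop versions. The following sketch checks for them relative to a checkout root. The class name SourceFolderCheck is hypothetical, and the sketch assumes the "hadoop" prefix in the paths above refers to the root of your checkout.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports which of the expected
// source folders are missing from a checkout.
class SourceFolderCheck {

    // Core source folders from the list above, relative to the checkout root.
    static final List<String> CORE_FOLDERS =
            Arrays.asList("src/examples", "src/java", "src/test");

    // Returns the folders that do not exist as directories under checkoutRoot.
    static List<String> missingFolders(File checkoutRoot, List<String> folders) {
        List<String> missing = new ArrayList<String>();
        for (String folder : folders) {
            if (!new File(checkoutRoot, folder).isDirectory()) {
                missing.add(folder);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // First argument: the Hadoop checkout; defaults to the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        List<String> missing = missingFolders(root, CORE_FOLDERS);
        if (missing.isEmpty()) {
            System.out.println("All core source folders are present.");
        } else {
            System.out.println("Missing source folders: " + missing);
        }
    }
}
```

To check the contrib folders as well, pass a longer list to missingFolders.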

EclipseEnvironment (last edited 2009-09-20 23:54:35 by localhost)