Working with Hadoop under Eclipse
Here are instructions for setting up a development environment for Hadoop under the Eclipse IDE. Please feel free to make additions or modifications to this page.
This document (currently) assumes you already have Eclipse downloaded, installed, and configured to your liking.
Screencast from Cloudera: Step-by-step walk-through complete with techno background music.
Download and install the Subversive plug-in
Subversive helps you manage an SVN checkout in Eclipse. It's not strictly necessary, but the integration is handy.
The easiest way to download and install it is to use Eclipse's Update Manager. That process is well described on the Subversive site. Remember to add both update sites: "Subversive plug-in" and "Subversive SVN Connectors plug-in".
Specifically, you'll want to add the following "update sites":
http://www.polarion.org/projects/subversive/download/eclipse/2.0/update-site/ -- "SVN Connectors Site"
http://download.eclipse.org/technology/subversive/0.7/update-site/ -- "Subversive Site"
You'll need to install:
- Subversive SVN Connectors
- Subversive SVN Team Provider (Incubation)
- SVNKit 1.1.7 Implementation (Optional) -- You have a choice of versions here. Use 1.1.7 if your svn is 1.4; use 1.2.2 if your svn is 1.5.
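The version choice in the last item can be written down as a small lookup. This is only a sketch: the function name and the handling of versions not listed above are assumptions made for illustration, not part of Subversive.

```python
import re

# Mapping taken from the list above: svn 1.4 -> SVNKit 1.1.7, svn 1.5 -> SVNKit 1.2.2.
SVNKIT_BY_SVN = {(1, 4): "1.1.7", (1, 5): "1.2.2"}


def svnkit_for_svn(svn_version):
    """Return the SVNKit version for an svn version string, or None if not covered."""
    match = re.match(r"\s*(\d+)\.(\d+)", svn_version)
    if match is None:
        return None
    major_minor = (int(match.group(1)), int(match.group(2)))
    # Versions other than 1.4 and 1.5 are not covered by this page, so return None.
    return SVNKIT_BY_SVN.get(major_minor)
```

The version string can be obtained by running `svn --version --quiet` on the command line and passing its output to `svnkit_for_svn`.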
Associate the Hadoop Trunk Repository
Select File > New > Other...
Then SVN > Repository Location wizard
- Based on needs, use one of the following as the Root URL.
- I set a custom label of "Hadoop".
- The repository will show up under the "SVN Repositories" perspective (select "Open Perspective").
Create a Project
From the SVN Repositories perspective:
- Turn off "Project...Build Automatically"; it slows things down for this step.
Right-click Hadoop > "Trunk" and select "Find/Check Out As..."
- Check out as a project configured using the New Project Wizard
- Java Project
- Project Name: "Hadoop"
Be sure to change the "Default output folder" (on the second page of the "New Java Project" wizard) to PROJECT_NAME/build/eclipse-classes. If you use the default (PROJECT_NAME/bin), Eclipse has a tendency to blow away the handy scripts in that directory.
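If you are unsure whether an existing checkout is exposed to this problem, a quick check is to list everything in the candidate output folder that is not a compiled class. The sketch below assumes that any non-.class file found there is something you want to keep; the function name is made up for this example.

```python
import os


def files_at_risk(project_dir, output_folder="bin"):
    """List non-.class files under the output folder, relative to project_dir."""
    root = os.path.join(project_dir, output_folder)
    at_risk = []
    # os.walk yields nothing if the folder does not exist, so the result is then empty.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".class"):
                full = os.path.join(dirpath, name)
                at_risk.append(os.path.relpath(full, project_dir))
    return sorted(at_risk)
```

An empty result for the folder you pick as "Default output folder" suggests there is nothing there for Eclipse to remove.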
Using Subversive with already checked-out projects
Refer to the Subversive FAQ.
Note: Using Subversive is optional. You can point Eclipse to an existing source checkout by selecting "Create project from existing source" in the "New Java Project" wizard. Set up the project the same way you would if you were doing a fresh checkout (see above).
Configuring Eclipse to build Hadoop
As of 28 March 2008 there is an Ant target for generating the requisite Eclipse files (see HADOOP-1228). Follow these instructions to configure Eclipse:
Set up an ANT_HOME Classpath Variable in Eclipse Preferences. This is a global Eclipse setting so you only need to do this once.
- Check out Hadoop.
Run the eclipse-files and compile-core-test ant targets (right click build.xml, choose Run As > Ant Build..., click "sort targets", and check the eclipse-files and compile-core-test targets). If you can't click the Run button because of an error that says your .class file version is incompatible with 1.6, then you'll need to click the JRE tab and pick a 1.6-compatible JRE.
- Refresh the Eclipse project. (Hit F5 or right-click on the project, and choose "Refresh")
If your default Eclipse JRE is not 1.6, go to Project > Properties > Java Build Path > Libraries, select the JRE System Library, click Edit, and select a 1.6 or later JRE that's installed on your system. You may also want to set the Java Compiler's JDK compliance by going to Project > Properties > Java Compiler, checking "Enable project specific settings", and selecting 6.0 for the Compiler compliance level.
Ensure that the Java version used matches the version of the project (currently 1.6 is used). This could be selected for the project by going to Project > Properties > Builders > Hadoop_Ant_Builders. Go to JRE tab and select an appropriate Java version.
Select Project | Build Project.
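The two Ant targets named in the steps above can presumably also be run outside Eclipse. The following sketch builds the equivalent command line. It assumes that `ant` is on your PATH and that it is run from the top of the checkout; the helper names are invented for this example. A refresh of the Eclipse project is still needed afterwards.

```python
import shutil
import subprocess

# The two targets named in the steps above.
ECLIPSE_TARGETS = ["eclipse-files", "compile-core-test"]


def ant_command(targets, build_file="build.xml"):
    """Build the argument list for running the given Ant targets."""
    return ["ant", "-f", build_file] + list(targets)


def run_targets(targets, build_file="build.xml"):
    """Run the targets with the ant found on PATH and return ant's exit code."""
    if shutil.which("ant") is None:
        raise RuntimeError("ant was not found on PATH")
    return subprocess.call(ant_command(targets, build_file))


# Example (not executed here): run_targets(ECLIPSE_TARGETS)
```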
Behind the scenes
There is nothing magical about the "eclipse-files" target. It simply copies a set of Eclipse project files into your project directory. Of these, the file that holds the Java build path needs tweaking most often. In the UI, you edit it under "Project...Properties...Java Build Path". Similarly, the builder (that invokes Ant) can be configured in the UI under the "Builders" property of the project.
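Eclipse keeps a Java project's build path in an XML file named .classpath at the project root, so it can be inspected outside the UI as well. The sketch below groups the entries of such a file by kind. The sample content is made up for illustration and is not the file that Hadoop generates.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample in the .classpath format; real files have more entries.
SAMPLE = """<classpath>
  <classpathentry kind="src" path="src/java"/>
  <classpathentry kind="src" path="src/test"/>
  <classpathentry kind="var" path="ANT_HOME/lib/ant.jar"/>
  <classpathentry kind="output" path="build/eclipse-classes"/>
</classpath>
"""


def entries_by_kind(classpath_xml):
    """Group the path of every classpathentry by its kind attribute."""
    root = ET.fromstring(classpath_xml)
    grouped = defaultdict(list)
    for entry in root.iter("classpathentry"):
        grouped[entry.get("kind")].append(entry.get("path"))
    return dict(grouped)
```

Calling `entries_by_kind(SAMPLE)` gives one list of paths per kind: "src" for source folders, "var" for classpath variables such as ANT_HOME, and "output" for the default output folder.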
Troubleshooting
Unbound classpath variable: 'ANT_HOME/lib/ant.jar' in project 'hadoop-trunk'
You forgot to set up the ANT_HOME Classpath Variable in Eclipse Preferences. (/usr/share/ant would be a typical setting here.)
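Before entering a value for ANT_HOME, it can help to confirm that the directory really contains lib/ant.jar, since that is the path the error message refers to. A minimal sketch, with function names chosen only for this example:

```python
import os


def ant_jar_path(ant_home):
    """Path that the classpath entry ANT_HOME/lib/ant.jar resolves to."""
    return os.path.join(ant_home, "lib", "ant.jar")


def has_ant_jar(ant_home):
    """True if ant.jar exists under the given ANT_HOME candidate."""
    return os.path.isfile(ant_jar_path(ant_home))
```

For example, `has_ant_jar("/usr/share/ant")` tells you whether the typical setting mentioned above is valid on your machine.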
The following error occurred while executing this line: .../build.xml:30 The following error occurred while executing this line: .../eclipse-plugin/build.xml:61: Compile failed; see the compiler error output for details
The Eclipse plugin is not compatible with Eclipse 3.4. Because the external builder is running ant directly (as opposed to calling out to a process), eclipse.home is set, and the eclipse-plugin/build.xml is activated. If you need to hack around it, either re-configure the external builder to use an external process or modify the line <target name="check-contrib" unless="eclipse.home"> to reference, say, eclipse.home.foo.
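The second workaround is a one-attribute change, which is easy to make by hand. For completeness, here is a sketch of the same edit done programmatically. The sample build file is hypothetical apart from the target line quoted above, and rewriting a real build file this way would drop its XML comments, so treat it as an illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical build file; only the <target> line comes from the text above.
SAMPLE_BUILD = """<project name="sample" default="check-contrib">
  <target name="check-contrib" unless="eclipse.home">
    <echo message="checking contrib"/>
  </target>
</project>
"""


def rename_unless(build_xml, target_name, new_property):
    """Point the 'unless' attribute of the named target at another property."""
    root = ET.fromstring(build_xml)
    changed = 0
    for target in root.iter("target"):
        if target.get("name") == target_name and target.get("unless") is not None:
            target.set("unless", new_property)
            changed += 1
    # Return the rewritten XML and how many targets were changed.
    return ET.tostring(root, encoding="unicode"), changed
```

Calling `rename_unless(SAMPLE_BUILD, "check-contrib", "eclipse.home.foo")` returns the rewritten XML together with the number of targets it touched.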
Manual Settings
If you want to build all of Hadoop in Eclipse, there are some DDL files used by the tests that need to be compiled first. One strategy is to configure Eclipse to call part of the Ant script to build these, and to keep two build directories: one for the Ant script and one for Eclipse. Two directories are needed because the classes built by Ant must be included on the Eclipse library path, and circular references are forbidden.
In Eclipse, select Project -> Properties -> Java Build Path -> Source
Then ensure the following source directories are on the Java build path:
- hadoop/src/examples
- hadoop/src/java
- hadoop/src/test
Then if you want to use the contrib directories as well:
- hadoop/src/contrib/test
- hadoop/src/contrib/abacus/examples
- hadoop/src/contrib/abacus/src/java
- hadoop/src/contrib/data_join/src/join
- hadoop/src/contrib/hbase/src/java
- hadoop/src/contrib/hbase/src/test
- hadoop/src/contrib/streaming/src/java
- hadoop/src/contrib/streaming/src/test
and set the output folder to
hadoop/eclipse-build
Then select Project -> Properties -> Java Build Path -> Libraries
Add all the libraries (.jar files) in
- hadoop/lib
- hadoop/lib/jetty-ext
If you are using contrib also add all the libraries (.jar files) in
hadoop/src/contrib/hbase/lib
Then add the classes in
hadoop/build/test/classes
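The Java Build Path settings above end up as entries in the project's .classpath file. As a rough sketch of what the core (non-contrib) settings correspond to, the code below generates such a file. It assumes that "hadoop" in the paths above is the project directory, so paths are written relative to it. The jar list is left to the caller because the actual jar names are not given here.

```python
import xml.etree.ElementTree as ET

# Paths from the steps above, relative to the assumed project directory "hadoop".
SOURCE_DIRS = ["src/examples", "src/java", "src/test"]
CLASS_FOLDERS = ["build/test/classes"]
OUTPUT_DIR = "eclipse-build"


def build_classpath(source_dirs, jars, class_folders, output_dir):
    """Return .classpath XML for the given source dirs, libraries and output dir."""
    root = ET.Element("classpath")
    for path in source_dirs:
        ET.SubElement(root, "classpathentry", kind="src", path=path)
    # Jar files and class folders are both recorded as kind="lib" entries.
    for path in list(jars) + list(class_folders):
        ET.SubElement(root, "classpathentry", kind="lib", path=path)
    # Standard container entry for the JRE System Library.
    ET.SubElement(root, "classpathentry", kind="con",
                  path="org.eclipse.jdt.launching.JRE_CONTAINER")
    ET.SubElement(root, "classpathentry", kind="output", path=output_dir)
    return ET.tostring(root, encoding="unicode")
```

Editing the settings through the UI as described is still the safer route; the generated XML is mainly useful for comparing against what Eclipse wrote.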
Then select Project->Properties->Builders
Add a new Ant builder. Select the top-level build.xml as the build file. Next, select the "Targets" tab and, for the targets to run after a clean, specify
compile-core-classes, compile-core-test
then, for a manual build, specify
compile-core-classes, compile-core-tests, compile
Apply these changes. Hadoop should now build successfully in Eclipse without any errors.