How to Contribute to Solr

"Contributing" to an Apache project is about more than just writing code -- it's about doing what you can to make the project better. There are lots of ways to contribute.

Be Involved

Contributors should join the Solr mailing lists. In particular:

Please keep discussions about Solr on list so that everyone benefits. Emailing individual committers with questions about specific Solr issues is discouraged. See http://people.apache.org/~hossman/#private_q.

Write/Improve User Documentation

Solr can always use more/better documentation targeted at end users, most of which is in this wiki where anyone can edit it. If you see a gap in the Solr documentation, fill it in. Even if you don't know exactly what to say, ask on the user list and you'll probably get a lot of great responses -- talking informally about how Solr works is something lots of people tend to have time for, but aggregating all of that info into concise cohesive documentation takes a little more work/patience.

If there is a patch in Jira that you think is really great, writing some "user guide" style docs about how it works (or is supposed to work) in the wiki is a great way to help the patch get committed: it helps serve as a road map for what the "goal" of the issue is and what should be possible for users to do once the issue is resolved; it helps people who may not understand the low-level details get excited about the new functionality; and it can eventually evolve into the final documentation once the code is committed. (Just make sure to link to the issue so people who find your wiki page first know it's not included in Solr's main code line yet.)

Contributing Code (Features, Bug Fixes, Tests, etc...)

This section identifies the optimal steps a community member can take to submit changes or additions to the Solr code base. These can be new features, bug fixes, optimizations of existing features, or tests of existing code to prove it works as advertised (and to make it more robust against possible future changes).

Please note that these are the "optimal" steps, and community members who don't have the time or resources to do everything outlined below should not be discouraged from submitting their ideas "as is" per "Yonik's Law of Patches":

A half-baked patch in Jira, with no documentation, no tests
and no backwards compatibility is better than no patch at all.

Just because you may not have the time to write unit tests, or clean up backwards compatibility issues, or add documentation, doesn't mean other people don't. Putting your patch out there allows other people to try it and possibly improve it.

Getting the source code

First of all, you need the Solr source code.

Get the source code on your local drive using SVN. Most development is done on the "trunk":

> svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk

Note that committers have to use https instead of http here, but http is fine for read-only access to the trunk code.
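The http/https distinction above can be captured in a small helper. This is only a sketch: the function name, the "committer" argument, and the solr-trunk target directory are assumptions made for illustration; only the repository path comes from the text above.

```shell
# Sketch: pick the checkout URL scheme for the trunk repository.
# "trunk_url" and its "committer" argument are hypothetical names.
trunk_url() {
  repo_path="svn.apache.org/repos/asf/lucene/dev/trunk"
  if [ "$1" = "committer" ]; then
    echo "https://$repo_path"   # committers need https in order to commit
  else
    echo "http://$repo_path"    # http is enough for read-only access
  fi
}

# Usage (needs network access and an svn client); the second argument to
# "svn checkout" names the local directory to create:
#   svn checkout "$(trunk_url)" solr-trunk
```

Checking out into an explicitly named directory keeps later commands such as "cd solr-trunk" unambiguous.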

Making Changes

Before you start, you should send a message to the Solr developer mailing list (Note: you have to subscribe before you can post), or file a bug in Jira. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient; it may take folks a while to understand your requirements.

Modify the source code and add some (very) nice features using your favorite IDE.

But take care about the following points

Notes for Eclipse and the New Merged Lucene/Solr checkout

Having trouble getting the new Lucene/Solr checkout to work in Eclipse? Do you see some errors having to do with:

org.w3c.dom.Node#getTextContent() not found

in the solr/src/test/TestConfig.java file? This has to do with the Tidy.jar library which includes its own version of the Node API. By removing Tidy.jar from your Eclipse classpath, you can obviate this problem.

Another issue you may see is character encoding, especially if you are including the lucene/contrib/analyzers package. On Eclipse, you can solve this by going to Project > Properties > Resources, and then changing the encoding on that page to UTF-8. Then let Eclipse rebuild the workspace (either automatically if you have that checked, or force a rebuild) and you should be golden!

(thanks to Mark Miller and Erik Hatcher for contributing to this)

Eclipse .classpath and .project files for Lucene/Solr projects are attached to this email from the Solr mailing list.

Generating a patch

A "patch file" is the format that all good contributions come in. It bundles up everything that is being added, removed, or changed in your contribution.

Unit Tests

Please make sure that all unit tests succeed before constructing your patch.

> cd solr-trunk
> ant clean test

After a while, if you see

BUILD SUCCESSFUL

all is ok, but if you see

BUILD FAILED

please read the error messages carefully and check your code. If a test fails, you may want to rerun that single test repeatedly as you debug and sort out any problems, in which case you can run

> ant -Dtestcase=TestXXX test

where "TestXXX" is the name of the particular JUnit test you want to run.
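If you save the Ant output to a file, the pass/fail check described above can be scripted. A minimal sketch, assuming the output was redirected to a log file; the function name and the test.log file name are hypothetical.

```shell
# Sketch: report the outcome recorded in a saved Ant log.
# "ant_outcome" is a hypothetical helper, not part of the Solr build.
ant_outcome() {
  log=$1
  if grep -q '^BUILD SUCCESSFUL' "$log"; then
    echo "ok"
  elif grep -q '^BUILD FAILED' "$log"; then
    echo "failed"
  else
    echo "unknown"   # the build was interrupted or the log is incomplete
  fi
}

# Usage:
#   ant clean test > test.log 2>&1
#   ant_outcome test.log
```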

Creating the patch file

Check to see what files you have modified with:

svn stat

Add any new files with:

svn add src/.../MyNewClass.java

Subversion's "add" command only modifies your local copy, so it does not require commit permissions. By using "svn add", your entire contribution can be included in a single patch file, without needing to submit a separate set of "new" files.

Edit the CHANGES.txt file, adding a description of your change, including the bug number it fixes.

In order to create a patch, just type:

svn diff > SOLR-NNN.patch

This will report all modifications done on Solr sources on your local disk and save them into the SOLR-NNN.patch file. Read the patch file. Make sure it includes ONLY the modifications required to fix a single issue.
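One quick way to review the scope of a patch is to list the files it touches. The sketch below relies on the "Index: <path>" header that "svn diff" writes before each file's changes; the function name is hypothetical.

```shell
# Sketch: print every file path mentioned in an svn-style patch file.
# "patched_files" is a hypothetical helper name.
patched_files() {
  sed -n 's/^Index: //p' "$1"
}

# Usage:
#   svn diff > SOLR-NNN.patch
#   patched_files SOLR-NNN.patch
```

If the list contains files unrelated to the issue, revert those changes and regenerate the patch before uploading it.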

Note the SOLR-NNN.patch file name. Please use this naming pattern when creating patches for uploading to JIRA. Once you create a new JIRA issue, note its name and use that name when naming your patch file. For example, if you are creating a patch for a JIRA issue named SOLR-123, then name your patch file SOLR-123.patch. If you are creating a new version of an existing patch, use the existing patch's file name. JIRA will automatically "gray out" the old patch and clearly mark your newly uploaded patch as the latest.
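The naming rule can be sketched as a small helper that derives the file name from the issue key and rejects anything that does not look like SOLR-<number>. The function name is hypothetical.

```shell
# Sketch: turn a Jira issue key such as SOLR-123 into a patch file name.
# "patch_name" is a hypothetical helper name.
patch_name() {
  key=$1
  number=${key#SOLR-}           # strip the project prefix, if present
  case $key in
    SOLR-*) ;;
    *) echo "not a Solr issue key: $key" >&2; return 1 ;;
  esac
  case $number in
    ''|*[!0-9]*) echo "not a Solr issue key: $key" >&2; return 1 ;;
  esac
  echo "$key.patch"
}

# Usage:
#   svn diff > "$(patch_name SOLR-123)"
```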

Please do not:

Please do:

Contributing your work

Finally, patches should be attached to a bug report in Jira. If you are revising an existing patch, please re-use the exact same name as the previous attachment; Jira will "gray out" the older versions so it's clear which version is the newest.

Please be patient. Committers are busy people too. If no one responds to your patch after a few days, please send a friendly reminder. Please incorporate others' suggestions into your patch if you think they're reasonable. Finally, remember that even a patch that is not committed is useful to the community.

JIRA tips (our issue/bug tracker)

The issue tracker we use is a JIRA instance at https://issues.apache.org/jira/browse/SOLR

Review/Improve Existing Patches

If there's a Jira issue that already has a patch you think is really good, and works well for you -- please add a comment saying so. If there's room for improvement (more tests, better javadocs, etc...) then make the changes and attach it as well. If a lot of people review a patch and give it a thumbs up, that's a good sign for committers when deciding if it's worth spending time on the patch -- and if other people have already put in effort to improve the docs/tests for a patch, that helps even more.

Working With Patches

You can easily download a patch from JIRA and test it by doing the following:

$ cd <your Solr trunk checkout dir>
$ svn up
$ wget <URL of the patch>
$ patch -p0 -i <name of the patch> --dry-run

(note: --dry-run just pretends to apply a patch, so you can see if it would succeed or fail. Remove --dry-run to *really* apply the patch)

The address for the patch can be obtained from the issue page, under the "File Attachments" section of the issue.

For people who like one-liners, the following should work as well:

$ cd <your Solr trunk checkout dir>
$ svn up
$ wget <URL to the patch> -O - | patch -p0 --dry-run

If you are on Solaris, you should replace 'patch' with 'gpatch' to use GNU Patch instead.
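The dry-run-then-apply routine, together with the gpatch substitution, can be wrapped in a pair of functions. This is a sketch, not an official script: the function names and the PATCH_CMD override variable are assumptions, and it presumes a patch implementation that supports --dry-run (as GNU patch does).

```shell
# Sketch: apply a patch only if a dry run succeeds.
# PATCH_CMD is a hypothetical override; function names are hypothetical too.

# Prefer gpatch (GNU patch on Solaris) when installed, else plain patch.
pick_patch_cmd() {
  if command -v gpatch >/dev/null 2>&1; then
    echo gpatch
  else
    echo patch
  fi
}

apply_if_clean() {
  patch_file=$1
  cmd=${PATCH_CMD:-$(pick_patch_cmd)}
  # First pretend to apply the patch; touch the working copy only on success.
  if "$cmd" -p0 --dry-run -i "$patch_file" >/dev/null; then
    "$cmd" -p0 -i "$patch_file"
  else
    echo "dry run failed; working copy left untouched" >&2
    return 1
  fi
}

# Usage, from the root of your checkout:
#   apply_if_clean SOLR-123.patch
```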

Reverting to pre-patch state is one line:

svn revert -R .

Though this leaves added files, which can be removed with

svn st | grep '^?' | awk '{print $2}' | xargs rm
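The awk/xargs pipeline splits on whitespace, so it mishandles paths that contain spaces. A more careful sketch, assuming the usual "svn status" layout in which unversioned entries start with "?" followed by padding and then the path; the function name is hypothetical.

```shell
# Sketch: read "svn status" output on stdin and print only the paths of
# unversioned entries, keeping any spaces inside the path intact.
# "unversioned_paths" is a hypothetical helper name.
unversioned_paths() {
  sed -n 's/^?[[:space:]]\{1,\}//p'
}

# Usage (review the list before deleting anything):
#   svn st | unversioned_paths
#   svn st | unversioned_paths | while IFS= read -r p; do rm -r -- "$p"; done
```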

Another useful trick is to have multiple checkouts of trunk and "bounce" an active changeset from one to another with

svn diff | (cd ../otherbranch; patch -p0)

Helpful Resources

The following resources may prove helpful when developing Solr contributions. (These are not an endorsement of any specific development tools.)

Solr 1.3: If you are using Eclipse to follow trunk (leading up to the 1.3 release), Eclipse will give several errors about not resolving components in the solrj library. These appear in the org.apache.solr.handler.component package and relate to distributed search (sharedrequest.java ...etc). The solution is to compile the solrj library via the dist-solrj target and add the resulting jars to your Eclipse build path. After running the dist-solrj target, look in dist/solrj-lib and add apache-solr-solrj-1.3-dev.jar and commons-httpclient-3.1.jar to your build path.

Development Environment Tips

There was a recent thread concerning trying to set up Lucene and SOLR in Eclipse. Here is a guide for setting up Eclipse and IntelliJ dev environments:

Follow the instructions above to fetch the combined Lucene and SOLR trunk. For the remainder of this document, the installation path is assumed to be ~/apache/trunk/lucene and ~/apache/trunk/solr. NOTE: I'm installing on a Macbook, so this is a *nix style file system etc. These instructions should work for Windows as well, but if you try to use them in that environment, feel free to update this page with anything you uncover.

Before fiddling with the IDE, I'd strongly recommend you get the tests to run from the shell. This will ensure that your machine has the proper setup for the IDEs to magically find what's necessary. See the instructions above. Hint: Issue 'ant clean test' in the SOLR and Lucene directories and look for "BUILD SUCCESSFUL" minutes later.

Setting things up is actually very smooth when it's smooth, especially if the tests have run <G>.

Eclipse (Galileo, J2EE version 1.2.2.20100217-2310, but any relatively recent Eclipse should do):

This is easy since Paolo Castagna did the hard work and then posted two files you'll need, .classpath and .project for both Lucene and SOLR. Note: These are *not* currently checked in to SVN, they are attached to this page. You might find yourself asking the question "where did these files unzip to?" Since they start with a dot (.), the OS X Finder doesn't show them by default. You can do an "ls -a" in a terminal window and they'll show up. Something similar may occur in Windows. Here are Paolo's files:

Put the respective .classpath and .project files in ~/apache/trunk/lucene and ~/apache/trunk/solr. Now fire up Eclipse and just select "File>>New>>Java Project". Click "Create project from existing sources". Browse to ~/apache/trunk/lucene (this should be what's in the "directory" textbox). Now just click through to "finish", accepting the defaults. Eclipse will chew away on this for a while.
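Because dot files are easy to lose track of, it can help to confirm from a terminal that both Eclipse files landed in each directory before starting the IDE. A small sketch; the function name is hypothetical and the paths in the usage lines are the ones assumed throughout this page.

```shell
# Sketch: print the Eclipse metadata files that are missing from a directory.
# No output means both .classpath and .project are present.
# "missing_eclipse_files" is a hypothetical helper name.
missing_eclipse_files() {
  dir=$1
  for f in .classpath .project; do
    [ -f "$dir/$f" ] || echo "$dir/$f"
  done
}

# Usage:
#   missing_eclipse_files ~/apache/trunk/lucene
#   missing_eclipse_files ~/apache/trunk/solr
```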

Now you should be able to navigate into your project. Ctrl-click (or right click) on one of the test cases, select "Run as>>JUnit test" and things should "just work". If not, let's chat and update this page.

Do the same thing to create a new Java project for ~/apache/trunk/solr.

DO NOT BE SURPRISED IF SOME TESTS FAIL IN THE IDE. There are some anomalies when running Junit tests for these projects in an IDE. Some of them are already cleaned up, but others may still fail when run in an IDE. The definitive case for whether a test fails or not is running it as an Ant task.

Enabling Assertions for unit tests in Eclipse

By default, Eclipse does not run tests with assertions enabled. This causes some tests to run incorrectly in Eclipse and excludes some checks in the source code.

Change this by checking the box "Add '-ea' to VM arguments when creating a new JUnit launch configuration".

This checkbox is available under Window > Preferences > Java > JUnit.

Installing the code style file

Lucene and SOLR share common code style preferences. Install the code style file in your Eclipse and set it as the default for the project. Do this by:

You should now be able to click the "import" button, and import the codestyle file you downloaded. Eclipse doesn't immediately show that the selected import is the new code style, but closing the dialog boxes and coming back to the formatter page should allow you to choose it.

Tips:

Running SOLR in Eclipse: See: http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-eclipse.

IntelliJ (Maia-IU-95.24)

Cheap, easy way to get basic functionality running

In a word, cheat. Follow the instructions above to get the source code AND copy the Eclipse .project and .classpath files to the Lucene and SOLR directories. Yes, the Eclipse ones. But you do NOT need to open Eclipse or make any Eclipse projects, just have the .project and .classpath files in the right place.

Fire up IntelliJ and click "create new project>>import project from external model (Eclipse)" and click some "next" buttons until the new projects dialog appears. In the "Select Eclipse directory" textbox, navigate to ~/apache/trunk/lucene. I just let the defaults stay: "Create module near .classpath files", and Project File Format is ".idea (directory based)". Now click through until you get to a "finish" button and wait a bit.

You should now be able to run tests by finding a test file, ctrl-clicking (right-clicking) on it and choosing "run". IntelliJ seems to figure out that it's a JUnit test.

Note: Out of the box, testPorterStemFilter fails with the error "cannot find porterTestData.zip". Apparently, the test that calls for this file is looking for org/apache/lucene/analysis/porterTestData.zip and can't find it. You can solve this (and, I assume, related issues) by specifying a "library" and clicking "attach jar directories", then pointing it at ~/apache/trunk/lucene/src/test. Now make your module rely on the new library.

Then I did the same thing for SOLR, *except* I created a new "module" rather than "project". The first JUnit test I ran worked perfectly.

IntelliJ seems to have Subversion support built-in, no real need to install a plugin.

I haven't tried to set up SOLR to run from within IntelliJ yet, if anyone has feel free to add the instructions.

More flexible setup without using Eclipse .project files

This turns out not to be too bad, but tedious. Cheating by using the Eclipse files kinda works, but then you're stuck with all of the contrib packages, etc. Here's a general outline. I'll expand it if the dev list (which I (Erick Erickson) monitor) shows that there's enough interest...

Instead of an "all at once" setup, I created a series of modules, for instance "lucene", "lucene_test" etc. You can group them any way you want (IntelliJ has a "Move Module to Group" option). So my structure looks like this:

 Lucene
 Lucene_test
 contrib (group)
   highlight
   memory
   queryparser

After you've gotten the source code, you can set up various modules and create groups, moving things around by the context-menu choices. By and large, the trick is to then make each module dependent on other modules as needed, as well as on the main Lucene library in ~/apache/trunk/lucene/lib that contains the jars for junit, ant, etc.

So, first "create new module from existing source" for, say, Lucene. In this setup, I created it from the source at ~/apache/trunk/lucene/src/java. This gives the starter kit. Then I created the "Lucene_Test" module, but this time there is no option to "create from source"; instead, choose "create module from scratch" and then navigate to the source at ~/apache/trunk/lucene/src/test. Set up a library to point to ~/apache/trunk/lucene/lib. Make both the modules depend on this library, and additionally make the Lucene_Test module depend on the Lucene module (see the "module settings>>dependencies" tab). You should now be able to get the test cases to run. See the note above about how to find resources, e.g. "porterTestData.zip". I haven't run all the tests from inside IntelliJ yet, but this should start things rolling.

That's about it for Lucene itself, but for the contrib modules it's almost as straight-forward. Go through the same "create module from scratch" and optionally move the new module to a group or sub-group. You can create a new module from the source at, say, ~/apache/trunk/lucene/contrib/queryparser. Resolve as above by making this module dependent upon other modules (in this case, Lucene and Lucene_test), and the "project library" at ~/apache/trunk/lucene/lib (note, you should have created this when you initially set up the original Lucene and Lucene_test modules).

NOTE: some contrib modules depend upon other contrib modules, so you may have to create several contrib modules to resolve all of these. Highlighter is a case in point.

Tests should now run for the contrib module. One quirk I'm seeing is that bringing up the context menu for a contrib test directory wants to run the Lucene_Test tests. But going one level down and bringing up the context menu for, say, org.apache.lucene.queryParser runs the tests in that module.

As usual, the first time one does all this it's mysterious and often requires several false starts, but pretty soon it takes "just a few clicks of the mouse".

Installing code style

To install the Lucene/SOLR codestyle files, get the IntelliJ codestyle file from this site and put it in the magic place so IntelliJ can find it. On my Mac that is ~/Library/Preferences/IntelliJ90/codestyles. Then restart IntelliJ.

Now click on the "Settings" icon (the little icon in the toolbar) and click "codestyle". You should see the new code style configuration in the select box. NOTE: the name in the select box is the name from the <code_scheme....> tag in the xml file. It is NOT the name you put on the file, which can be a bit confusing.
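To find out which name will show up in the select box, you can pull it out of the XML file from a terminal. This sketch assumes the name is stored in a name="..." attribute on the <code_scheme> tag; the function name and the file name in the usage line are hypothetical.

```shell
# Sketch: print the scheme name declared in an IntelliJ code style file.
# Assumes a tag of the form <code_scheme name="...">.
# "scheme_name" is a hypothetical helper name.
scheme_name() {
  sed -n 's/.*<code_scheme[^>]*[[:space:]]name="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Usage (file name is hypothetical):
#   scheme_name ~/Library/Preferences/IntelliJ90/codestyles/my-codestyle.xml
```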

IntelliJ also allows you to create patches very easily...

One final note

As always, there are gremlins out there. This guide works for me on my machine for both Eclipse and IntelliJ. However, this isn't the first project I've put in either of those environments here. Your machine with your history may have different results. If there are steps you have to take, please either let me know or update this page directly so others can benefit.