How to Contribute to Solr
"Contributing" to an Apache project is about more then just writing code -- it's about doing what you can to make the project better. There are lots of ways to contribute....
- How to Contribute to Solr
- Be Involved
- Be A Mailing List Moderator
- Write/Improve User Documentation
Contributing Code (Features, Bug Fixes, Tests, etc...)
- Getting the source code
- Exporting to a local repository
- Making Changes
- Notes for Eclipse
- Generating a patch
- Contributing your work
- Commit process using Git
- JIRA tips (our issue/bug tracker)
- Review/Improve Existing Patches
- Helpful Resources
- Development Environment Tips
- Getting your feet wet: where to begin?
- Parsers and JFlex
- One final note
Be Involved
Contributors should join the Solr mailing lists. In particular:
- The user list (to help others)
- The commit list (to see changes as they are made)
- The dev list (to join discussions of changes)
Please keep discussions about Solr on list so that everyone benefits. Emailing individual committers with questions about specific Solr issues is discouraged. See http://people.apache.org/~hossman/#private_q.
Be A Mailing List Moderator
Being a list moderator is incredibly easy - the basic responsibilities are:
- Get a copy of any email sent to one of the Lucene lists from an address that is not subscribed and review it to see if it's spam or not
- Occasionally help people who have particular difficulties unsubscribing from the mailing list.
If you'd like to volunteer to be the moderator of a mailing list, just contact listname-owner@lucene... (e.g. solr-user-owner@lucene...)
Write/Improve User Documentation
Solr can always use more/better documentation targeted at end users. The Reference Guide is the official documentation, and only committers can modify it, but anyone can comment on it. There is quite a lot of additional documentation in this wiki, which anyone can edit after creating an account and asking for edit permissions via the solr-user mailing list or the #solr IRC channel on freenode.
If you see a gap or problem in the Reference Guide, comment on the page to bring it to the attention of a committer. If you see a gap or problem in the documentation on this wiki, fill it in. Even if you don't know exactly what to say, ask on the user list and you'll probably get a lot of great responses -- talking informally about how Solr works is something lots of people tend to have time for, but aggregating all of that info into concise cohesive documentation takes a little more work and patience.
If there is a patch in Jira that you think is really great, writing some "user guide" style docs about how it works (or is supposed to work) in this wiki is a great way to help the patch get committed: it serves as a road map for what the "goal" of the issue is and what should be possible for users to do once the issue is resolved; it helps people who may not understand the low level details get excited about the new functionality; and it can eventually evolve into the final documentation once the code is committed. (Just make sure to link to the issue so people who find your wiki page first know it's not included in Solr's main code line yet.)
Contributing Code (Features, Bug Fixes, Tests, etc...)
This section identifies the optimal steps a community member can take to submit changes or additions to the Solr code base. These can be new features, bug fixes, optimizations of existing features, or tests of existing code to prove it works as advertised (and to make it more robust against possible future changes).
Please note that these are the "optimal" steps, and community members that don't have the time or resources to do everything outlined below should not be discouraged from submitting their ideas "as is" per "Yonik's Law of Patches" ...
A half-baked patch in Jira, with no documentation, no tests and no backwards compatibility is better than no patch at all.
Just because you may not have the time to write unit tests, or cleanup backwards compatibility issues, or add documentation, doesn't mean other people don't. Putting your patch out there allows other people to try it and possibly improve it.
Getting the source code
First of all, you need the Solr source code.
Get the source code on your local drive using http://lucene.apache.org/solr/resources.html#solr-version-control.
To check out code from Git
for non-committers:
git clone http://git-wip-us.apache.org/repos/asf/lucene-solr.git
for committers:
git clone https://git-wip-us.apache.org/repos/asf/lucene-solr.git
To work on a particular "code line", run "git checkout <branch>" inside the clone, replacing <branch> with something concrete. Examples, and how to interpret what they mean:
- master: Working towards the eventual 6.0 release. This is the main center for development, it's not really a branch. Releases are never made from master, they are only made from the stable development branch.
- branch_5x: The current stable development branch for the next stable 5.x version.
- releases/lucene-solr/4.10.3: When a new version is fully released, the lucene_solr_x_x branch is copied to a tag which represents the source code for that specific release.
- history/branches/lucene-solr/lucene_solr_4_10: When branch_4x was removed and 4.x moved to maintenance mode, all 4.x development moved to this branch. There is never a guarantee that a point release will ever be made after the x.x.0 version is released.
- history/branches/lucene-solr/lucene_solr_3_6: When branch_3x was removed and 3.x moved to maintenance mode, all 3.x development moved to this branch.
Git SHA can be used to obtain branches and tags from their parents. For example, revision 1394844 on history/branches/lucene_solr_4_0 corresponds to the official 4.0.0 release, so if you add "-r 1394844" to the command line above using branches/lucene_solr_4_0 for <branch>, you will get the code corresponding to 4.0.0. You can also just do a checkout of releases/lucene-solr/4.0.0 to get the same code.
Most development is done on "master" and then backported to the other active branches. At some point in the future, a branch_6x will be created and master will be updated to reflect a 7.0 version.
Note that committers have to use https instead of http here, but http is fine for read-only access to the code.
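The clone-then-pick-a-code-line steps above can be sketched as a small shell function. This is only a minimal sketch, not an official project script: the function name fetch_code_line and its argument order are made up for illustration, and it assumes git is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: clone a repository and switch to one code line.
#   $1 = repository URL (for example the read-only http URL above)
#   $2 = branch name, e.g. master or branch_5x
#   $3 = target directory
fetch_code_line() {
  repo_url=$1
  code_line=$2
  target_dir=$3
  git clone --quiet "$repo_url" "$target_dir" || return 1
  # Check out the requested branch; git creates a local tracking branch
  # automatically when a remote branch of that name exists.
  git -C "$target_dir" checkout --quiet "$code_line"
}

# Example (needs network access, so it is left commented out):
# fetch_code_line http://git-wip-us.apache.org/repos/asf/lucene-solr.git branch_5x lucene-solr
```

Wrapping the two commands in one function keeps the branch choice explicit, which helps when you keep several checkouts for different code lines side by side.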
Working with GitHub
If you prefer you could use the GitHub mirror instead. Note that the drop-down lets you select "master" or "branch_5x" (among others). We accept GitHub Pull Requests (PR). To submit one, first fork the project, then make changes in a feature branch which you push to GitHub and use as the basis for the PR.
Note about old forks: If you had an existing GitHub fork from before January 23rd 2016, then it will no longer be listed as "forked from" the official apache/lucene-solr project, meaning that PRs will not go to the Lucene project. Technically this is because the old apache/lucene-solr GitHub repo was deleted and then re-added during the git switch, causing all forks to choose another "parent" project. The only fix for this is to delete your fork on the GitHub website and re-fork from the correct project.
If you had current work in progress in your local fork, please create patches for them and re-apply on top of the new fork.
If you had an open PR in the old GitHub repo, these are gone from the GitHub site. Please submit a new PR against the new repo. If you should want to see information in one of your old PRs, you may send us an email, as we have an export of all old PRs.
Building the first time
cd your_checkout_dir/solr
Issue any of the following commands.
- 'ant server' will build a runnable Solr (note, 'ant example' is the target pre 5.x)
- 'ant dist' will build the libraries you might want to link against from, say, a SolrJ program.
- 'ant package' will build files like 'package/solr-5.2.0-SNAPSHOT-src.tgz' which are standard distributions you can install somewhere else just like an official release.
- 'ant test' will run all of the unit tests. NOTE: this takes quite a while.
Exporting to a local repository
When making larger scale changes (please, try to keep your patches as small as humanly possible), or when working in a team, you might want to be able to keep track of what you are doing locally. One way of doing so is to clone the lucene-solr repository and create a local branch to which you can commit all your work. Another way is to use GitHub to create a fork of apache/lucene-solr and use that to commit code either individually or as a team. Once you are ready, you can send a pull request to lucene-solr after creating an issue in the lucene-solr Jira and mentioning the Jira issue number in the pull request.
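The "local branch" approach can be sketched as follows. This is a minimal sketch under assumptions: the function name start_issue_branch is hypothetical, and naming the branch after the Jira issue key is just one possible convention, not a project rule.

```shell
#!/bin/sh
# Hypothetical helper: create a local working branch named after a Jira issue.
#   $1 = path to an existing clone
#   $2 = Jira issue key, e.g. SOLR-123 (using it as the branch name is an assumption)
start_issue_branch() {
  clone_dir=$1
  issue_key=$2
  # -b creates the branch and switches to it in one step.
  git -C "$clone_dir" checkout --quiet -b "$issue_key"
}

# Example:
# start_issue_branch ~/lucene-solr SOLR-123
# ...edit files, then record the work locally:
# git -C ~/lucene-solr commit -a -m "SOLR-123: describe the change"
```

Commits made on such a branch stay on your machine until you push them to a fork or turn them into a patch.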
Making Changes
Before you start, you should send a message to the Solr developer mailing list (Note: you have to subscribe before you can post), or file a bug in Jira. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient, it may take folks a while to understand your requirements.
Modify the source code and add some (very) nice features using your favorite IDE.
But take care with the following points:
- All public classes and methods should have informative Javadoc comments.
- Code should be formatted according to Sun's conventions, with some exceptions:
  - indent two spaces per level, not four.
  - lines can be greater than 80 chars long, 132 is a common limit. Try to be reasonable for very long lines.
- Contributions should pass existing unit tests.
- New unit tests should be provided to demonstrate bugs and fixes (http://www.junit.org).
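Two of the formatting points above (no tabs for indentation, lines within about 132 characters) are easy to check mechanically before you build a patch. The following is a minimal sketch using awk; the function name check_style is hypothetical and this is not the project's own checker.

```shell
#!/bin/sh
# Hypothetical helper: report tab characters and over-long lines in the given files.
# Prints "file:line: message" for each finding and returns 1 if any were found.
check_style() {
  max_len=132
  # Without file arguments awk would wait on stdin, so do nothing instead.
  [ "$#" -gt 0 ] || return 0
  awk -v max="$max_len" '
    index($0, "\t") > 0 { printf "%s:%d: tab character\n", FILENAME, FNR; bad = 1 }
    length($0) > max    { printf "%s:%d: line longer than %d chars\n", FILENAME, FNR, max; bad = 1 }
    END { exit bad }
  ' "$@"
}

# Example (the path is a placeholder):
# check_style src/.../MyNewClass.java
```

The non-zero return value makes it possible to chain the check in front of other commands with &&.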
Notes for Eclipse
To get correct classpath, formatting, encoding, and project settings in Eclipse, simply run ant eclipse and then reload the project in Eclipse. Be sure that you are using an appropriate version of the Java JDK. For branch_5x, the minimum version is Java 7 (1.7.x), for trunk (working towards 6.0), the minimum version is Java 8 (1.8.x).
Generating a patch
A "patch file" is the format that all good contributions come in. It bundles up everything that is being added, removed, or changed in your contribution.
Please make sure that all unit tests succeed before constructing your patch.
> cd solr-trunk
> ant clean test
After a while, if you see
BUILD SUCCESSFUL
all is ok, but if you see
BUILD FAILED
please read the error messages carefully and check your code. If a test fails you may want to rerun that single test repeatedly as you debug and sort out any problems. In that case you could run
> ant -Dtestcase=TestXXX test
Where "TestXXX" is the name of the particular JUnit test you want to run. NOTE: specifying -Dtestcase= must be done at least at the <home>/lucene or <home>/solr level. Executing this at the root level will not run tests.
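Rerunning a single test many times by hand gets tedious, so a small loop can help. This is a minimal sketch: the function name repeat_until_fail is hypothetical, and it works with any command, not just ant.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the first failure.
#   $1 = maximum number of runs; remaining arguments = the command to run
repeat_until_fail() {
  max_runs=$1
  shift
  run=1
  while [ "$run" -le "$max_runs" ]; do
    echo "run $run of $max_runs"
    "$@" || { echo "failed on run $run"; return 1; }
    run=$((run + 1))
  done
  echo "all $max_runs runs passed"
}

# Example, from the lucene/ or solr/ directory (TestXXX is a placeholder):
# repeat_until_fail 10 ant -Dtestcase=TestXXX test
```

Stopping at the first failure leaves the failing run's output at the end of your terminal, which is usually what you want while debugging an intermittent test.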
Frequently failing Tests
There are some tests that fail sometimes on some systems, but run on Jenkins fine. It's always a good idea to be sure you can run "ant test" successfully before you start making code changes. Or keep an un-changed version of the code around to see if your changes are really to blame.
One of the great things about Open Source is so many people run the tests on so many different systems. Occasionally you'll be the lucky person who has the system that wins the prize by having the environment that exposes a new failure mode, see the discussion at SOLR-3846 for an example.
If you do find one of these, here's what you should do:
- First, just try running 'ant -Dtests.badapples=false test'. If the tests succeed, this is a known issue that we haven't found a solution for yet, but want to gather information about. If you have the time, ping the dev list and include your setup details. But in the case where setting this flag causes the tests to succeed, you can assume that your code changes didn't cause this error and go ahead.
- If tests continue to fail, ask on the dev list if anyone else has seen the issue. This is the case where having the un-changed code helps. If the tests fail on both the changed and un-changed versions, discuss on the dev list whether the test can be annotated as a 'badapples' test or not.
- If tests fail with your changes but not on un-altered code, well, you need to understand why. By far the most frequent reason is a bug or unintended side-effect of your code, but occasionally it's a test that needs modification. Again, the dev list is a good place to discuss this.
- Be very cautious about adding anything like @Ignore to a test. This is generally the wrong thing to do unless you get some consensus, and it'll surely generate "spirited debate".
- Of course any effort you want to make towards tracking down the reason a test fails in your particular environment is greatly appreciated!
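Keeping an un-changed copy of the code around, as suggested above, does not require a second clone. One possible approach, sketched here under assumptions, is git's worktree feature (available in reasonably recent git versions); the function name make_pristine_copy and the directory names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create a second, untouched working copy next to the
# one you are editing, so the same test can be run against both.
#   $1 = directory for the pristine copy (must not exist yet)
#   $2 = ref to check out there (defaults to origin/master)
make_pristine_copy() {
  pristine_dir=$1
  base_ref=${2:-origin/master}
  # --detach avoids creating a new branch for the extra working copy.
  git worktree add --detach "$pristine_dir" "$base_ref"
}

# Example, from inside your checkout (TestXXX is a placeholder):
# make_pristine_copy ../solr-pristine
# (cd ../solr-pristine/solr && ant -Dtestcase=TestXXX test)
```

If the test fails in both directories, your changes are probably not to blame; if it fails only in your working copy, they probably are.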
Before constructing your patch, please run the top-level pre-commit check, which finds problems like tabs and @author tags in source files, broken links in javadocs, files not controlled by Git (a.k.a. "unversioned files"), etc.
To run the pre-commit checks from ant, run the following from the top-level directory -- the directory containing lucene/ and solr/ -- in your working copy:
ant precommit
Creating the patch file
Check to see what files you have modified with:
git status
Add any new files with:
git add src/.../MyNewClass.java
Git's "add" command only modifies your local copy, so it does not require commit permissions. By using "git add", your entire contribution can be included in a single patch file, without needing to submit a separate set of "new" files.
If you have a lot of new files you can do "git add -A" to stage all of them in a single command (on "real" OS's only). Note that this also stages modifications and deletions of existing files.
Edit the CHANGES.txt file, adding a description of your change, including the bug number it fixes.
In order to create a patch, just type:
git format-patch --stdout origin/master > SOLR-NNNN.patch
This collects every commit on your local branch that is not yet in origin/master and saves it into the SOLR-NNNN.patch file (without --stdout, git format-patch writes one numbered file per commit and only prints the file names). Read the patch file. Make sure it includes ONLY the modifications required to fix a single issue.
Note the SOLR-NNNN.patch patch file name. Please use this naming pattern when creating patches for uploading to JIRA. Once you create a new JIRA issue, note its name and use that name when naming your patch file. For example, if you are creating a patch for a JIRA issue named SOLR-123, then name your patch file SOLR-123.patch. If you are creating a new version of an existing patch, use the existing patch's file name.
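The naming rule is easy to enforce with a small wrapper. This is a minimal sketch and the function name make_patch is hypothetical. Note that it uses "git diff" rather than "git format-patch", so it captures uncommitted changes and files staged with "git add" as well as local commits; that swap is a choice made for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the difference between a base ref and the working
# tree to <ISSUE>.patch, refusing names that do not look like SOLR-<number>.
#   $1 = Jira issue key, e.g. SOLR-123
#   $2 = base ref to diff against (defaults to origin/master)
make_patch() {
  issue_key=$1
  base_ref=${2:-origin/master}
  case $issue_key in
    SOLR-*[!0-9]* | SOLR-) echo "bad issue key: $issue_key" >&2; return 1 ;;
    SOLR-*) ;;
    *) echo "bad issue key: $issue_key" >&2; return 1 ;;
  esac
  # Untracked files are not included; stage new files with "git add" first.
  git diff "$base_ref" > "$issue_key.patch"
}

# Example, from the top of your checkout:
# make_patch SOLR-123
```

Run it from the top-level directory of the checkout so that the paths inside the patch are relative to the project root.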
Please do not:
- reformat code unrelated to the bug being fixed: formatting changes should be separate patches/commits.
- comment out code that is now obsolete: just remove it.
- insert comments around each change, marking the change: folks can use git to figure out what's changed and by whom.
- make things public which are not required by end users.
- combine multiple issues into a single patch, especially if they are unrelated or only loosely related. This is true even if the changes affect the same files. In some rare cases it is warranted, but for the most part it makes it harder for committers to evaluate the patch.
Please do:
- try to adhere to the coding style of files you edit;
- comment code whose function or rationale is not obvious;
- update documentation (e.g., package.html files, this wiki, etc.)
- try to provide a unit test that shows a bug was indeed fixed or the new functionality truly works
Contributing your work
Finally, patches should be attached to a bug report in Jira. If you are revising an existing patch, please re-use the exact same name as the previous attachment. JIRA will automatically "gray out" the old patch and clearly mark your newly uploaded patch file as the latest (it'll be the colored link). You'll see multiple copies of your patch if you've named them identically, and this is preferred as it preserves the history of the patch which can come in handy. Since the most recent one is the only one not gray, we always know which one to use.
Please be patient. Committers are busy people too. If no one responds to your patch after a few days, please post a friendly reminder. Please incorporate others' suggestions into your patch if you think they're reasonable. Remember that even a patch that is not committed is useful to the community. Supply a first patch as early as possible and updated patches as often as possible during your work. This helps the rest of the community and the committers to understand, help shape, and comment on your work throughout the entire process. Supplying a patch does not necessarily mean that it is complete and ready to be committed; it might also just be a way of communicating your idea and progress.
Commit process using Git
For committers: only committers can put code in the Git repository. We're still (as of Feb, 2016) in the transition period from SVN to Git and are working out the preferred process.
See Git commit process for the current recommendations.
JIRA tips (our issue/bug tracker)
The issue tracker we use is a JIRA instance at https://issues.apache.org/jira/browse/SOLR. If you don't yet have an account, just click the "login" link and you'll get the opportunity to create one that will allow you to add tickets, upload patches etc.
- When creating new issues in JIRA, please keep the "Description" field short - every change or followup on the issue will cause an email to be sent to the solr-dev mailing list, and will include the complete Description every time.
- When attaching newer versions of a file/patch, use the same name... JIRA will "gray out" the older versions automatically.
- Please do not delete older files that you have already added - the complete history of an issue is important.
- If you aren't sure if something is a bug, please ask on the solr-user mailing list before opening an issue.
- The "Activity" section of an issue by default only lists "Comments". If you click on the "All" subtab you can see all activity related to this issue, including any edits that might have been made to the summary or description, as well as any commits that mention this issue in the commit log.
Review/Improve Existing Patches
If there's a Jira issue that already has a patch you think is really good, and works well for you -- please add a comment saying so. If there's room for improvement (more tests, better javadocs, etc...) then make the changes and attach it as well. If a lot of people review a patch and give it a thumbs up, that's a good sign for committers when deciding if it's worth spending time on the patch -- and if other people have already put in effort to improve the docs/tests for a patch, that helps even more.
Working With Patches
You can easily download a patch from JIRA and test it by doing the following:
$ cd <your Solr trunk checkout dir>
$ git pull
$ wget <URL of the patch>
$ patch -p1 -i <name of the patch> --dry-run
- --dry-run just pretends to apply a patch, so you can see if it would succeed or fail. Remove --dry-run to *really* apply the patch.
- -p1 may need to be -p0 for svn-generated patches (these should be rare going forward).
The address for the patch can be obtained from the issue page, under the "File Attachments" section of the issue.
For people who like one-liners, the following should work as well:
$ cd <your Solr trunk checkout dir>
$ git pull
$ wget <URL to the patch> -O - | patch -p1 --dry-run
If you are on Solaris, you should replace 'patch' with 'gpatch' to use GNU Patch instead.
If the patch is created using Git, it has another format, which can be applied using -p1:
patch -p1 -i <name of the git formatted patch> --dry-run
Reverting to pre-patch state is one line:
git reset --hard
Though this leaves behind any files the patch added (they are untracked), which must be removed separately.
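The check, apply, and revert cycle can also be done with git's own tooling. The following is a minimal sketch: the function names try_patch and revert_patch are hypothetical, and "git apply --check" stands in for "patch --dry-run" here, which is a substitution made for this illustration rather than the procedure described above.

```shell
#!/bin/sh
# Hypothetical helpers, run from the top of a checkout.

# Apply a patch only if it would apply cleanly.
#   $1 = path to the patch file
try_patch() {
  patch_file=$1
  # --check is git's counterpart to a dry run: it reports problems
  # without touching any files.
  git apply --check "$patch_file" || return 1
  git apply "$patch_file"
}

# Return the checkout to its pre-patch state.
revert_patch() {
  git reset --quiet --hard   # restore tracked files
  git clean --quiet -fd      # delete untracked files and directories
}

# Example (the file name is a placeholder):
# try_patch SOLR-123.patch
# revert_patch
```

Be careful with revert_patch: "git clean -fd" deletes every untracked file in the checkout, not only the ones the patch created, so move anything you want to keep out of the way first.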
Helpful Resources
The following resources may prove helpful when developing Solr contributions. (These are not an endorsement of any specific development tools, but Eclipse and IntelliJ seem to be the most popular.)
IntelliJ (IntelliJ codestyle) NOTE: there is no need to install this file separately if you execute "ant idea", it is done for you. If you do install it, you'll have to tweak the name to select it in IntelliJ as it's currently anonymous (all the more reason to use the ant target, see the IntelliJ instructions).
Solr1.3 If you are using eclipse to follow trunk (leading up to the 1.3 release) eclipse will give several errors about not resolving components in the solrj library. This will appear in the org.apache.solr.handler.component package relating to distributed search (sharedrequest.java ...etc) The solution is to compile the solrj library via the dist-solrj target and add them to your eclipse build path. After running the dist-solrj target look in dist/solrj-lib and add apache-solr-solrj-1.3-dev.jar and commons-httpclient-3.1.jar to your buildpath.
Development Environment Tips
Here is a guide for setting up Eclipse, IntelliJ and Netbeans dev environments:
Follow the instructions above to fetch the combined Lucene and SOLR trunk. For the remainder of this document, the installation path is assumed to be ~/apache/trunk/lucene and ~/apache/trunk/solr. NOTE: I'm installing on a Macbook, so this is a *nix style file system etc. These instructions should work for windows as well, but if you try to use them in that environment, feel free to update this page with anything you uncover.
Before fiddling with the IDE, I'd strongly recommend you get the tests to run from the shell. This will ensure that your machine has the proper setup for the IDEs to magically find what's necessary. See the instructions above. Hint: Issue 'ant clean test' in the SOLR and Lucene directories and look for "BUILD SUCCESSFUL" minutes later.
Setting things up is actually very smooth when it's smooth, especially if the tests have run <G>.
DO NOT BE SURPRISED IF SOME TESTS FAIL IN THE IDE. There are some anomalies when running JUnit tests for these projects in an IDE. Some of them are already cleaned up, but others may still fail when run in an IDE. The definitive check for whether a test fails or not is running it as an Ant task.
See the Lucene wiki page on configuring IntelliJ - it also covers Solr configuration.
There is information about using Maven with Solr and Lucene in the source tree, at dev-tools/maven/README.maven. The information differs slightly by code branch. Here is a link to this file for the trunk version (unreleased; will be released as version 6.0 as of this writing): https://git1-us-west.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=dev-tools/maven/README.maven;h=390177172cd2fee2639e83d2cf4900bea1e8d05f;hb=refs/heads/master
Getting your feet wet: where to begin?
New to Solr? Want to find JIRA issues that you can work on without taking on the whole world?
The Solr/Lucene developers use the "newdev" label to mark issues that developers new to Solr might be interested in working on. The rough criteria used to make this selection are:
- Nobody has done any work on the issue yet.
- The issue is likely not controversial.
- The issue is likely self-contained with limited scope.
To see a list of open Solr and Lucene issues with the newdev label, look at this link http://s.apache.org/newdevlucenesolr
Note: Fixing these issues may require asking questions on the developer list to figure out what they mean - there is no guarantee that any of these will be either quick or easy.
Parsers and JFlex
YOU PROBABLY DON'T HAVE TO DO THIS but if you do for some reason want to regenerate the .java from the .jflex files, here's the cheat-sheet. This is a separate step that people who actually like parsers sometimes have to execute to reflect changes in various grammars.
- Check out the most recent jflex branch. This is important since what Lucene uses is based on trunk rather than a released version (thanks for tutoring me Uwe!). Do this by executing: svn co https://jflex.svn.sourceforge.net/svnroot/jflex/trunk jflex
- cd jflex
- mvn install (NOTE: I've seen the tests fail in this step, doesn't seem to matter)
- export ANT_OPTS="-Xmx1G -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=1G" (1)
- ant jflex
(1) I was getting OOM errors when running "ant jflex", and got a tip to do this, which made that problem go away. You may not have to be so generous with memory, but this worked... (Thanks Steve!).
One final note
As always, there are gremlins out there. If you find problems with the information here, and especially if you subsequently find solutions to the problems you find, please either write to the solr-user mailing list or update this page directly so others can benefit.