Invalid JIRA Issues

This page tries to explain why some issues on the Apache JIRA get closed as 'invalid'.

The Apache JIRA server is used for two things:

  1. Discussing and co-ordinating feature development of Apache Hadoop. We welcome people who want to get involved with this.
  2. Reporting and fixing bugs in the code.

It is not a place for people to report their "I couldn't get Hadoop to work" problems.

Given that Hadoop is used on thousands of machines by companies like Yahoo!, Facebook and eBay, we are reasonably confident that Hadoop works on:

If Hadoop does not work for you, the likely cause is a problem with your local configuration, especially your network.

These are not bugs in Hadoop, so please do not file JIRA issues about them.

Bug reports of the form "I can't get Hadoop to work" are going to be closed as invalid, unless there is clear evidence that the problem exists in an Apache release.

Which raises another issue: JIRAs cannot be filed against Big Data Stacks that aren't bundling the Apache releases of the Hadoop artifacts. We can't help with those, because not all of us track what changes those stacks have made.

Here's a video on how to file good and bad bugs:

Please watch the video and understand why your JIRA was closed with a reference to this page. Then follow some of the suggestions below to help debug your cluster.

Read and Understand the Logs

Hadoop, Java build tools and the operating system all log messages somewhere: to the screen, to the Hadoop service logs, to the OS logs. Learn to read these, rather than just posting them to the user lists and forums and asking for help.

Searching the web is the fastest way to get help, and log messages are ideal search terms.
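As a rough illustration of turning a log into search terms, the Python sketch below pulls the exception class and message out of Java stack traces. The function name `search_phrases` and the pattern are assumptions made for this example, not part of Hadoop; the pattern only recognises lines that begin with a fully qualified exception name (optionally prefixed by "Caused by: "), so real logs may need a looser match.

```python
import re

# Matches a line that begins a Java stack trace entry, such as
# "java.net.ConnectException: Connection refused" or
# "Caused by: java.io.IOException: Broken pipe".
EXCEPTION_LINE = re.compile(
    r"^(?:Caused by: )?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))(?::\s*(.*))?$"
)


def search_phrases(log_text):
    """Return de-duplicated 'ExceptionClass: message' strings found in log_text."""
    phrases = []
    for line in log_text.splitlines():
        match = EXCEPTION_LINE.match(line.strip())
        if not match:
            continue  # timestamps, "at ..." frames and ordinary log lines
        name, message = match.group(1), match.group(2)
        phrase = f"{name}: {message}" if message else name
        if phrase not in phrases:
            phrases.append(phrase)
    return phrases
```

Before pasting a phrase into a search engine, remove anything specific to your site, such as hostnames, IP addresses and file paths, so that the search matches other people's reports of the same error.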

Ask on the User Mailing Lists

Ask on Vendor Forums and Support Channels

If you are not using out-the-box Apache Hadoop, but instead a commercial Big Data Stack, the vendor's support process should be your starting point.

Read the source, books and online articles

The source is ideal when you are really trying to understand the logs. Some IDEs (example: IntelliJ IDEA) will take a stack trace and work out the source tree, and you can search for all or part of an error string to find out its origin too. Debugging your own problems is a pragmatic way to learn your way round that source tree. Just make sure you have the exact version of the source that you are running, so the stack traces match your source.
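If no IDE is to hand, the search for an error string can be approximated with a short script. The Python sketch below walks a source tree and reports every line containing a given fragment; `find_error_origin` and its parameters are names invented for this example, and it assumes you have already unpacked the source for the exact release you are running.

```python
import os


def find_error_origin(source_root, fragment, extension=".java"):
    """Return (path, line_number, line) for each source line containing fragment."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for filename in sorted(filenames):
            if not filename.endswith(extension):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps the scan going past odd encodings.
            with open(path, encoding="utf-8", errors="replace") as source:
                for number, line in enumerate(source, start=1):
                    if fragment in line:
                        hits.append((path, number, line.strip()))
    return hits
```

Search for the fixed part of the message rather than the whole line: error strings are often assembled from a constant prefix plus values such as hostnames or paths, and only the constant part appears in the source.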

Keep your version of Hadoop current

Finally: development and testing happen on Hadoop 2.2+, with some maintenance of branch 1, where Hadoop 1.2.1 is the latest release (as of December 2013). If you have a problem with an older version of Hadoop: upgrade. If you aren't prepared to upgrade, you can't expect any help at all.

Returning to JIRA: it may seem unfair for the developers not to care about your "critical" issue and to close it as invalid, despite the fact that they are clearly the experts in Hadoop internals. However, they (we) are busy trying to build the future of Hadoop, the operating system for data. Most of the people working on this are being paid to do so, either by companies whose business is built around selling supported Hadoop-based products, or by organisations that use Hadoop in production internally. None of these people have the time to help you, because if they helped everyone with a problem, they'd never get anything done.

Those developers who work full time for downstream redistributors of Hadoop are paid through support revenue, and their companies have support teams who will help, as can others on the Distributions and Commercial Support page. Those developers using Hadoop on internal projects probably get to field lots of internal support calls, which keeps them busy enough.

To summarise, then: your issue was closed because JIRA is not the place to ask for help.

Do

  1. Use a recent release of Hadoop. Older versions will have old bugs.
  2. Read the error message and try to understand what it means.
  3. Search on the web for the error message -and see what others did when they encountered it.
  4. Ask on the Hadoop User lists and any vendor-specific forums or other support options they offer.
  5. Join or create a local Hadoop User Group -to find nearby people to learn from and solve problems with.
  6. Read how to ask smart questions before asking bad ones.

Don't

  1. File JIRA issues on problems you have trying to get your own code to compile or run.
  2. File JIRA issues on problems you have starting up your cluster, unless someone on the -user lists says "this is really a bug".
  3. File JIRA issues on problems you have seen on outdated versions of Hadoop. Update and try to replicate first.
  4. File JIRA issues on problems you have with Apache Hadoop based products provided by third parties, unless these products are actually using the Apache artifacts. Try to replicate on the ASF versions first.
  5. Ask questions about using Hadoop on the developer lists. You will be deliberately ignored.

That's why your JIRA issue was closed. It's not that the developers don't care that you can't get Hadoop to work; it's that they aren't the right people to ask.

Sorry.

InvalidJiraIssues (last edited 2014-07-09 08:53:59 by SteveLoughran)