How to develop Hadoop Tests

This page contains Hadoop testing and test development guidelines.

Cheat sheet for developing JUnit v4 tests

Hadoop has been using JUnit 4 for a while now, yet many new tests are still being written in the JUnit v3 style. This is partly JUnit's fault: to give a (false) sense of backward compatibility, all of the v3 junit.framework classes are packaged along with the v4 classes, and the whole thing is called junit-4.5.jar. This packaging is necessary to permit mixing of old and new tests, and to allow the new v4 tests to run under the existing JUnit test runners in IDEs and build tools.

Here is a short list of traps to be aware of, so that you do not end up writing yet another JUnit v3 test case.

Other Hadoop Test case requirements

Assertions

The assertions your test uses need to be statically imported, either one by one, e.g.

import static org.junit.Assert.assertTrue;

or all of them at once

import static org.junit.Assert.*;

It is also possible to cheat and extend the Assert class itself

import org.junit.Assert;

public class TestSomething extends Assert {
}

The final tactic is half-way between the JUnit 3.x and JUnit 4 styles; the Hadoop team has yet to come down against it, though it reserves the right to.

Effective Assertions

  1. Use the JUnit assertions, not the Java assert statement.
  2. In equality tests, place the expected value first.
  3. Give assertions meaningful error messages.
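
A likely reason for the first rule is that the Java assert statement is only evaluated when the JVM is started with assertions enabled (the -ea flag), so a test relying on it can pass without checking anything. The minimal sketch below uses only the JDK and reports whether assert statements are active in the current JVM. The class name AssertStatementCheck is made up for illustration and is not part of Hadoop or JUnit.

```java
// Hypothetical helper for illustration; not part of Hadoop or JUnit.
final class AssertStatementCheck {

  /** Returns true only if this JVM evaluates assert statements (-ea). */
  static boolean assertStatementsEnabled() {
    boolean enabled = false;
    // The assignment is a side effect of the assert expression, so it
    // only happens when assert statements are actually evaluated.
    assert enabled = true;
    return enabled;
  }

  public static void main(String[] args) {
    System.out.println("assert statements enabled: "
        + assertStatementsEnabled());
  }
}
```

Running the class once with and once without -ea shows the difference; the JUnit assertion methods do not depend on that flag.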

Bad

/** a test */
@Test
public void testBuildVersion() {
  Namenode nn = getNameNode(); 
  assertNotNull(nn);
  NamespaceInfo info = nn.versionRequest() ;
  assertEquals(info.getBuildVersion(),"32");
}

This test gives no details about why it failed, so after a test run you know only the name and line of the failing test and have to look them up in the source to work out what went wrong. Some explanation helps. The assertEquals() call will produce a message, but because the variable comes before the constant, the expected and actual values in that text will be swapped.

Good

/**
 * Test that the build version is OK
 */
@Test
public void testBuildVersion() {
  Namenode nn = getNameNode(); 
  assertNotNull("No namenode", nn);
  NamespaceInfo info = nn.versionRequest() ;
  assertEquals("Build version wrong", "32", info.getBuildVersion());
}

When any of the equals assertions fails, the error text includes the message passed to the assertion along with the expected and actual values, so you don't need to include them explicitly. For all assertions, providing a hint as to what is wrong is good practice.
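
To see why the argument order matters, here is a small sketch of an equality check that builds its failure text from the first ("expected") and second ("actual") arguments. It is a simplified stand-in written for this page, not JUnit's implementation, and the exact layout of JUnit's real messages differs in detail. The class and method names are hypothetical.

```java
import java.util.Objects;

// Simplified stand-in for an equality assertion; JUnit's real
// implementation and exact message layout differ.
final class ExpectedFirstDemo {

  /** Returns null when the values match, otherwise a failure message. */
  static String check(String message, Object expected, Object actual) {
    if (Objects.equals(expected, actual)) {
      return null;
    }
    return message + " expected:<" + expected + "> but was:<" + actual + ">";
  }

  public static void main(String[] args) {
    String reported = "31"; // pretend the code under test returned this

    // Correct order: the text names "32" as the expected value.
    System.out.println(check("Build version wrong", "32", reported));

    // Swapped order: the text claims "31" was expected, which misleads.
    System.out.println(check("Build version wrong", reported, "32"));
  }
}
```

With the arguments swapped the check still fails when it should, but the report points the reader at the wrong value.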

Logging

All Hadoop test cases run on a classpath which contains commons-logging; use the logging APIs just as you would in Hadoop's own codebase

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.junit.Test;

public class TestSomething {
  private static final Log LOG =
    LogFactory.getLog(TestSomething.class);
}

Don't go overboard with logging at info level, as the output can be buffered by the test runners (especially the XML one) and lead to out-of-memory problems. Log the details at debug level, which can then be turned on for the specific tests causing problems.

Exception Handling in Tests

Test methods can declare that they throw Throwable. There is no need to catch exceptions and turn them into JUnit failures or wrap them in RuntimeException instances.

Bad

@Test
public void testCode() {
  try {
    doSomethingThatFails();
  } catch (IOException ioe) {
    fail("something went wrong");
  }
}

Good

@Test
public void testCode() throws Throwable {
  doSomethingThatFails();
}

This leaves less code around (lower maintenance costs), and ensures that any failure gets reported with a full stack trace.
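
The difference in what gets reported can be sketched with the JDK alone. In the sketch below, a plain AssertionError stands in for the failure that fail() raises, and all class and method names are hypothetical. It is an illustration of the two styles, not a JUnit test.

```java
import java.io.IOException;

// Plain-JDK simulation; AssertionError stands in for what fail() raises.
final class ExceptionReportingDemo {

  /** A test body that may throw anything. */
  interface Body {
    void run() throws Throwable;
  }

  static void doSomethingThatFails() throws IOException {
    throw new IOException("disk quota exceeded");
  }

  /** Mirrors the "Bad" style: the original exception is discarded. */
  static void badStyle() {
    try {
      doSomethingThatFails();
    } catch (IOException ioe) {
      throw new AssertionError("something went wrong");
    }
  }

  /** Mirrors the "Good" style: the exception propagates untouched. */
  static void goodStyle() throws Throwable {
    doSomethingThatFails();
  }

  /** Runs a body and summarises what a report would have to go on. */
  static String describe(Body body) {
    try {
      body.run();
      return "no failure";
    } catch (Throwable t) {
      return t.getClass().getSimpleName() + ": " + t.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println("bad:  " + describe(ExceptionReportingDemo::badStyle));
    System.out.println("good: " + describe(ExceptionReportingDemo::goodStyle));
  }
}
```

In the first case only the generic text survives; in the second the original exception type and message (and, in a real run, its stack trace) reach the report.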

Let Unit Tests be "Unit" tests

Avoid starting servers (including Jetty and Mini{DFS|MR}Clusters) in unit tests, as they take tens of seconds to start for each test; the HDFS and MapReduce tests already take many hours, mostly because of these server starts. Use them only in cross-component functional or integration (system) tests (cf. HADOOP-6399). Instead, try to use one of the lighter-weight test doubles for the components that collaborate with the component under test. Hadoop has adopted the Mockito library for easy mock and stub creation.
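
As a rough illustration of a test double, the sketch below replaces a collaborator that would normally need a running server with a hand-written stub returning canned data. Every name in it (FileLister, SuffixCounter, StubDemo) is hypothetical and none is a Hadoop API. A mocking library such as Mockito can generate doubles like this for you; the stub is written by hand here only to keep the example free of dependencies.

```java
import java.util.Arrays;
import java.util.List;

// All names below are hypothetical; none of them are Hadoop APIs.
final class StubDemo {

  /** Collaborator that would normally require a running server. */
  interface FileLister {
    List<String> list(String directory);
  }

  /** Component under test: counts entries with a given suffix. */
  static final class SuffixCounter {
    private final FileLister lister;

    SuffixCounter(FileLister lister) {
      this.lister = lister;
    }

    int count(String directory, String suffix) {
      int matches = 0;
      for (String name : lister.list(directory)) {
        if (name.endsWith(suffix)) {
          matches++;
        }
      }
      return matches;
    }
  }

  public static void main(String[] args) {
    // Stub: canned answers, no server start-up needed.
    FileLister stub = directory -> Arrays.asList("a.log", "b.txt", "c.log");
    SuffixCounter counter = new SuffixCounter(stub);
    System.out.println(counter.count("/any/path", ".log"));
  }
}
```

Because the component receives its collaborator through the constructor, the test controls exactly what the collaborator returns and runs in milliseconds.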

References

HowToDevelopUnitTests (last edited 2013-02-24 22:58:22 by KonstantinIBoudnik)