Overview

TODO:

In Apache Hama, you implement your own BSP program by extending the org.apache.hama.bsp.BSP class. This class provides a user-defined function, bsp(), in which you write your BSP program.

The bsp() function handles the whole parallel part of the program, so it is called only once rather than repeatedly.

There are also setup() and cleanup() functions, which are called at the beginning and at the end of your computation, respectively.

cleanup() is guaranteed to run after the computation, even in case of failure.

You only need to override the functions you need from the BSP class.
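The call order described above can be sketched in plain Java. This is only an illustration of the lifecycle, not Hama code: the Task class, the run() helper, and the log list are hypothetical names introduced for the example, and wrapping setup() in the same try block as bsp() is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
    // Hypothetical stand-in for a BSP task; this is NOT org.apache.hama.bsp.BSP.
    abstract static class Task {
        void setup(List<String> log) throws Exception { log.add("setup"); }
        abstract void bsp(List<String> log) throws Exception;
        void cleanup(List<String> log) { log.add("cleanup"); }
    }

    // Calls setup() once, bsp() once, and cleanup() in a finally block,
    // so cleanup() also runs when bsp() throws.
    static List<String> run(Task task) {
        List<String> log = new ArrayList<>();
        try {
            task.setup(log);
            task.bsp(log);
        } catch (Exception e) {
            log.add("failed: " + e.getMessage());
        } finally {
            task.cleanup(log);
        }
        return log;
    }

    // Overrides only bsp(); setup() and cleanup() keep their defaults.
    static Task okTask() {
        return new Task() {
            @Override void bsp(List<String> log) { log.add("bsp"); }
        };
    }

    // Simulates a failure inside the parallel part.
    static Task failingTask() {
        return new Task() {
            @Override void bsp(List<String> log) throws Exception {
                throw new Exception("boom");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(run(okTask()));      // [setup, bsp, cleanup]
        System.out.println(run(failingTask())); // [setup, failed: boom, cleanup]
    }
}
```

In both runs the last log entry is "cleanup", which mirrors the guarantee stated above.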

Hama is based on the Bulk Synchronous Parallel (BSP) model, in which a computation consists of a sequence of supersteps. In each superstep, many distributed computational peers work in parallel and synchronize at the end of the superstep. Each superstep consists of the following three phases:

  • Local computation
  • Process communication
  • Barrier synchronization

NOTE that these phases always occur in this sequential order.

In Apache Hama, the communication between tasks (or peers) is done within the barrier synchronization.
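The three phases can be sketched on a single machine with threads standing in for peers. This is a minimal simulation under stated assumptions and does not use the Hama API: the class name, the ring-shaped message pattern, and the inbox/outbox arrays are hypothetical. Messages written during a superstep are copied into the inboxes by the barrier action, so they become visible only once every peer has reached the barrier.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class SuperstepSketch {
    // Runs `peers` threads for `supersteps` rounds and returns each peer's final value.
    static int[] run(int peers, int supersteps) throws InterruptedException {
        int[] value = new int[peers];
        int[] inbox = new int[peers];   // message delivered to each peer at the last barrier
        int[] outbox = new int[peers];  // message queued for each peer in the current superstep
        for (int i = 0; i < peers; i++) {
            value[i] = i + 1;
        }

        // The barrier action runs once per superstep, after all peers have arrived:
        // queued messages are delivered here, i.e. within the synchronization.
        CyclicBarrier barrier = new CyclicBarrier(peers,
                () -> System.arraycopy(outbox, 0, inbox, 0, peers));

        Thread[] threads = new Thread[peers];
        for (int p = 0; p < peers; p++) {
            final int id = p;
            threads[p] = new Thread(() -> {
                try {
                    for (int s = 0; s < supersteps; s++) {
                        value[id] += inbox[id];                // 1. local computation
                        outbox[(id + 1) % peers] = value[id];  // 2. send to the next peer in the ring
                        barrier.await();                       // 3. barrier synchronization
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        // Peers start with 1, 2, 3; after two supersteps the values are 4, 3, 5.
        System.out.println(java.util.Arrays.toString(run(3, 2)));
    }
}
```

Each outbox slot has exactly one writer per superstep, and CyclicBarrier orders the writes before the barrier action and the barrier action before the reads in the next superstep, so the sketch needs no additional locking.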

BSP Function

Hama provides a user-defined function, bsp(), that can be used to write your own BSP program. The bsp() function handles the whole parallel part of the program. It takes only one argument, "BSPPeer", which contains the communication, counter, and IO interfaces.

...

  // Collapses all integer messages passed to it into a single message carrying their sum.
  public static class SumCombiner extends Combiner {

    @Override
    public BSPMessageBundle combine(Iterable<BSPMessage> messages) {
      BSPMessageBundle bundle = new BSPMessageBundle();
      int sum = 0;

      // Accumulate the payload of every message.
      for (BSPMessage message : messages) {
        sum += ((IntegerMessage) message).getData();
      }

      // Emit one message tagged "Sum" in place of the individual messages.
      bundle.addMessage(new IntegerMessage("Sum", sum));
      return bundle;
    }

  }
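To try the combining step without a Hama installation, the same idea can be sketched in plain Java. The IntMsg type below is a hypothetical stand-in for IntegerMessage, and a List stands in for BSPMessageBundle; neither is part of Hama.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CombinerSketch {
    // Hypothetical message type: a tag plus an int payload.
    static final class IntMsg {
        final String tag;
        final int data;

        IntMsg(String tag, int data) {
            this.tag = tag;
            this.data = data;
        }
    }

    // Folds any number of messages into a single message tagged "Sum".
    static List<IntMsg> combine(Iterable<IntMsg> messages) {
        int sum = 0;
        for (IntMsg m : messages) {
            sum += m.data;
        }
        List<IntMsg> bundle = new ArrayList<>();
        bundle.add(new IntMsg("Sum", sum));
        return bundle;
    }

    public static void main(String[] args) {
        List<IntMsg> outgoing = Arrays.asList(
                new IntMsg("a", 1), new IntMsg("b", 2), new IntMsg("c", 3));
        List<IntMsg> combined = combine(outgoing);
        // Three messages in, one message out.
        System.out.println(combined.size() + " message, data=" + combined.get(0).data); // 1 message, data=6
    }
}
```

Three outgoing messages are reduced to one, so only a single message would have to be transferred to the receiving peer.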

Job Configuration and Submission

...
