Overview
TODO:
In Apache Hama, you implement your own BSP program by extending the org.apache.hama.bsp.BSP class and overriding its user-defined bsp() function.
The bsp() function contains the whole parallel part of the program, so it is called only once rather than repeatedly.
There are also setup() and cleanup() functions, which are called at the beginning and at the end of the computation, respectively. cleanup() is guaranteed to run after the computation, including in case of failure.
You only need to override the functions you actually use from the BSP class.
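The call order described above can be sketched without a cluster. The following is a minimal illustration and not Hama's API: SketchBSP, HelloBSP, FailingBSP, and LifecycleRunner are hypothetical stand-ins for org.apache.hama.bsp.BSP and the framework's task runner, and the peer argument that the real methods receive is replaced here by a simple log list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.hama.bsp.BSP (not the real class).
abstract class SketchBSP {
  // Optional hooks default to no-ops, so subclasses override only what they need.
  void setup(List<String> log) {}

  abstract void bsp(List<String> log) throws Exception;

  void cleanup(List<String> log) {}
}

// Overrides all three functions.
class HelloBSP extends SketchBSP {
  @Override void setup(List<String> log) { log.add("setup"); }
  @Override void bsp(List<String> log) { log.add("bsp"); }
  @Override void cleanup(List<String> log) { log.add("cleanup"); }
}

// Overrides only bsp() and cleanup(); bsp() fails on purpose.
class FailingBSP extends SketchBSP {
  @Override void bsp(List<String> log) throws Exception {
    throw new Exception("boom");
  }
  @Override void cleanup(List<String> log) { log.add("cleanup"); }
}

// Hypothetical runner that mimics the lifecycle: setup, bsp once, cleanup always.
final class LifecycleRunner {
  static List<String> run(SketchBSP program) {
    List<String> log = new ArrayList<>();
    try {
      program.setup(log);
      program.bsp(log); // called exactly once
    } catch (Exception e) {
      log.add("failed: " + e.getMessage());
    } finally {
      program.cleanup(log); // runs after success and after failure
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run(new HelloBSP()));
    System.out.println(run(new FailingBSP()));
  }
}
```

In this sketch the finally block is what models the guarantee that cleanup() runs even when bsp() throws.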
Hama is based on the Bulk Synchronous Parallel (BSP) model, in which a computation proceeds as a sequence of supersteps. In each superstep, many distributed computational peers work in parallel and synchronize at the end of the superstep. Each superstep consists of the following three phases:
- Local computation
- Process communication
- Barrier synchronization
NOTE that these phases always run in this sequential order.
In Apache Hama, the communication between tasks (or peers) is done within the barrier synchronization.
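The three phases can be simulated on a single machine with plain Java threads. This is a sketch under stated assumptions, not Hama code: SuperstepSketch and its ring-shaped message pattern are hypothetical, and a java.util.concurrent.CyclicBarrier stands in for Hama's barrier synchronization. Messages queued during a superstep are moved into the receivers' inboxes by the barrier action, which mirrors the statement that communication takes effect within the barrier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical single-JVM simulation of BSP supersteps (not part of Hama).
final class SuperstepSketch {

  // Runs `peers` threads for `supersteps` rounds and returns each peer's final value.
  static int[] run(int peers, int supersteps) {
    int[] value = new int[peers];
    List<List<Integer>> outbox = new ArrayList<>(); // outbox.get(i): queued for peer i
    List<List<Integer>> inbox = new ArrayList<>();  // inbox.get(i): delivered to peer i
    for (int i = 0; i < peers; i++) {
      value[i] = i + 1;
      outbox.add(new ArrayList<>());
      inbox.add(new ArrayList<>());
    }

    // The barrier action runs once per superstep, after every peer has arrived
    // and before any peer is released: queued messages are delivered here.
    CyclicBarrier barrier = new CyclicBarrier(peers, () -> {
      for (int i = 0; i < peers; i++) {
        inbox.get(i).clear();
        inbox.get(i).addAll(outbox.get(i));
        outbox.get(i).clear();
      }
    });

    Thread[] threads = new Thread[peers];
    for (int p = 0; p < peers; p++) {
      final int me = p;
      threads[p] = new Thread(() -> {
        try {
          for (int s = 0; s < supersteps; s++) {
            // 1. Local computation: add what arrived at the previous barrier.
            for (int m : inbox.get(me)) {
              value[me] += m;
            }
            // 2. Process communication: queue a message for the right neighbour.
            outbox.get((me + 1) % peers).add(value[me]);
            // 3. Barrier synchronization: wait for all peers.
            barrier.await();
          }
        } catch (InterruptedException | BrokenBarrierException e) {
          Thread.currentThread().interrupt();
        }
      });
      threads[p].start();
    }
    try {
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(run(3, 2)));
  }
}
```

With three peers starting at 1, 2, and 3 and two supersteps, the first superstep only sends, and the second adds the left neighbour's value, so the run ends with 4, 3, and 5.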
BSP Function
Hama provides a user-defined function "bsp()" that can be used to write your own BSP program. The bsp() function handles the whole parallel part of the program. It takes a single argument, "BSPPeer", which provides the communication, counter, and IO interfaces.
...
```java
public static class SumCombiner extends Combiner {

  @Override
  public BSPMessageBundle combine(Iterable<BSPMessage> messages) {
    BSPMessageBundle bundle = new BSPMessageBundle();
    int sum = 0;

    Iterator<BSPMessage> it = messages.iterator();
    while (it.hasNext()) {
      sum += ((IntegerMessage) it.next()).getData();
    }

    bundle.addMessage(new IntegerMessage("Sum", sum));
    return bundle;
  }
}
```
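The effect of the combiner above can be illustrated without Hama's message classes. The following sketch is an assumption-laden simplification: SumCombinerSketch is a hypothetical name, and plain Integer values stand in for IntegerMessage payloads. It shows only the core idea, which is that several messages are collapsed into a single message carrying their sum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, library-free illustration of a summing combiner.
final class SumCombinerSketch {

  // Collapses all integer payloads into a one-element bundle holding their sum.
  static List<Integer> combine(Iterable<Integer> messages) {
    int sum = 0;
    for (int m : messages) {
      sum += m;
    }
    List<Integer> bundle = new ArrayList<>();
    bundle.add(sum);
    return bundle;
  }

  public static void main(String[] args) {
    // Three outgoing messages become one.
    System.out.println(combine(Arrays.asList(1, 2, 3)));
  }
}
```

Combining in this way reduces the number of messages that have to be transferred, at the cost of losing the individual values.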
Job Configuration and Submission
...