Requirements

Hama currently requires JRE 1.6 or higher, and ssh must be set up between the nodes in the cluster.

For additional information consult our CompatibilityTable.

Download

You can download Hadoop here:

http://www.apache.org/dyn/closer.cgi/hadoop/core/

You can download Hama here:

http://www.apache.org/dyn/closer.cgi/hama

Build latest version from source

If you want to use the latest (unreleased) version, you can check out TRUNK and build it with Maven 3 using the following commands:

% svn co https://svn.apache.org/repos/asf/hama/trunk hama-trunk
% cd hama-trunk

% mvn clean install -Phadoop1 -Dhadoop.version=1.x.x
or
% mvn clean install -Phadoop2 -Dhadoop.version=2.x.x

See also HowToContribute

Hadoop Installation

 % rm -rf ./lib/hadoop*.jar
 % cp /usr/lib/hadoop/hadoop-test-0.20.2-cdh3u3b.jar ./lib/
 % cp /usr/lib/hadoop/hadoop-core-0.20.2-cdh3u3b.jar ./lib/
 % cp /usr/lib/hadoop/lib/guava-r09-jarjar.jar ./lib/

 % bin/start-bspd.sh

Hama Installation

Untar the files to your destination of choice:

% tar -xzf hama-0.x.0.tar.gz

Don't forget to chown the directory to the same user you configured Hadoop with in the previous step.

Startup script

The $HAMA_HOME/bin directory contains the scripts used to start up the Hama daemons.

Note: You have to start Hama as the same user that is configured for Hadoop.

Configuration files

The $HAMA_HOME/conf directory contains the configuration files for Hama. Those referenced in this guide are hama-site.xml, hama-env.sh and groomservers.

Setting up Hama

This section describes how to get started by setting up a Hama cluster.

Modes

Just like Hadoop, we distinguish between three modes:

Local Mode

This is the default mode when you download and install Hama (>= 0.3.0). When you submit a job, it runs on a local multithreaded BSP engine on your machine. The mode is selected by setting the bsp.master.address property to local. You can adjust the number of threads the engine uses by setting the bsp.local.tasks.maximum property. See the Settings section for how and where to configure this.

Note: In this mode, nothing needs to be launched via the start scripts.
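
As a minimal sketch of the two properties just mentioned, the snippet below writes a local-mode hama-site.xml. The CONF_DIR variable and its scratch-directory default are assumptions made so that a real $HAMA_HOME/conf is not overwritten by accident, and the thread count of 4 is an arbitrary illustrative value.

```shell
# Sketch: write a local-mode hama-site.xml.
# CONF_DIR is a hypothetical variable; it defaults to a scratch directory.
# Point it at $HAMA_HOME/conf only once you are happy with the result.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/hama-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>bsp.master.address</name>
    <value>local</value>
  </property>
  <property>
    <name>bsp.local.tasks.maximum</name>
    <!-- 4 is an arbitrary example value -->
    <value>4</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/hama-site.xml"
```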

Pseudo Distributed Mode

Use this mode when you have a single server and want to launch all the daemon processes (BSPMaster, Groom and ZooKeeper) on it. To configure it, set bsp.master.address to a host address, e.g. localhost, and put the same address into the groomservers file in the configuration directory. As stated, this runs a BSPMaster, a Groom and a ZooKeeper on your machine.
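
The two steps above might look like the following sketch. It covers only the property discussed here; a working setup would probably also need the file system and ZooKeeper settings shown under Settings. CONF_DIR, MASTER_HOST and MASTER_PORT are hypothetical variable names, port 40000 is borrowed from the hama-site.xml example in this guide, and the one-hostname-per-line layout of groomservers is an assumption.

```shell
# Sketch: pseudo-distributed settings for a single machine.
# CONF_DIR defaults to a scratch directory (assumption) so nothing real
# is overwritten; MASTER_HOST and MASTER_PORT are illustrative names.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
MASTER_HOST="${MASTER_HOST:-localhost}"
MASTER_PORT="${MASTER_PORT:-40000}"   # port taken from the example in this guide

cat > "$CONF_DIR/hama-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>bsp.master.address</name>
    <value>${MASTER_HOST}:${MASTER_PORT}</value>
  </property>
</configuration>
EOF

# Assumption: groomservers holds one hostname per line, without a port.
printf '%s\n' "$MASTER_HOST" > "$CONF_DIR/groomservers"
```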

Distributed Mode

This mode is just like the "Pseudo Distributed Mode", except that you have multiple machines, which are listed in the groomservers file.
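
A groomservers file for several machines could be generated as sketched below. The host names are placeholders reused from the hama-site.xml example in this guide, CONF_DIR is a hypothetical variable defaulting to a scratch directory, and the one-hostname-per-line layout is an assumption.

```shell
# Sketch: list the groom machines in the groomservers file.
# CONF_DIR defaults to a scratch directory (assumption).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Placeholder host names; replace them with the machines in your cluster.
GROOMS="host1.mydomain.com host2.mydomain.com"

# Assumption: one hostname per line.
: > "$CONF_DIR/groomservers"
for host in $GROOMS; do
  printf '%s\n' "$host" >> "$CONF_DIR/groomservers"
done

cat "$CONF_DIR/groomservers"
```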

Settings

An example of a hama-site.xml file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>bsp.master.address</name>
    <value>host1.mydomain.com:40000</value>
    <description>The address of the bsp master server. Either the
    literal string "local" or a host:port for distributed mode
    </description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://host1.mydomain.com:9000/</value>
    <description>
      The name of the default file system. Either the literal string
      "local" or a host:port for HDFS.
    </description>
  </property>

  <property>
    <name>hama.zookeeper.quorum</name>
    <value>host1.mydomain.com,host2.mydomain.com</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is set in hama-env.sh
    this is the list of servers which we will start/stop zookeeper on.
    </description>
  </property>
</configuration>

If you are managing your own ZooKeeper, you have to specify the port number as below:

  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
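
The hama-zookeeper.quorum description above mentions HAMA_MANAGES_ZK in hama-env.sh. As a sketch only, the snippet below disables it for a self-managed ZooKeeper. That a value of false makes the start and stop scripts leave ZooKeeper alone is an assumption (by analogy with similar Hadoop-ecosystem projects), not something this guide states; CONF_DIR is a hypothetical variable defaulting to a scratch directory.

```shell
# Sketch: tell Hama not to start/stop ZooKeeper itself.
# Assumption: HAMA_MANAGES_ZK=false has this effect; check the comments in
# your own hama-env.sh before relying on it.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

echo 'export HAMA_MANAGES_ZK=false' >> "$CONF_DIR/hama-env.sh"

grep 'HAMA_MANAGES_ZK' "$CONF_DIR/hama-env.sh"
```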

Starting a Hama cluster

Skip this step if you're in Local Mode.

Run the command:

% $HAMA_HOME/bin/start-bspd.sh

This will start up a BSPMaster, GroomServers and ZooKeeper on your machine.
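
One way to check that the daemons came up is to filter the output of the JDK's jps tool. The helper below is a hypothetical sketch: it reads jps-style lines on standard input and prints the daemon kinds it cannot find. The name fragments it matches (BSPMaster, Groom, ZooKeeper) are taken from the daemon names used in this guide; the exact process names shown by jps on your installation are an assumption and may differ.

```shell
# Sketch: report which Hama daemons are missing from a jps-style listing.
# missing_daemons is a hypothetical helper, not part of Hama.
# Input: lines of the form "<pid> <MainClass>" on stdin.
# Output: one line per daemon kind that no input line matched.
missing_daemons() {
  listing=$(cat)
  for name in BSPMaster Groom ZooKeeper; do
    # Case-insensitive substring match, since exact class names may vary.
    printf '%s\n' "$listing" | grep -qi "$name" || echo "$name"
  done
}

# Usage on a live node (requires a JDK for jps):
#   jps | missing_daemons
```

Empty output means all three kinds were found in the listing.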

Stopping a Hama cluster

Run the command:

% $HAMA_HOME/bin/stop-bspd.sh

to stop all the daemons running on your cluster.

Run the BSP Examples

Run the command:

% $HAMA_HOME/bin/hama jar hama-examples-0.x.0.jar

It will then offer you some examples to choose from. Refer to our Examples site if you have additional questions about how to use them.

Hama Web Interfaces

The web UI shows BSP job statistics for the Hama cluster, including running, completed and failed jobs.

By default, it’s available at http://localhost:40013

Setup Hama in your Eclipse Workspace

A step-by-step guide to running Hama in your Eclipse workspace with a local runner:

GettingStarted (last edited 2013-09-27 14:39:40 by edwardyoon)