
Hama Streaming

This article focuses on the usage of Hama Streaming with Python.

Setup

This article assumes that you have installed a recent version of Apache Hama. Streaming has been available since 0.6.0.

If you haven't yet installed Hama, please go through the manual in the GettingStarted article.

On the Python side, you need Python 3.2. If you are not running it yet, have a look at the various tutorials on installing it. To verify that you run the right Python version, a very quick way is to check whether there is a python3.2 command, or whether the normal python interpreter reports the correct version number.
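The version check described above can also be done from within the interpreter itself. The following is a minimal sketch: the helper name is_expected_python is hypothetical and not part of Hama Streaming, and its defaults simply mirror the Python 3.2 requirement named in this article.

```python
import sys


def is_expected_python(version_info, major=3, minor=2):
    """Return True if version_info starts with the given major.minor pair.

    Hypothetical helper for illustration only; the defaults mirror the
    Python 3.2 requirement stated in this article.
    """
    return tuple(version_info[:2]) == (major, minor)


if __name__ == "__main__":
    # Report the running interpreter version and whether it matches 3.2.
    print("Running Python %d.%d" % tuple(sys.version_info[:2]))
    print("Matches 3.2: %s" % is_expected_python(sys.version_info))
```

Running this file with the interpreter you intend to pass to Hama shows whether that interpreter is the expected one.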

Now you should start your HDFS daemons.

As a first step, please change into the directory of your Hama installation. If you see the bin, conf and lib folders and a couple of jars, you are probably in the right place.
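If you prefer to check the directory programmatically, the folder test above can be sketched as follows. The function name looks_like_hama_home is hypothetical; it only checks for the three folders mentioned above and does not inspect the jars.

```python
import os


def looks_like_hama_home(path):
    """Return True if path contains bin, conf and lib subdirectories.

    Hypothetical helper for illustration; it does not check for jars
    or validate the Hama installation in any other way.
    """
    return all(
        os.path.isdir(os.path.join(path, name))
        for name in ("bin", "conf", "lib")
    )


if __name__ == "__main__":
    # Check the current working directory.
    print(looks_like_hama_home(os.getcwd()))
```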

Now let's start the Hama cluster:

bin/start-bspd.sh

Once the cluster is started, you can familiarize yourself with the shell submitter for pipes and streaming jobs:

bin/hama pipes

A good way to start is to retrieve Hama Streaming for Python from GitHub by executing

git clone git://github.com/thomasjungblut/HamaStreaming.git

If you don't have git installed, no problem: you can download a zip file from https://github.com/thomasjungblut/HamaStreaming.

In any case you should now find a "HamaStreaming" folder in your Hama home directory which contains several scripts.

Let's start by executing the usual Hello World application that already ships with streaming:

bin/hama pipes -streaming true -bspTasks 2 -interpreter python3.2 -cachefiles HamaStreaming/*.py -output /tmp/pystream-out/ -program HamaStreaming/BSPRunner.py -programArgs HamaStreaming/HelloWorldBSP.py
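If you want to launch the same job from a Python script instead of typing it, the command above can be assembled as an argument list. This is only a sketch under stated assumptions: the function name build_hello_world_command is hypothetical, every flag and value is copied from the command above, and the HamaStreaming/*.py pattern is expanded in Python because no shell is involved to do it.

```python
import glob
import os


def build_hello_world_command(hama_home="."):
    """Build the argument list for the Hello World streaming job.

    Hypothetical helper for illustration. Paths are relative to
    hama_home, so the command is meant to be run from that directory.
    """
    # Expand HamaStreaming/*.py the way a shell would, keeping the
    # paths relative to the Hama home directory.
    pattern = os.path.join(hama_home, "HamaStreaming", "*.py")
    cache_files = [
        "HamaStreaming/" + os.path.basename(p)
        for p in sorted(glob.glob(pattern))
    ]
    return (
        ["bin/hama", "pipes",
         "-streaming", "true",
         "-bspTasks", "2",
         "-interpreter", "python3.2",
         "-cachefiles"]
        + cache_files
        + ["-output", "/tmp/pystream-out/",
           "-program", "HamaStreaming/BSPRunner.py",
           "-programArgs", "HamaStreaming/HelloWorldBSP.py"]
    )


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(build_hello_world_command(".")))
```

The resulting list can be handed to subprocess.call with cwd set to the Hama home directory.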