This article focuses on the usage of Hama Streaming with Python.

Setup

We assume that you have installed the latest version of Apache Hama; Streaming has been available since 0.6.0.

If you haven't yet installed Hama, please go through the manual in the GettingStarted article.

For Python, you need version 3.2. If you don't have it installed, have a look at one of the many installation tutorials. A quick way to verify your version is to check whether a python3.2 command exists, or whether the normal python interpreter reports the correct version number.
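The version check can also be done from inside the interpreter. The following is a minimal sketch, assuming that Python 3.2 or any newer 3.x release is acceptable (the article only names 3.2); the helper name python_is_recent_enough is made up for this illustration and is not part of Hama Streaming.

```python
import sys

# Minimum interpreter version named in this article.
# Assumption: newer releases than 3.2 are acceptable as well.
REQUIRED = (3, 2)


def python_is_recent_enough(version_info=None, required=REQUIRED):
    """Return True if the given (or the running) interpreter meets the requirement."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair against the requirement.
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    print("Running Python {0}.{1}".format(*sys.version_info[:2]))
    if python_is_recent_enough():
        print("Version requirement met")
    else:
        print("Older than {0}.{1}".format(*REQUIRED))
```

Save it under any name and run it with the interpreter you intend to use for your streaming jobs.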

Now you should start your HDFS daemons.

As a first step, change into the directory of your Hama installation. If you see the bin, conf and lib folders and a couple of jars, you are probably in the right place.
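If you prefer to check the directory layout programmatically, here is a small sketch. It only looks for the three folders mentioned above; the function name missing_hama_dirs is hypothetical, and the check is a heuristic rather than an official way to validate a Hama installation.

```python
import os

# Sub-folders this article says a Hama installation directory contains.
EXPECTED_DIRS = ("bin", "conf", "lib")


def missing_hama_dirs(path="."):
    """Return the expected sub-folders that are absent under path."""
    return [d for d in EXPECTED_DIRS
            if not os.path.isdir(os.path.join(path, d))]


if __name__ == "__main__":
    missing = missing_hama_dirs()
    if missing:
        print("Probably not a Hama directory; missing: " + ", ".join(missing))
    else:
        print("Looks like a Hama installation directory")
```

Run it from the directory you changed into; an empty result means all three folders were found.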

Now let's start the Hama cluster:

bin/start-bspd.sh

Once it has started, you can familiarize yourself with the shell submitter for pipes and streaming jobs:

bin/hama pipes

A good way to start is to retrieve Hama Streaming for Python from GitHub by executing:

git clone https://github.com/thomasjungblut/HamaStreaming.git

If you don't have git installed, no problem: you can download a zip file from https://github.com/thomasjungblut/HamaStreaming.

Either way, you should now have a "HamaStreaming" folder in your Hama home directory, which contains several scripts.

Let's start by executing the usual Hello World application that already ships with streaming:
