...

If you want to know how to run user code within other environments, have a look at the Python example at the end.

Text Based Protocol

...

The text-based protocol is simple. There is a finite set of operations, each associated with a unique identifier (OP_CODE). Common operations in BSP are sending a message or starting the barrier synchronization. In Hama Streaming, every operation is terminated by a newline ('\n') character, so most operations fit on a single line of text. Some operations use multiple lines to separate pieces of information, but these are special cases.

...

Operation Code Name  Operation Code identifier  Comment
START                0                          First op code after fork
SET_BSP_JOB_CONF     1                          Get configuration values
SET_INPUT_TYPES      2                          Not used
RUN_SETUP            3                          Start of the setup function
RUN_BSP              4                          Start of the bsp function
RUN_CLEANUP          5                          Start of the cleanup function
READ_KEYVALUE        6                          Reads a key/value pair from input (Text Only)
WRITE_KEYVALUE       7                          Writes a key/value pair to output (Text Only)
GET_MSG              8                          Gets the next message in the queue
GET_MSG_COUNT        9                          Get how many messages are in the queue
SEND_MSG             10                         Send a message
SYNC                 11                         Start the barrier synchronization
GET_ALL_PEERNAME     12                         Get all peer names
GET_PEERNAME         13                         Get the peer name of the current peer
GET_PEER_INDEX       14                         Get a peer name via index
GET_PEER_COUNT       15                         Get how many peers are there
GET_SUPERSTEP_COUNT  16                         Get the current Superstep counter
REOPEN_INPUT         17                         Reopens the input to reread
CLEAR                18                         Clears the messaging queue
CLOSE                19                         Closes the protocol
ABORT                20                         Not used
DONE                 21                         Closes the protocol if task is done
TASK_DONE            22                         Yet another task is done op code
REGISTER_COUNTER     23                         Not used (please create a new issue for that)
INCREMENT_COUNTER    24                         Not used (please create a new issue for that)
LOG                  25                         Not implemented in Pipes, but in streaming it sends child logging to the Java task.
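In a client implementation it can help to keep the op codes in one place. The following is a minimal Python sketch of the table above. The names OpCode and command_prefix are hypothetical, and the "%<op code>%=" prefix is an assumption inferred from the function listing on this page (for example "%11%=" for sync), not a confirmed part of any official client.

```python
from enum import IntEnum


class OpCode(IntEnum):
    """Op codes copied from the table above (hypothetical helper type)."""
    START = 0
    SET_BSP_JOB_CONF = 1
    SET_INPUT_TYPES = 2
    RUN_SETUP = 3
    RUN_BSP = 4
    RUN_CLEANUP = 5
    READ_KEYVALUE = 6
    WRITE_KEYVALUE = 7
    GET_MSG = 8
    GET_MSG_COUNT = 9
    SEND_MSG = 10
    SYNC = 11
    GET_ALL_PEERNAME = 12
    GET_PEERNAME = 13
    GET_PEER_INDEX = 14
    GET_PEER_COUNT = 15
    GET_SUPERSTEP_COUNT = 16
    REOPEN_INPUT = 17
    CLEAR = 18
    CLOSE = 19
    ABORT = 20
    DONE = 21
    TASK_DONE = 22
    REGISTER_COUNTER = 23
    INCREMENT_COUNTER = 24
    LOG = 25


def command_prefix(op: OpCode) -> str:
    # Assumption: a command line starts with "%<numeric op code>%=".
    return f"%{int(op)}%="
```

With this sketch, command_prefix(OpCode.SYNC) yields the string "%11%=".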

Acknowledgements

...

To detect whether the forked child has arrived at the next stage of an algorithm, we work with acknowledgements (ACKs for short) that have a special format. ACKs are formally expressed as "%ACK_" + OP_CODE + "%=", where OP_CODE is the operation to acknowledge.
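The ACK format can be sketched in Python as follows. The function name build_ack is hypothetical, and the sketch assumes that OP_CODE is the numeric identifier from the op code table (for example 3 for RUN_SETUP) rather than the symbolic name.

```python
def build_ack(op_code: int) -> str:
    # "%ACK_" + OP_CODE + "%=", as stated in the text above.
    # Assumption: OP_CODE is the numeric identifier of the operation.
    return "%ACK_" + str(op_code) + "%="


# Acknowledging RUN_SETUP (3), RUN_BSP (4) and RUN_CLEANUP (5).
setup_ack = build_ack(3)
bsp_ack = build_ack(4)
cleanup_ack = build_ack(5)
```

Like every other operation in the protocol, the ACK would be written as its own newline-terminated line.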

Initialization Sequence

...

After the child process has been forked from the Java BSP Task, a special initialization sequence is needed. Formally the sequence looks like this:

...

The three steps are called in sequential order, so after you have ACK'd the end of the setup, the Java code will immediately tell you to start the bsp function. The same applies to the transition between the bsp and cleanup functions.

BSPPeer Functionality

...

People familiar with the Hama BSP API know that there is a context object which gives access to BSP functionality such as send or sync. We think this is a cool design, and you should implement it as well so that you can pass this peer to all user-implemented functions.

...

  • Destination peer name
  • Message as text
Sync - %11%=
    After sync a special ACK will be in the next line: "%11%=_SUCCESS"
    So you should block until you received this.
getCurrentMessage - %8%=
    "%%-1%%" if no message was found, else the message as text
getSuperstepCount - %16%=
    The raw Superstep number
getMessageCount - %9%=
    The raw message count
getPeerName - %13%=
  • Followed by an index (plain integer), -1 for the name of this peer.
    The name of the peer as text
getPeerIndex - %14%=
    The raw index in the peer array
getAllPeerNames - %12%=
    A line how many peers are there (plain number), then n lines with the peer names
getPeerCount - %15%=
    Plain number of how many peers are there
writeKeyValue - %7%=
  • Key as line
  • Value as line
readKeyValue - %6%=
    Key and Value in two separate following lines. If no input available, both are equal to "%%-1%%".
reopenInput - %17%=
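A few of the functions listed above could be wrapped in a peer object along the following lines. This is a sketch under stated assumptions, not the reference implementation: the class name StreamingPeer and its method names are hypothetical, each command and each argument is assumed to travel as its own newline-terminated line, and each reply is assumed to arrive as a plain line on the input stream. In-memory streams stand in for the pipes to the Java task.

```python
import io
from typing import Optional, TextIO, Tuple

NO_VALUE = "%%-1%%"  # sentinel for "nothing found", taken from the listing above


class StreamingPeer:
    """Hypothetical peer wrapper around the text protocol."""

    def __init__(self, reader: TextIO, writer: TextIO) -> None:
        self._reader = reader
        self._writer = writer

    def _write_line(self, text: str) -> None:
        self._writer.write(text + "\n")
        self._writer.flush()

    def _read_line(self) -> str:
        raw = self._reader.readline()
        if raw == "":
            raise EOFError("stream closed by the other side")
        return raw.rstrip("\n")

    def sync(self) -> None:
        self._write_line("%11%=")
        # Block until the special ACK arrives on the next line.
        reply = self._read_line()
        if reply != "%11%=_SUCCESS":
            raise RuntimeError("unexpected reply to sync: " + reply)

    def get_current_message(self) -> Optional[str]:
        self._write_line("%8%=")
        reply = self._read_line()
        return None if reply == NO_VALUE else reply

    def get_message_count(self) -> int:
        self._write_line("%9%=")
        return int(self._read_line())

    def write_key_value(self, key: str, value: str) -> None:
        self._write_line("%7%=")
        self._write_line(key)    # key as line
        self._write_line(value)  # value as line

    def read_key_value(self) -> Optional[Tuple[str, str]]:
        self._write_line("%6%=")
        key = self._read_line()
        value = self._read_line()
        if key == NO_VALUE and value == NO_VALUE:
            return None  # no input available
        return key, value


# Usage with canned replies in place of a real Java task.
fake_java = io.StringIO("%11%=_SUCCESS\n%%-1%%\n3\nk1\nv1\n")
sent = io.StringIO()
peer = StreamingPeer(fake_java, sent)
peer.sync()
first_message = peer.get_current_message()
message_count = peer.get_message_count()
pair = peer.read_key_value()
```

Passing the streams into the constructor keeps the sketch testable. A real child process would presumably hand in whatever streams connect it to the Java task.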

...

Closing Sequences

...

To determine whether the child process finished its computations successfully, a closing sequence is expected after you have acknowledged your cleanup. The closing sequence consists of sending the TASK_DONE and DONE commands to the Java process. After receiving them, the Java process does its normal cleanup and finishes the task itself. Once you have sent these codes, you can shut down your process gracefully by letting it exit with a zero status.
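The closing sequence might look like this in Python. The function name send_closing_sequence is hypothetical. The sketch assumes that TASK_DONE (22) and DONE (21) are framed like the other commands on this page ("%<op code>%=" followed by a newline) and are sent in the order in which the text names them.

```python
import io
from typing import TextIO

TASK_DONE = 22  # op codes from the table on this page
DONE = 21


def send_closing_sequence(out: TextIO) -> None:
    # Assumption: same "%<op code>%=" line framing as the other commands.
    for op_code in (TASK_DONE, DONE):
        out.write(f"%{op_code}%=\n")
    out.flush()


# Usage with an in-memory stream in place of the pipe to the Java task.
captured = io.StringIO()
send_closing_sequence(captured)
closing_lines = captured.getvalue()
```

After the sequence has been sent, the child can end with a zero status, for example by calling sys.exit(0) or by returning normally from its main function.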

...

Congratulations, you are now able to implement Hama Streaming in other languages.

Appendix

...

...

Known implementations

...

...


Running user code in the Python environment

...

In the Python runtime, you can pass a .py filename to the BSPRunner as an argument. Using import, you can import the module, get the class via getattr, and obtain an instance by instantiating this class.
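The loading step described above can be sketched as follows. This is an illustration, not the actual BSPRunner code: the function name load_user_class is hypothetical, and the sketch assumes that the user class has the same name as the file unless a class name is given explicitly.

```python
import importlib
import os
import sys
import tempfile
from typing import Optional


def load_user_class(py_path: str, class_name: Optional[str] = None) -> type:
    """Import the module behind py_path and return the class found in it."""
    directory, filename = os.path.split(os.path.abspath(py_path))
    module_name = os.path.splitext(filename)[0]
    if directory not in sys.path:
        sys.path.insert(0, directory)  # make the user's file importable
    importlib.invalidate_caches()      # pick up files created at runtime
    module = importlib.import_module(module_name)
    # Assumption: the class is named after the file if no name is given.
    return getattr(module, class_name or module_name)


# Usage: write a tiny user module to a temporary directory and load it.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "HelloSketchBSP.py")
with open(demo_path, "w") as handle:
    handle.write(
        "class HelloSketchBSP:\n"
        "    def bsp(self, peer):\n"
        "        return 'bsp called'\n"
    )

user_class = load_user_class(demo_path)
instance = user_class()             # instantiate the user's class
result = instance.bsp(None)         # a real runner would pass the peer here
```

importlib.import_module is used here in place of a bare import statement because the module name is only known at runtime.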

...