Architecture of Graph Module

Hama includes the Graph module for vertex-centric graph computations. Hama's Graph APIs allow you to write Google Pregel-style applications with a simple programming interface.


The Graph APIs are implemented on top of the Hama BSP framework. The module consists of three major classes: VertexInputReader, GraphJob, and GraphJobRunner.


The VertexInputReader is the user-defined interface for parsing and extracting the Vertex structure from arbitrary text and binary data. Internally, the loadVertices() method reads the records from its assigned split, converts each record into a Vertex object with the user-defined VertexInputReader.parseVertex() method, and loads the resulting Vertex objects into the in-memory vertex storage.
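The loading flow described above can be illustrated with a minimal, self-contained sketch. It does not use the Hama classes: SimpleVertex, LineVertexReader, TabSeparatedReader, and the tab-separated record layout are all hypothetical stand-ins chosen for illustration, and the actual VertexInputReader.parseVertex() signature may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class VertexLoadingSketch {

  /** Hypothetical in-memory vertex: an ID plus the IDs of its out-neighbors. */
  static final class SimpleVertex {
    final String id;
    final List<String> neighbors = new ArrayList<>();

    SimpleVertex(String id) {
      this.id = id;
    }
  }

  /** Hypothetical stand-in for a user-defined vertex reader. */
  interface LineVertexReader {
    /** Returns the parsed vertex, or null if the record should be skipped. */
    SimpleVertex parseVertex(String record);
  }

  /** Assumed record layout: "id<TAB>neighbor1<TAB>neighbor2...". */
  static final class TabSeparatedReader implements LineVertexReader {
    @Override
    public SimpleVertex parseVertex(String record) {
      String trimmed = record.trim();
      if (trimmed.isEmpty()) {
        return null; // skip blank records
      }
      String[] fields = trimmed.split("\t");
      SimpleVertex vertex = new SimpleVertex(fields[0]);
      for (int i = 1; i < fields.length; i++) {
        vertex.neighbors.add(fields[i]);
      }
      return vertex;
    }
  }

  /** Reads every record of one split and keeps the parsed vertices in memory. */
  static Map<String, SimpleVertex> loadVertices(List<String> split, LineVertexReader reader) {
    Map<String, SimpleVertex> vertices = new LinkedHashMap<>();
    for (String record : split) {
      SimpleVertex vertex = reader.parseVertex(record);
      if (vertex != null) {
        vertices.put(vertex.id, vertex);
      }
    }
    return vertices;
  }

  public static void main(String[] args) {
    List<String> split = Arrays.asList("a\tb\tc", "", "b\tc", "c");
    Map<String, SimpleVertex> vertices = loadVertices(split, new TabSeparatedReader());
    for (SimpleVertex v : vertices.values()) {
      System.out.println(v.id + " -> " + v.neighbors);
    }
  }
}
```

Keeping the parsing logic behind a small interface mirrors the idea in the text: the loader only iterates over records, while the format-specific work stays in user code.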


GraphJob extends the core BSPJob interface with additional get/set methods for graph-specific configuration, such as setMaxIteration, setAggregatorClass, setVertexInputReaderClass, and setVertexOutputWriterClass. The remaining APIs, such as InputFormat and OutputFormat, are the same as in the core BSPJob interface.


The GraphJobRunner is the core internal BSP program that performs the vertex computations defined in the Vertex.compute() method and creates the output. Like other BSP programs, it consists of three methods: setup(), cleanup(), and bsp().
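As a rough illustration of this setup/bsp/cleanup structure, the following single-process sketch runs a vertex-centric "propagate the maximum value" computation. It is not the GraphJobRunner implementation: GraphRunnerSketch, MaxVertex, the message inbox, and the way the iteration cap is applied are assumptions made for this example, and the map swap at the end of each loop pass merely stands in for BSP barrier synchronization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GraphRunnerSketch {

  /** Hypothetical vertex that converges on the largest value it has seen. */
  static final class MaxVertex {
    final String id;
    final List<String> neighbors;
    int value;

    MaxVertex(String id, int value, String... neighbors) {
      this.id = id;
      this.value = value;
      this.neighbors = Arrays.asList(neighbors);
    }

    /** Adopts the largest incoming value; returns true if the value changed. */
    boolean compute(List<Integer> messages) {
      int max = value;
      for (int m : messages) {
        max = Math.max(max, m);
      }
      boolean changed = max != value;
      value = max;
      return changed;
    }
  }

  private final Map<String, MaxVertex> vertices = new LinkedHashMap<>();
  private final List<String> output = new ArrayList<>();
  private final int maxIteration; // assumed to cap the number of supersteps

  GraphRunnerSketch(int maxIteration) {
    this.maxIteration = maxIteration;
  }

  /** Loads the vertices into memory before any superstep runs. */
  void setup(List<MaxVertex> input) {
    for (MaxVertex v : input) {
      vertices.put(v.id, v);
    }
  }

  /** Runs supersteps until no messages are in flight or the cap is reached. */
  void bsp() {
    Map<String, List<Integer>> inbox = new HashMap<>();
    // Initial superstep: every vertex announces its value to its neighbors.
    for (MaxVertex v : vertices.values()) {
      send(v, inbox);
    }
    int supersteps = 1;
    while (!inbox.isEmpty() && supersteps < maxIteration) {
      Map<String, List<Integer>> next = new HashMap<>();
      for (Map.Entry<String, List<Integer>> e : inbox.entrySet()) {
        MaxVertex v = vertices.get(e.getKey());
        // Only vertices whose value changed send messages again.
        if (v != null && v.compute(e.getValue())) {
          send(v, next);
        }
      }
      inbox = next; // stands in for the barrier between supersteps
      supersteps++;
    }
  }

  private static void send(MaxVertex v, Map<String, List<Integer>> outbox) {
    for (String n : v.neighbors) {
      outbox.computeIfAbsent(n, k -> new ArrayList<>()).add(v.value);
    }
  }

  /** Writes one "id=value" line per vertex after the last superstep. */
  void cleanup() {
    for (MaxVertex v : vertices.values()) {
      output.add(v.id + "=" + v.value);
    }
  }

  static List<String> run(List<MaxVertex> input, int maxIteration) {
    GraphRunnerSketch runner = new GraphRunnerSketch(maxIteration);
    runner.setup(input);
    runner.bsp();
    runner.cleanup();
    return runner.output;
  }

  public static void main(String[] args) {
    List<MaxVertex> ring = Arrays.asList(
        new MaxVertex("a", 3, "b"),
        new MaxVertex("b", 6, "c"),
        new MaxVertex("c", 2, "a"));
    System.out.println(run(ring, 30));
  }
}
```

On the three-vertex ring in main(), every vertex ends with the value 6 once the maximum has travelled around the ring. Lowering the iteration cap stops the loop earlier and can leave some vertices with stale values.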


doInitialSuperstep and doSuperstep

Future Ideas and Challenges