Short-term Issues (for 0.2.0 release)

http://markmail.org/search/?q=hama-dev+discuss#query:hama-dev%20discuss+page:1+mid:amlvccbptom3yro3+state:results

Re-factoring issues

BSP issues


Long-term Issues

We plan to redesign Hama around the BSP (Bulk Synchronous Parallel) model and to specialize it for shared-nothing systems consisting of several thousand commodity servers, which are generally called cloud computing environments.

Why BSP?

For the graph package, BSP is also necessary for Hama to process graph data efficiently on shared-nothing architectures. The essence of graph data is the connectivity between vertices. During processing, Hama needs not only a vertex's data but also the data of its adjacent vertices. Assume that we have a graph data set partitioned into cohesive subgraphs, so that adjacent vertices are stored on the same physical storage or as near to each other as possible. Even with well-partitioned graphs, MapReduce does not exploit this locality, because it reads input data sequentially and cannot control its input data. In addition, its partitioner hashes the input data, which tends to scatter adjacent vertices across partitions. The BSP model, by contrast, enables graph processing to be performed efficiently while preserving the locality of graph data.
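The locality argument above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not Hama's API: the class name `BspLocalitySketch`, the eight-vertex path graph, the two "peers", and the min-label propagation task are all hypothetical choices made for illustration. Each loop iteration stands in for one BSP superstep (local computation, message exchange, barrier), and the code counts how many messages would cross peer boundaries under a locality-preserving range partitioning versus a hash partitioning.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: simulates BSP supersteps in one process to compare partitionings.
class BspLocalitySketch {
    // Assumed toy graph: a path 0-1-2-3-4-5-6-7 (undirected edges).
    static final int[][] EDGES = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {6, 7}};
    static final int NUM_VERTICES = 8;
    static final int NUM_PEERS = 2;

    // Hash-style partitioning: neighbouring vertex IDs land on different peers.
    static int hashOwner(int v) { return v % NUM_PEERS; }

    // Locality-preserving partitioning: contiguous ID ranges stay on one peer.
    static int rangeOwner(int v) { return v / (NUM_VERTICES / NUM_PEERS); }

    // Propagates the minimum vertex ID along edges until nothing changes and
    // returns the number of messages that crossed a peer boundary.
    static int remoteMessages(IntUnaryOperator owner) {
        int[] label = new int[NUM_VERTICES];
        for (int v = 0; v < NUM_VERTICES; v++) label[v] = v;

        int remote = 0;
        boolean changed = true;
        while (changed) { // one iteration = one superstep
            // Communication: every vertex sends its label to each neighbour.
            // (Simplification: labels are sent even if unchanged.)
            int[] inbox = new int[NUM_VERTICES];
            Arrays.fill(inbox, Integer.MAX_VALUE);
            for (int[] e : EDGES) {
                inbox[e[1]] = Math.min(inbox[e[1]], label[e[0]]);
                inbox[e[0]] = Math.min(inbox[e[0]], label[e[1]]);
                if (owner.applyAsInt(e[0]) != owner.applyAsInt(e[1])) remote += 2;
            }
            // Barrier: labels are updated only after all messages are delivered.
            changed = false;
            for (int v = 0; v < NUM_VERTICES; v++) {
                if (inbox[v] < label[v]) { label[v] = inbox[v]; changed = true; }
            }
        }
        return remote;
    }

    public static void main(String[] args) {
        System.out.println("range partitioning, remote messages: "
                + remoteMessages(BspLocalitySketch::rangeOwner));
        System.out.println("hash partitioning, remote messages: "
                + remoteMessages(BspLocalitySketch::hashOwner));
    }
}
```

On this toy graph the range partitioning cuts a single edge while the hash partitioning cuts all seven, so the hashed layout sends seven times as many remote messages for the same number of supersteps. Real graphs and real partitioners will differ; the sketch only shows why preserving locality matters once computation is organized into supersteps.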

Design Considerations

(Working)

TODO

(Working)

Idea Generating and Research Tasks

This section covers tasks that involve abstract issues related to overall architecture and performance. – edyoon