Important: In my view, the designs of the Superstep API based input partitioner, Spilling Queue, DiskVerticesInfo, and serial (vertices) processing are not perfect yet. More importantly, these issues are NOT urgent, so they should not block the next releases. As described in HowToCommit, please discuss and clarify your plan on dev@ and the Wiki before coding or creating separate branches in the official Hama svn repository. – Edward J. Yoon
Plans for 0.6.3 release
- Add Online CF and Semi-clustering examples
- HDFS 2.0 Compatibility
Plans for 0.6.2 release
- Add Hama Pipes
- Fix input partitioning bugs
- Improve code quality
Plans for 0.6.1 release
- Improve memory efficiency
- BSP based parallel input partitioning
Long-term plans
Future work
- Spilling Queue
- Improve real-time usage
- Evaluation of large-scale Hama cluster
- Advanced job scheduler
- Fix YARN module
[Research Task] Advanced partitioning algorithms for distributing workloads of graph or matrix computations
- Related to http://wiki.apache.org/hama/CommunicationPatterns
- Add machine learning, matrix, and graph examples, e.g., PSVM, Semi-clustering, and classification
- Fix bugs