This effort is still a "work in progress". Please feel free to add comments.
HAMA is a distributed framework on Hadoop for massive matrix and graph computations. It is currently undergoing incubation at the Apache Software Foundation.
The goal of the Hama project is to provide an easy-to-use programming environment for matrix and graph computing on Hadoop, a distributed system. Our areas of focus are as follows:
The diagram below illustrates the overall architecture of HAMA.
+--------------------------------------+
|    Matrix/Graph Computing Program    |  User Applications
+--------------------------------------+

+------------------------------------------+
|      HAMA : BSP, Angrapa, ..., etc       |  Computing Engines
+------------------------------------------+

+----------------------------------------------------+
|                     ZooKeeper                      |  Distributed Locking Service
+----------------------------------------------------+

+----------------------------------------------------+
|           Hadoop : HDFS, HBase, ..., etc           |  Distributed Storage Systems
+----------------------------------------------------+
// BSP job configuration
BSPJob bsp = new BSPJob();

// Set the job name
bsp.setJobName("BSP test job");

// Set in/output path and formatter
bsp.setInputPath(conf, new Path("input path"));
bsp.setOutputPath(conf, new Path("output path"));
bsp.setInputFormat(MyInputFormat.class);
bsp.setOutputFormat(MyOutputFormat.class);

// Set the BSP code
bsp.setBSPCode(BSPProgram.class);
BSPJobClient.runJob(bsp);
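To make the BSP (Bulk Synchronous Parallel) model behind the job above more concrete, here is a minimal single-JVM sketch of two supersteps separated by a barrier. It is not the HAMA API: the class `BspSketch`, the method `sumOnMaster`, and the use of threads and `CyclicBarrier` in place of distributed peers and a ZooKeeper-backed barrier are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CyclicBarrier;

/** Hypothetical single-JVM illustration of BSP supersteps; not the HAMA API. */
public class BspSketch {

    /** Runs a two-superstep program that sums one local value per peer on peer 0. */
    public static long sumOnMaster(long[] localValues) throws InterruptedException {
        final int peers = localValues.length;
        if (peers == 0) {
            return 0L; // CyclicBarrier requires at least one party
        }

        // One inbox per peer; messages sent in a superstep are read in the next one.
        final List<Queue<Long>> inboxes = new ArrayList<>();
        for (int i = 0; i < peers; i++) {
            inboxes.add(new ConcurrentLinkedQueue<>());
        }
        final CyclicBarrier barrier = new CyclicBarrier(peers);
        final long[] result = new long[1];

        Thread[] threads = new Thread[peers];
        for (int p = 0; p < peers; p++) {
            final int id = p;
            threads[p] = new Thread(() -> {
                try {
                    // Superstep 1: local computation, then send a message to peer 0.
                    inboxes.get(0).add(localValues[id]);

                    // Barrier synchronization: no peer continues until all have sent.
                    barrier.await();

                    // Superstep 2: peer 0 consumes the messages delivered at the barrier.
                    if (id == 0) {
                        long sum = 0;
                        for (long v : inboxes.get(0)) {
                            sum += v;
                        }
                        result[0] = sum;
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumOnMaster(new long[] {1, 2, 3, 4}));
    }
}
```

The barrier is what lets peer 0 read its inbox without further coordination: every message from superstep 1 is guaranteed to have been sent before any peer enters superstep 2. In a distributed setting the same guarantee would have to come from a cluster-wide synchronization service rather than an in-process barrier.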