Hama (the name means "hippopotamus" in Korean) is a distributed scientific package on Hadoop for massive matrix and graph data. It is currently in incubation with Apache. The main goal of Hama is to provide computational tools for data-intensive scientific and industrial applications. It consists of two packages: the matrix package and the graph package.
The matrix package is a library of matrix operations on a Map/Reduce framework for large-scale numerical analysis and data mining tasks that need the intensive computation of matrix inversion, e.g., linear regression, PCA, and SVM. It will be useful for many scientific applications, e.g., physics computations, linear algebra, computational fluid dynamics, statistics, graphic rendering, and many more.
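As an illustration of how a matrix operation can be phrased as map and reduce steps, here is a minimal single-process sketch of matrix-vector multiplication. It is not Hama's API: the function names (`map_phase`, `reduce_phase`, `matvec`) and the `(row, col, value)` triple layout are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(entries, vector):
    """Map step: each stored entry (i, j, a_ij) emits the pair (i, a_ij * x_j)."""
    for i, j, value in entries:
        yield i, value * vector[j]


def reduce_phase(pairs):
    """Reduce step: sum the partial products that share the same row key."""
    sums = defaultdict(float)
    for row, partial in pairs:
        sums[row] += partial
    return dict(sums)


def matvec(entries, vector, n_rows):
    """Compute y = A x by chaining the map and reduce steps (hypothetical helper)."""
    totals = reduce_phase(map_phase(entries, vector))
    # Rows without stored entries contribute 0.0
    return [totals.get(i, 0.0) for i in range(n_rows)]


# The 2x3 matrix [[1, 2, 0], [0, 3, 4]] stored as (row, col, value) triples
A = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0), (1, 2, 4.0)]
x = [1.0, 1.0, 2.0]
print(matvec(A, x, n_rows=2))  # prints [3.0, 11.0]
```

In a real Map/Reduce job the emitted pairs would be shuffled across machines by row key before the reduce step; here both steps run in one process so the data flow is easy to follow.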
The graph package, Angrapa, is a large-scale graph data management framework for analytical processing. It is still an ongoing project. It will employ massive parallelism on Hadoop and aims to scale to terabytes or petabytes of graph data. Angrapa will be used in a variety of scientific and industrial areas that need to process large-scale graph data, such as data mining, machine learning, information retrieval, bioinformatics, and social networks.
- Scientific simulation and modeling
- Matrix-vector/matrix-matrix multiply
- Solving linear systems
- Scientific graphs
- Scientific Business Intelligence
- Information retrieval
- Sorting
- Finding eigenvalues and eigenvectors
- Computer graphics and computational geometry
- Matrix multiply
- Computing matrix determinants
General Information
- Hama Homepage
- Hama Architecture and 0.1 Plans – Work in progress
- Hama DSL (Domain Specific Language) in Groovy – Work in progress
- Hama Shell – Work in progress
- Presentations and Articles about Hama
- Getting Started with Hama
- Hama Mailing Lists
- Hama IRC Channel
- Hama Performance Evaluation
- PoweredBy, a list of sites and applications powered by Hama
User Documentation
- Examples
Developer Documentation
Hama is a distributed computing framework based on BSP (Bulk Synchronous Parallel) computing techniques for massive scientific computations (e.g., matrix, graph, and network computations).
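The BSP model organizes a job into supersteps, each made up of local computation, message passing, and a barrier synchronization. The sketch below simulates that pattern sequentially in plain Python for a distributed sum. It does not use Hama's API; `bsp_sum`, the choice of peer 0 as the collector, and the list-based inboxes are all assumptions made for illustration.

```python
def bsp_sum(partitions):
    """Simulate a two-superstep BSP job that sums numbers spread across peers.

    `partitions` holds one list of numbers per simulated peer.
    """
    n_peers = len(partitions)
    inboxes = [[] for _ in range(n_peers)]

    # Superstep 1, local computation: every peer sums its own partition
    # and queues a message addressed to peer 0 (the assumed collector).
    outgoing = []
    for local_data in partitions:
        partial = sum(local_data)
        outgoing.append((0, partial))

    # Barrier synchronization: queued messages are delivered only after
    # every peer has finished the superstep.
    for destination, message in outgoing:
        inboxes[destination].append(message)

    # Superstep 2: peer 0 combines the partial sums it received.
    return sum(inboxes[0])


print(bsp_sum([[1, 2, 3], [4, 5], [6]]))  # prints 21
```

On a real cluster the peers would run concurrently on different machines and the barrier would be enforced by the framework; the sequential loop here only mirrors the order in which the phases happen.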
General Information
- Hama Official Website
- Hama Official Blog
- Presentations and Articles about Hama and BSP
- PoweredBy, a list of sites and applications powered by Hama
- Hama Architecture
- BSP Programming Model
- Performance Benchmarks
User Documentation
- Getting Started with Hama
- Getting Started with Hama on Mesos
- Getting Started with Hama on YARN
- Running Hama over InfiniBand
- Hama Pipes (Native C/C++ BSP Bridge)
- Build Dynamic Graphs on Hama
- Hama Aggregators
- Command-line interface for the Hama shell script
- How to debug your own Applications
- Configuring third party JAR with native library
- FAQ list
- Hama Streaming
- Examples
- Advanced Documentation
Developer Documentation
- Guide for Hama Contributors
- Guide for Hama Committers
- Guide for Hama PMC members
- Roadmap, listing release plans
- Building, Testing, CI
- How to release
- Developer FAQ
- Hama Streaming Protocol
- Guidelines for developers to follow