...
- 2-node cluster
- Quanta S210 X22RQ-3
- CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x 2
- MEM: 192 GB
- HDD: 3TB x 1 (HW RAID, Physical HDD x 2)
- NIC: 1G x 2, 10G x 4
- Same input data (randomly generated JSON-format text files) was used
- Same child opts (-Xmx4048m)
- Same number of tasks was used
- Same hash partitioner was used
NOTE: Hama allows you to generate random graph data in JSON format, which is easily compatible with other similar systems. Try it on your cluster!
Total number of edges | Hama-0.7.0 | Giraph-1.2.0 |
300M | 115 seconds | 130 seconds |
270M | 106 seconds | 124 seconds |
240M | 100 seconds | 113 seconds |
210M | 97 seconds | 104 seconds |
180M | 88 82 seconds | 89 seconds |
150M | 67 seconds | 79 seconds |
120M | 61 seconds | 72 seconds |
90M | 52 seconds | 61 seconds |
60M | 40 seconds | 50 seconds |
30M | 28 seconds | 40 seconds |
What are the major changes from the last release?
The major improvements are in the queue and messaging systems. We now use our own outgoing/incoming message manager instead of Java's built-in queues. It stores messages in serialized form in a set of bundles (or a single bundle) to reduce memory usage and RPC overhead. The Kryo serializer is used to serialize objects more quickly. Another important improvement is the enhanced graph package. Instead of sending each message individually, we package the messages per vertex and send the packaged message to its assigned destination node. A thread-pool executor service is also used for each vertex computation. Together, these changes give us better performance.
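To make the bundling idea concrete, here is a minimal sketch of how outgoing messages could be serialized as they are produced and grouped into one bundle per destination. It is not Hama's actual implementation: the class `MessageBundler` and its methods are hypothetical names, the message type (a vertex ID plus a double value) is an assumption, and plain `java.io` data streams stand in for Kryo so that the example has no external dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: keeps outgoing messages serialized, one bundle per destination peer. */
class MessageBundler {
    private final ByteArrayOutputStream[] buffers; // serialized messages, indexed by peer
    private final DataOutputStream[] writers;
    private final int[] counts;                    // number of messages buffered per peer

    MessageBundler(int numPeers) {
        buffers = new ByteArrayOutputStream[numPeers];
        writers = new DataOutputStream[numPeers];
        counts = new int[numPeers];
        for (int i = 0; i < numPeers; i++) {
            buffers[i] = new ByteArrayOutputStream();
            writers[i] = new DataOutputStream(buffers[i]);
        }
    }

    /** Hash-partitions the target vertex ID to choose a destination peer. */
    int peerFor(String vertexId) {
        return Math.floorMod(vertexId.hashCode(), buffers.length);
    }

    /** Serializes the message right away and appends it to the destination's bundle. */
    void add(String targetVertexId, double value) {
        int peer = peerFor(targetVertexId);
        try {
            writers[peer].writeUTF(targetVertexId);
            writers[peer].writeDouble(value);
            counts[peer]++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds the payload for one peer: a message count followed by the serialized messages. */
    byte[] bundleFor(int peer) {
        try {
            writers[peer].flush();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(counts[peer]);
            buffers[peer].writeTo(data);
            data.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Receiver side: decodes a bundle back into "vertexId=value" strings. */
    static List<String> unpack(byte[] bundle) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bundle));
            int n = in.readInt();
            List<String> messages = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                messages.add(in.readUTF() + "=" + in.readDouble());
            }
            return messages;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MessageBundler bundler = new MessageBundler(2);
        bundler.add("v1", 0.25);
        bundler.add("v2", 0.5);
        bundler.add("v1", 0.75);
        for (int peer = 0; peer < 2; peer++) {
            System.out.println("peer " + peer + ": " + unpack(bundler.bundleFor(peer)));
        }
    }
}
```

In this sketch each destination receives a single byte array per flush, however many messages were queued for it, which is where the saving in RPC calls would come from. Keeping only the serialized bytes, rather than one Java object per message, is the part that would reduce memory usage.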
...