NOTE: Hama allows you to generate a random graph data in JSON format, which is easily compatible with other similar systems. Try it on your cluster!
Vertices |
The max-edges per vertex |
Tasks |
Hama-0.7.0 |
Giraph-1.2.0 |
300000 |
1000 |
30 |
70.425 seconds |
154 seconds |
300000 |
900 |
30 |
64.388 seconds |
141 seconds |
300000 |
800 |
30 |
58.355 seconds |
130 seconds |
300000 |
700 |
30 |
49.309 seconds |
114 seconds |
300000 |
600 |
30 |
44.334 seconds |
104 seconds |
300000 |
500 |
30 |
40.321 seconds |
92 seconds |
300000 |
400 |
30 |
37.304 seconds |
81 seconds |
300000 |
300 |
30 |
31.248 seconds |
72 seconds |
Vertices |
The max-edges per vertex |
Tasks |
Hama-0.7.0 |
Giraph-1.2.0 |
30000 |
10000 |
20 |
72.485 seconds |
130 seconds |
30000 |
9000 |
20 |
70.399 seconds |
124 seconds |
30000 |
8000 |
20 |
58.388 seconds |
113 seconds |
30000 |
7000 |
20 |
49.39 seconds |
104 seconds |
30000 |
6000 |
20 |
47.153 seconds |
89 seconds |
30000 |
5000 |
20 |
40.34 seconds |
79 seconds |
30000 |
4000 |
20 |
37.292 seconds |
72 seconds |
30000 |
3000 |
20 |
31.483 seconds |
61 seconds |
What are the major changes from the last release?
The major improvement changes are in the queue and messaging systems. We now use own outgoing/incoming message manager instead of using Java's built-in queues. It stores messages in serialized form in a set of bundles (or a single bundle) to reduce the memory usage and RPC overhead. Another important improvement is the enhanced graph package. Instead of sending each message individually, we package the messages per vertex and send a packaged message to their assigned destination nodes. The thread-pool executor service also used for each vertex computation. With this, we achieve better performance.
|
Input Size (Bytes) |
Total Tasks |
Total Edges |
Execution Time (0.6.4) |
Execution Time (0.7.0-SNAPSHOT) |
pagerank |
905310 |
6 |
200,000 |
16.71 seconds |
13.598 secs |
pagerank |
7065912 |
6 |
2,000,000 |
19.514 seconds |
16.524 seconds |
pagerank |
857467170 |
6 |
200,000,000 |
1133.622 seconds |
202.996 seconds |
pagerank |
857467170 |
10 |
200,000,000 |
770.589 seconds |
145.751 seconds |
pagerank |
1437500140 |
10 |
300,000,000 |
1337.851 seconds |
229.847 seconds |
See http://blog.datasayer.com/2014/03/apache-hama-benchmark-test-at-lg-cns.html
|
Conditions |
Execution Time |
bench |
16 10000 32 |
21.278 seconds |
bench |
16 100000 32 |
72.318 seconds |
sssp |
1 billion edges |
473.52 seconds |
pagerank |
Google web graph dataset |
9.15 seconds |
|
Conditions |
Execution Time |
bench |
16 10000 32 |
15.266 seconds |
bench |
16 100000 32 |
51.293 seconds |
Compressor |
100 million edges |
200 million edges |
400 million edges |
800 million edges |
None |
96.424 seconds |
129.397 seconds |
159.408 seconds |
336.502 seconds |
112.592 seconds |
198.411 seconds |
387.558 seconds |
672.663 seconds |
Tasks |
Job Execution Time |
170 |
18.465 seconds |
Version |
Physical Nodes |
Tasks |
MSG bytes |
MSG exchanges per Superstep |
Supersteps |
Job Execution Time |
Hama 0.5 |
18 |
162 |
16 |
1,600,000 |
32 |
93.322 seconds |
Hama 0.4-incubating |
18 |
162 |
16 |
1,600,000 |
32 |
288.435 seconds |
Messenger |
Physical Nodes |
Tasks |
MSG bytes |
MSG exchanges per Superstep |
Supersteps |
Job Execution Time |
Hadoop RPC |
16 |
100 |
16 |
10 million |
32 |
168.571 seconds |
Avro RPC |
16 |
100 |
16 |
10 million |
32 |
159.573 seconds |
Hadoop RPC |
16 |
100 |
16 |
10,000 |
2048 |
711.924 seconds |
Avro RPC |
16 |
100 |
16 |
10,000 |
2048 |
735.994 seconds |
Web pages (10 anchors per page) |
Tasks |
Supersteps |
Job Execution Time |
10 million |
17 |
25 |
2596.858 seconds |
10 million |
90 |
25 |
551.194 seconds |
20 million |
35 |
25 |
2349.266 seconds |
20 million |
90 |
25 |
1122.242 seconds |
30 million |
54 |
25 |
2636.32 seconds |
30 million |
90 |
25 |
1629.504 seconds |
Version |
Physical Nodes |
Tasks |
MSG bytes |
MSG exchanges per Superstep |
Supersteps |
Job Execution Time |
Hama 0.4-SNAPSHOT |
16 |
160 |
16 |
160000 |
2048 |
1060.434 seconds |
Hama 0.2-incubating |
16 |
16 |
16 |
16000 |
2048 |
2030.187 seconds |
N |
k |
Dimension |
Max Iteration |
Tasks |
Job Execution Time |
1 million |
10 |
2 |
10 |
1 |
21.863 seconds |
10 million |
10 |
2 |
10 |
5 |
55.339 seconds |
10 million |
10 |
2 |
10 |
5 |
45,614 seconds |
10 million |
10 |
2 |
10 |
20 |
36,156 seconds |
10 million |
10 |
2 |
10 |
40 |
33,199 seconds |
100 million |
10 |
2 |
10 |
43 |
81.826 seconds |
200 million |
10 |
2 |
10 |
85 |
127.026 seconds |
200 million |
10 |
2 |
10 |
85 |
236,682 seconds |
http://people.apache.org/~tjungblut/downloads/benchmark_kmeans.png
N |
k |
Dimension |
Max Iteration |
Tasks |
Job Execution Time |
1 million |
10 |
10 |
20 |
2 |
44.56 seconds |
10 million |
10 |
10 |
20 |
14 |
69.842 seconds |
http://4.bp.blogspot.com/-nsYLwF_l3-c/TucnAg0GBCI/AAAAAAAAAU4/VSeUosa2Q58/s1600/scr.png
Vertices (x10 edges) |
Tasks |
Supersteps |
Job Execution Time |
10 million |
6 |
5423 |
656.393 seconds |
20 million |
12 |
2231 |
449.542 seconds |
30 million |
18 |
4398 |
886.845 seconds |
40 million |
24 |
5432 |
1112.912 seconds |
50 million |
30 |
10747 |
2079.262 seconds |
60 million |
36 |
8158 |
1754.935 seconds |
70 million |
42 |
20634 |
4325.141 seconds |
80 million |
48 |
14356 |
3236.194 seconds |
90 million |
54 |
11480 |
2785.996 seconds |
100 million |
60 |
7679 |
2169.528 seconds |
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
16 bytes |
10,000 |
32 |
63.875 seconds |
16 bytes |
20,000 |
32 |
81.76 seconds |
16 bytes |
30,000 |
32 |
102.879 seconds |
16 bytes |
40,000 |
32 |
117.783 seconds |
16 bytes |
50,000 |
32 |
129.778 seconds |
16 bytes |
60,000 |
32 |
147.876 seconds |
16 bytes |
70,000 |
32 |
156.896 seconds |
16 bytes |
80,000 |
32 |
184.609 seconds |
16 bytes |
90,000 |
32 |
187.035 seconds |
16 bytes |
100,000 |
32 |
199.027 seconds |
Improved compared with 0.2-incubating.
Size of each message |
Messages per superstep |
Number of supersteps |
0.2-incubating |
0.3-incubating |
16 bytes |
1000 |
512 |
507.837 seconds |
486.838 seconds |
16 bytes |
1000 |
1024 |
979.198 seconds |
874.016 seconds |
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
500 bytes |
100 |
16 |
7.461 seconds |
500 bytes |
100 |
32 |
9.397 seconds |
500 bytes |
100 |
64 |
14.341 seconds |
500 bytes |
100 |
128 |
26.394 seconds |
500 bytes |
100 |
256 |
43.411 seconds |
500 bytes |
100 |
512 |
84.489 seconds |
500 bytes |
100 |
1024 |
156.581 seconds |
500 bytes |
100 |
2048 |
308.671 seconds |
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
10 kb |
5 |
16 |
13.679 seconds |
10 kb |
5 |
32 |
23.427 seconds |
10 kb |
5 |
64 |
46.398 seconds |
10 kb |
5 |
128 |
86.476 seconds |
10 kb |
5 |
256 |
171.511 seconds |
10 kb |
5 |
512 |
339.608 seconds |
10 kb |
5 |
1024 |
675.994 seconds |
10 kb |
5 |
2048 |
1872.939 seconds |
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
16 bytes |
1000 |
16 |
20.365 seconds |
16 bytes |
1000 |
32 |
36.386 seconds |
16 bytes |
1000 |
64 |
67.404 seconds |
16 bytes |
1000 |
128 |
126.503 seconds |
16 bytes |
1000 |
256 |
251.602 seconds |
16 bytes |
1000 |
512 |
507.837 seconds |
16 bytes |
1000 |
1024 |
979.198 seconds |
16 bytes |
1000 |
2048 |
2030.187 seconds |