Compare between compressed and uncompressed
- TODO: randbench, and PageRank.
Compare between Hadoop RPC and Avro
Messenger |
Physical Nodes |
Tasks |
MSG bytes |
MSG exchanges per Superstep |
Supersteps |
Job Execution Time |
Hadoop RPC |
16 |
100 |
16 |
10 million |
32 |
168.571 seconds |
Avro RPC |
16 |
100 |
16 |
10 million |
32 |
159.573 seconds |
Hadoop RPC |
16 |
100 |
16 |
10,000 |
2048 |
711.924 seconds |
Avro RPC |
16 |
100 |
16 |
10,000 |
2048 |
735.994 seconds |
PageRank (PR)
- Experimental environments
- One rack (9 nodes 144 cores) cluster
- 10G network
- Hadoop 0.20.2
- Hama 0.4 TRUNK r1235586.
- MR based Random Graph Generator (usable for 0.5.0)
- Task and data partition based on hashing of vertextID in graph and size of input data.
Web pages (10 anchors per page) |
Tasks |
Supersteps |
Job Execution Time |
10 million |
17 |
25 |
2596.858 seconds |
10 million |
90 |
25 |
551.194 seconds |
20 million |
35 |
25 |
2349.266 seconds |
20 million |
90 |
25 |
1122.242 seconds |
30 million |
54 |
25 |
2636.32 seconds |
30 million |
90 |
25 |
1629.504 seconds |
Compare with old version
Version |
Physical Nodes |
Tasks |
MSG bytes |
MSG exchanges per Superstep |
Supersteps |
Job Execution Time |
Hama 0.2-incubating |
16 |
16 |
16 |
16000 |
2048 |
2030.187 seconds |
Hama 0.4-SNAPSHOT |
16 |
160 |
16 |
160000 |
2048 |
1060.434 seconds |
K-Means Clustering
- Experimental environments
- One rack (16 nodes 256 cores) cluster
- 10G network
- Hadoop 0.20.2
- Hama 0.4 TRUNK r1222075.
- Job Execution Time includes the generation of a random dataset of N points and assigning the first k as initial centers.
- Block partitioning has been used.
N |
k |
Dimension |
Max Iteration |
Tasks |
Job Execution Time |
1 million |
10 |
2 |
10 |
1 |
21.863 seconds |
10 million |
10 |
2 |
10 |
5 |
55.339 seconds |
10 million |
10 |
2 |
10 |
5 |
45,614 seconds |
10 million |
10 |
2 |
10 |
20 |
36,156 seconds |
10 million |
10 |
2 |
10 |
40 |
33,199 seconds |
100 million |
10 |
2 |
10 |
43 |
81.826 seconds |
200 million |
10 |
2 |
10 |
85 |
127.026 seconds |
200 million |
10 |
2 |
10 |
85 |
236,682 seconds |
N |
k |
Dimension |
Max Iteration |
Tasks |
Job Execution Time |
1 million |
10 |
10 |
20 |
2 |
44.56 seconds |
10 million |
10 |
10 |
20 |
14 |
69.842 seconds |
Single Shortest Path Problem
- Experimental environments
- One rack (16 nodes 256 cores) cluster
- 10G network
- Hadoop 0.20.2
- Hama 0.4 TRUNK r1213634.
- MR based Random Graph Generator Random Graph Generator for 0.5.0
- Task and data partition based on hashing of vertextID in graph and size of input data.
http://4.bp.blogspot.com/-nsYLwF_l3-c/TucnAg0GBCI/AAAAAAAAAU4/VSeUosa2Q58/s1600/scr.png
Vertices (x10 edges) |
Tasks |
Supersteps |
Job Execution Time |
10 million |
6 |
5423 |
656.393 seconds |
20 million |
12 |
2231 |
449.542 seconds |
30 million |
18 |
4398 |
886.845 seconds |
40 million |
24 |
5432 |
1112.912 seconds |
50 million |
30 |
10747 |
2079.262 seconds |
60 million |
36 |
8158 |
1754.935 seconds |
70 million |
42 |
20634 |
4325.141 seconds |
80 million |
48 |
14356 |
3236.194 seconds |
90 million |
54 |
11480 |
2785.996 seconds |
100 million |
60 |
7679 |
2169.528 seconds |
Random Communication Benchmark
4 racks
- 1024 VM nodes (1024 cores)
- 10G network
Hama 0.4
- Work in progress
1 rack
- 16 nodes (256 cores)
- 10G network
Hama 0.4 (r.1177507)
- 160 Tasks (10 tasks per node)
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
16 bytes |
10,000 |
32 |
63.875 seconds |
16 bytes |
20,000 |
32 |
81.76 seconds |
16 bytes |
30,000 |
32 |
102.879 seconds |
16 bytes |
40,000 |
32 |
117.783 seconds |
16 bytes |
50,000 |
32 |
129.778 seconds |
16 bytes |
60,000 |
32 |
147.876 seconds |
16 bytes |
70,000 |
32 |
156.896 seconds |
16 bytes |
80,000 |
32 |
184.609 seconds |
16 bytes |
90,000 |
32 |
187.035 seconds |
16 bytes |
100,000 |
32 |
199.027 seconds |
- 16 Tasks (1 task per node)
Hama 0.3
Improved compared with 0.2-incubating.
Size of each message |
Messages per superstep |
Number of supersteps |
0.2-incubating |
0.3-incubating |
16 bytes |
1000 |
512 |
507.837 seconds |
486.838 seconds |
16 bytes |
1000 |
1024 |
979.198 seconds |
874.016 seconds |
Hama 0.2
Test 1 (many small messages vs. few large messages)
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
500 bytes |
100 |
16 |
7.461 seconds |
500 bytes |
100 |
32 |
9.397 seconds |
500 bytes |
100 |
64 |
14.341 seconds |
500 bytes |
100 |
128 |
26.394 seconds |
500 bytes |
100 |
256 |
43.411 seconds |
500 bytes |
100 |
512 |
84.489 seconds |
500 bytes |
100 |
1024 |
156.581 seconds |
500 bytes |
100 |
2048 |
308.671 seconds |
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
10 kb |
5 |
16 |
13.679 seconds |
10 kb |
5 |
32 |
23.427 seconds |
10 kb |
5 |
64 |
46.398 seconds |
10 kb |
5 |
128 |
86.476 seconds |
10 kb |
5 |
256 |
171.511 seconds |
10 kb |
5 |
512 |
339.608 seconds |
10 kb |
5 |
1024 |
675.994 seconds |
10 kb |
5 |
2048 |
1872.939 seconds |
Test 2
Size of each message |
Messages per superstep |
Number of supersteps |
Job runtime |
16 bytes |
1000 |
16 |
20.365 seconds |
16 bytes |
1000 |
32 |
36.386 seconds |
16 bytes |
1000 |
64 |
67.404 seconds |
16 bytes |
1000 |
128 |
126.503 seconds |
16 bytes |
1000 |
256 |
251.602 seconds |
16 bytes |
1000 |
512 |
507.837 seconds |
16 bytes |
1000 |
1024 |
979.198 seconds |
16 bytes |
1000 |
2048 |
2030.187 seconds |