Differences between revisions 57 and 58
Revision 57 as of 2014-03-14 00:52:03
Size: 9154
Editor: edwardyoon
Comment:
Revision 58 as of 2014-03-14 12:21:09
Size: 8425
Editor: edwardyoon
Comment:
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
== PageRank Performance 0.6.4 vs 0.6.3 ==

 * NOTE: PageRank is an good example of communication-intensive processing
 * 2 nodes cluster (8 cores 16GB)
 * 6 tasks (opt: -Xmx2048m)
 * 40,000 vertices
 * Memory based Queue and Vertices storage (!ListVerticesInfo) was used (for both version)

||<rowbgcolor="#DDDDDD"> Edges per Vertex || 0.7.0-SNAPSHOT || Hama 0.6.3 ||
|| 500 || 359.518 seconds || 284.075 seconds ||
|| 600 || 440.835 seconds || 347.325 seconds ||
|| 700 || 521.53 seconds || 368.14 seconds ||
|| 800 || 605.78 seconds || Out Of Memory ||
|| 900 || 705.896 seconds || Out Of Memory ||
|| 1000 || 794.966 seconds || Out Of Memory ||
|| 1100 || 879.476 seconds || Out Of Memory ||
|| 1200 || 1032.919 seconds || Out Of Memory ||
|| 1300 || 1173.371 seconds || Out Of Memory ||
|| 1400 || 1058.794 seconds || Out Of Memory ||
== PageRank Benchmark test 0.6.4 vs 0.6.3 ==

See http://blog.datasayer.com/2014/03/apache-hama-benchmark-test-at-lg-cns.html

PageRank Benchmark test 0.6.4 vs 0.6.3

See http://blog.datasayer.com/2014/03/apache-hama-benchmark-test-at-lg-cns.html

Hama 0.6 RC3

  • Oracle BDA
  • 180 Tasks
  • SDP protocol disabled

Conditions

Execution Time

bench

16 10000 32

21.278 seconds

bench

16 100000 32

72.318 seconds

sssp

1 billion edges

473.52 seconds

pagerank

Google web graph dataset

9.15 seconds

  • HAMA-664

Conditions

Execution Time

bench

16 10000 32

15.266 seconds

bench

16 100000 32

51.293 seconds

Compressor Comparison Table

  • SSSP job on Random Graph
  • 153 Tasks
  • 2 Oracle BDAs cluster

Compressor

100 million edges

200 million edges

400 million edges

800 million edges

None

96.424 seconds

129.397 seconds

159.408 seconds

336.502 seconds

Bzip2Compressor

112.592 seconds

198.411 seconds

387.558 seconds

672.663 seconds

  • The first major part of the problem is inefficient calculation of bundle size. (CompressionUtil.getCompressionRatio() method)

PageRank

  • 1 Rack Oracle BDA
  • The dataset contains 5,716,808 pages and 130,160,392 links and is unzipped ~1gb large.

Tasks

Job Execution Time

170

18.465 seconds

Compare with old version

  • 1 Rack Oracle BDA was used.

Version

Physical Nodes

Tasks

MSG bytes

MSG exchanges per Superstep

Supersteps

Job Execution Time

Hama 0.5

18

162

16

1,600,000

32

93.322 seconds

Hama 0.4-incubating

18

162

16

1,600,000

32

288.435 seconds

Compare between Hadoop RPC and Avro

Messenger

Physical Nodes

Tasks

MSG bytes

MSG exchanges per Superstep

Supersteps

Job Execution Time

Hadoop RPC

16

100

16

10 million

32

168.571 seconds

Avro RPC

16

100

16

10 million

32

159.573 seconds

Hadoop RPC

16

100

16

10,000

2048

711.924 seconds

Avro RPC

16

100

16

10,000

2048

735.994 seconds

PageRank (PR)

  • Experimental environments
    • One rack (9 nodes 144 cores) cluster
    • 10G network
    • Hadoop 0.20.2
    • Hama 0.4 TRUNK r1235586.
    • MR based Random Graph Generator for Pagerank

    • Task and data partition based on hashing of vertextID in graph and size of input data.

Web pages (10 anchors per page)

Tasks

Supersteps

Job Execution Time

10 million

17

25

2596.858 seconds

10 million

90

25

551.194 seconds

20 million

35

25

2349.266 seconds

20 million

90

25

1122.242 seconds

30 million

54

25

2636.32 seconds

30 million

90

25

1629.504 seconds

Compare with old version

Version

Physical Nodes

Tasks

MSG bytes

MSG exchanges per Superstep

Supersteps

Job Execution Time

Hama 0.4-SNAPSHOT

16

160

16

160000

2048

1060.434 seconds

Hama 0.2-incubating

16

16

16

16000

2048

2030.187 seconds

K-Means Clustering

  • Experimental environments
    • One rack (16 nodes 256 cores) cluster
    • 10G network
    • Hadoop 0.20.2
    • Hama 0.4 TRUNK r1222075.
    • Job Execution Time includes the generation of a random dataset of N points and assigning the first k as initial centers.
    • Iterated input vectors on disk and kept centers in main memory.
    • Block partitioning has been used.

N

k

Dimension

Max Iteration

Tasks

Job Execution Time

1 million

10

2

10

1

21.863 seconds

10 million

10

2

10

5

55.339 seconds

10 million

10

2

10

5

45,614 seconds

10 million

10

2

10

20

36,156 seconds

10 million

10

2

10

40

33,199 seconds

100 million

10

2

10

43

81.826 seconds

200 million

10

2

10

85

127.026 seconds

200 million

10

2

10

85

236,682 seconds

http://people.apache.org/~tjungblut/downloads/benchmark_kmeans.png

N

k

Dimension

Max Iteration

Tasks

Job Execution Time

1 million

10

10

20

2

44.56 seconds

10 million

10

10

20

14

69.842 seconds

Single Shortest Path Problem

http://4.bp.blogspot.com/-nsYLwF_l3-c/TucnAg0GBCI/AAAAAAAAAU4/VSeUosa2Q58/s1600/scr.png

Vertices (x10 edges)

Tasks

Supersteps

Job Execution Time

10 million

6

5423

656.393 seconds

20 million

12

2231

449.542 seconds

30 million

18

4398

886.845 seconds

40 million

24

5432

1112.912 seconds

50 million

30

10747

2079.262 seconds

60 million

36

8158

1754.935 seconds

70 million

42

20634

4325.141 seconds

80 million

48

14356

3236.194 seconds

90 million

54

11480

2785.996 seconds

100 million

60

7679

2169.528 seconds

Random Communication Benchmark

4 racks

  • 1024 VM nodes (1024 cores)
  • 10G network

Hama 0.4

  • Work in progress

1 rack

  • 16 nodes (256 cores)
  • 10G network

Hama 0.4 (r.1177507)

  • 160 Tasks (10 tasks per node)

Size of each message

Messages per superstep

Number of supersteps

Job runtime

16 bytes

10,000

32

63.875 seconds

16 bytes

20,000

32

81.76 seconds

16 bytes

30,000

32

102.879 seconds

16 bytes

40,000

32

117.783 seconds

16 bytes

50,000

32

129.778 seconds

16 bytes

60,000

32

147.876 seconds

16 bytes

70,000

32

156.896 seconds

16 bytes

80,000

32

184.609 seconds

16 bytes

90,000

32

187.035 seconds

16 bytes

100,000

32

199.027 seconds


  • 16 Tasks (1 task per node)

Hama 0.3

Improved compared with 0.2-incubating.

Size of each message

Messages per superstep

Number of supersteps

0.2-incubating

0.3-incubating

16 bytes

1000

512

507.837 seconds

486.838 seconds

16 bytes

1000

1024

979.198 seconds

874.016 seconds

Hama 0.2

Test 1 (many small messages vs. few large messages)

Size of each message

Messages per superstep

Number of supersteps

Job runtime

500 bytes

100

16

7.461 seconds

500 bytes

100

32

9.397 seconds

500 bytes

100

64

14.341 seconds

500 bytes

100

128

26.394 seconds

500 bytes

100

256

43.411 seconds

500 bytes

100

512

84.489 seconds

500 bytes

100

1024

156.581 seconds

500 bytes

100

2048

308.671 seconds

Size of each message

Messages per superstep

Number of supersteps

Job runtime

10 kb

5

16

13.679 seconds

10 kb

5

32

23.427 seconds

10 kb

5

64

46.398 seconds

10 kb

5

128

86.476 seconds

10 kb

5

256

171.511 seconds

10 kb

5

512

339.608 seconds

10 kb

5

1024

675.994 seconds

10 kb

5

2048

1872.939 seconds

Test 2

Size of each message

Messages per superstep

Number of supersteps

Job runtime

16 bytes

1000

16

20.365 seconds

16 bytes

1000

32

36.386 seconds

16 bytes

1000

64

67.404 seconds

16 bytes

1000

128

126.503 seconds

16 bytes

1000

256

251.602 seconds

16 bytes

1000

512

507.837 seconds

16 bytes

1000

1024

979.198 seconds

16 bytes

1000

2048

2030.187 seconds

Benchmarks (last edited 2014-03-14 12:21:09 by edwardyoon)