Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • The SSSP (abbr. for Single Source Shortest Paths) algorithm described in the Google Pregel paper was used.
  • Introduces IO usage, partitioning based on hashing of vertextID, and collective communication.

Short summary of the algorithm

  • Initialize all vertices' cost to reach it to INFINITY, just the start vertex will have cost 0.
  • Initially send the new cost to all adjacent vertex containing the new cost plus the edge weight between them
  • Reviewing messages: if the cost coming from a message is lower than the actual cost, update the cost and send a message to the adjacent vertices, containing the new cost plus the edge weight between them (similar to the last step)
  • Repeat the last step until no updates can be made anymore.

Usage

No Format
bin/hama jar ../hama-0.4.0-examples.jar sssp <start vertex> <input path> <output path> [number of tasks]

...

The file that Hama can successfully parse is a TextFile that has the following layout:

No Format
Berlin	Frankfurt\tFrankfurt:20	Munich\tMunich:50
Frankfurt	Berlin\tBerlin:20	Munich\tMunich:10
Munich

This piece of text will adjacent Berlin to Frankfurt (with edge weight of 20) and Munich (with edge weight of 10). Munich is a dangling node, it has no outlinks. As you can see a vertex is always on the leftmost side (we call it the key-site), and the outlinks (to which other vertex it is connected to) are seperated by tabs (\t) as the following elements. SSSP needs edge weights, you must provide them by separating the name of the vertex with a colon ":". The weight must be an integer.

...