...

For NoSQL table input (which supports range or random access by sorted key), partitions don't need to be rewritten. In addition, a Scanner can be used instead of the basic 'region' or 'tablet' splits to force the number of processors.

...

pre-processing step will be skipped because they support range scans.

  • The Job Scheduler assigns each Scanner or tablet, together with its partition ID, to the proper task and launches the BSP job.
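The Scanner-based assignment can be illustrated with a minimal sketch. It assumes the sorted keys can be addressed by position and cuts them into one contiguous range per task, so the number of ranges (and therefore processors) is chosen by the job rather than by the table's region or tablet layout. The names RangeScanAssignment, ScanRange and assign are hypothetical and are not part of the framework's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not the scheduler's actual code.
final class RangeScanAssignment {

    /** Stand-in for a range scanner over sorted keys: covers [start, end). */
    static final class ScanRange {
        final int partitionId; // also used as the index of the task that reads it
        final int start;       // first key position, inclusive
        final int end;         // last key position, exclusive

        ScanRange(int partitionId, int start, int end) {
            this.partitionId = partitionId;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "partition " + partitionId + " -> [" + start + ", " + end + ")";
        }
    }

    /** Cuts numKeys sorted keys into numTasks contiguous, near-equal ranges. */
    static List<ScanRange> assign(int numKeys, int numTasks) {
        List<ScanRange> ranges = new ArrayList<>();
        int base = numKeys / numTasks;   // minimum keys per task
        int extra = numKeys % numTasks;  // the first 'extra' tasks get one more key
        int start = 0;
        for (int p = 0; p < numTasks; p++) {
            int size = base + (p < extra ? 1 : 0);
            ranges.add(new ScanRange(p, start, start + size));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Force three processors over ten keys, regardless of the table layout.
        for (ScanRange r : assign(10, 3)) {
            System.out.println(r);
        }
    }
}
```

Because the ranges are contiguous in key order, no data has to be rewritten: each task only needs the start and end of its own range to open a scan.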

Partitioning internals in Graph module

The internals of the Graph module, which is implemented on top of the BSP framework, are pretty simple. Input data partitioning is done at the BSP framework level. Each GraphJobRunner processor just reads its assigned splits, parses the records into Vertex objects in memory (currently, the disk-based vertex store and sequential vertex processing are not perfect), and loads the parsed vertices into the vertex storage in the loadVertices() method. For details on the internals of a Graph job, see also Design of Graph Module.
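The loading step can be sketched roughly as follows. This is a simplified, self-contained illustration and not the GraphJobRunner code: it assumes each record of a split is a tab-separated line of the form "vertexId, neighbour, neighbour, ...", and the Vertex class and loadVertices signature shown here are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the record format and class shapes are assumptions.
final class VertexLoadingSketch {

    /** Minimal in-memory vertex: an ID plus the IDs of its neighbours. */
    static final class Vertex {
        final String id;
        final List<String> edges;

        Vertex(String id, List<String> edges) {
            this.id = id;
            this.edges = edges;
        }
    }

    /** Parses the records of one already-partitioned split into in-memory vertices. */
    static Map<String, Vertex> loadVertices(List<String> splitRecords) {
        Map<String, Vertex> vertices = new LinkedHashMap<>();
        for (String record : splitRecords) {
            if (record.trim().isEmpty()) {
                continue; // skip blank records
            }
            String[] fields = record.split("\t");
            // The first field is the vertex ID, the remaining fields are neighbours.
            List<String> edges =
                new ArrayList<>(Arrays.asList(fields).subList(1, fields.length));
            vertices.put(fields[0], new Vertex(fields[0], edges));
        }
        return vertices;
    }

    public static void main(String[] args) {
        Map<String, Vertex> loaded = loadVertices(Arrays.asList("a\tb\tc", "b\ta", "c"));
        for (Vertex v : loaded.values()) {
            System.out.println(v.id + " -> " + v.edges);
        }
    }
}
```

The sketch does no partitioning of its own, which mirrors the description above: by the time a processor loads vertices, the framework has already decided which records belong to it.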

Create your own Partitioner

...