User-defined partitioning

In the Hama BSP computing framework, the partitioner determines how the slices of input data are distributed among the computing workers (BSP processors) of a Bulk Synchronous Parallel job, which is what lets the computation scale. Remember that, unlike Map/Reduce's partition function, it is not related to output collection. Unlike the MapReduce data processing model, many scientific algorithms based on the message-passing Bulk Synchronous Parallel model require that a processor obtain “nearby or related” data from other processors in order to complete its processing. In this case, processors use the partition function to determine their communication partners, or neighbors.
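As a rough illustration, a hash-based partition function could look like the sketch below. The interface `SimplePartitioner`, the class `HashSketch`, and the peer names are hypothetical names introduced for this example; they are not copied from the Hama API, and the hashing scheme is only one possible choice.

```java
import java.util.Arrays;
import java.util.List;

public class HashPartitionSketch {

    // Hypothetical interface for illustration only (an assumption, not the Hama API).
    interface SimplePartitioner<K, V> {
        // Returns the index of the worker that should own the given record.
        int getPartition(K key, V value, int numTasks);
    }

    // Assigns a record to a worker based on the hash code of its key.
    static class HashSketch<K, V> implements SimplePartitioner<K, V> {
        @Override
        public int getPartition(K key, V value, int numTasks) {
            // Clear the sign bit so the index is never negative.
            return (key.hashCode() & Integer.MAX_VALUE) % numTasks;
        }
    }

    public static void main(String[] args) {
        // Hypothetical peer names; a real job would get these from the framework.
        List<String> peers = Arrays.asList("peer-0", "peer-1", "peer-2", "peer-3");
        SimplePartitioner<String, Integer> partitioner = new HashSketch<>();

        // The same function that placed the input data also tells a worker
        // which peer owns a given key, that is, whom to send a message to.
        for (String key : Arrays.asList("a", "b", "c")) {
            int owner = partitioner.getPartition(key, 0, peers.size());
            System.out.println(key + " is owned by " + peers.get(owner));
        }
    }
}
```

Because every worker evaluates the same deterministic function, any worker can locate the owner of a key without asking a central coordinator.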

Internally, input data partitioning proceeds in the following sequence:

...