TaskRunnerLaunchImpl class

The main objective of TaskRunnerLaunchImpl is to launch TaskRunner through Yarn's ContainerManager.

The TaskRunnerLaunchImpl class handles two events CONTAINER_REMOTE_LAUNCH and CONTAINER_REMOTE_CLEANUP, which lead to launching a TaskRunner and killing a running TaskRunner respectively. These events come from SubQuery::allocateContainers(SubQuery) method.


In TaskRunner, a Task is created from the response (QueryUnitRequest) of 'getTask()' rpc call. Task contains three main attributes as follows:

Initially, a Task registers fetch URIs to fetchLauncher (ExecutorService) in order to pull data, and it restore a logical plan. Then, a physical operator tree is created from the logical plan via PhysicalPlannerImpl. Finally, Task::run() method is called, and then Task's status is changed to RUNNING.

Also, a running Task periodically sends a statusUpdate report to TaskRunnerListener via RPC. A StatusUpdate report includes a status and some statistics of the running task. If the running task is failed, TaskRunner sends a message 'fatal' to TaskRunnerListenerImpl. If the task is completed, TaskRunner sends a message 'done' to TaskRunnerListenerImpl.


For each execution block, TaskRunner is launched by Yarn's ContainerManager. TaskRunner processes a Task at a time. If TaskRunner has an available slot, it sends 'getTask' to TaskRunnerListner. If TaskRunner receives the response (QueryUnitRequest) of 'getTask' message, TaskRunner creates an instance of Task from the response.


It receives messages sent from a number of TaskRunners. It passes the received message as events to some event handlers, such as QueryUnitAttempt and TaskScheduler.

In the current implementation, there are four messages as follows:

A sequence diagram of statusUpdate, done, and fatal messages


A sequence diagram of getTask message


Architecture (last edited 2013-07-17 00:43:38 by HenrySaputra)