...
- NORMAL: Everything works fine.
- STALE: A DiskErrorException was thrown.
- DENIED: The GroomServer failed to establish a connection to the BSPMaster.
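
As a rough illustration of the states listed above, the mapping from a failure to a state could be sketched as follows. This is only a sketch under assumptions: the class `GroomStateSketch`, the `classify` method, the local stand-in `DiskErrorException`, and the use of `java.net.ConnectException` to signal a failed BSPMaster connection are all hypothetical names chosen for the example, not the actual GroomServer code.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical sketch; not the real GroomServer implementation.
class GroomStateSketch {

    // The three states described in the list above.
    enum State { NORMAL, STALE, DENIED }

    // Local stand-in for the DiskErrorException named in the text (assumption).
    static class DiskErrorException extends IOException {
        DiskErrorException(String message) {
            super(message);
        }
    }

    // Maps the most recent failure (or null for "no failure") to a state.
    // Assumption: a ConnectException represents a failed connection to the BSPMaster.
    static State classify(Exception failure) {
        if (failure == null) {
            return State.NORMAL;
        }
        if (failure instanceof DiskErrorException) {
            return State.STALE;
        }
        if (failure instanceof ConnectException) {
            return State.DENIED;
        }
        // The text does not say how other failures are treated, so reject them here.
        throw new IllegalArgumentException("unclassified failure: " + failure);
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(new DiskErrorException("disk check failed")));
        System.out.println(classify(new ConnectException("BSPMaster unreachable")));
    }
}
```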
Task Management
The GroomServer:
- receives instructions from BSPMasters.
- spawns one or more tasks as separate JVM processes, in which the tasks are then executed.
- monitors spawned processes via ping; when a task is
  - out of contact (failed or crashed): launches a new process and restarts the task, with the maximum number of attempts set to 3
  - over the maximum number of attempts: updates the task status and notifies the BSPMasters
- sends heartbeat to BSPMasters
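
The monitor-and-restart behaviour described above can be sketched roughly as below. All names here (`TaskMonitorSketch`, `TaskProcess`, `Outcome`, `check`) are hypothetical, and the sketch assumes that the limit of 3 counts the first launch as attempt 1; the text does not say whether the first launch is included. Notifying the BSPMasters is reduced to returning `FAILED`.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the ping/restart policy; not the real GroomServer code.
class TaskMonitorSketch {

    // Maximum number of attempts, as stated in the text.
    static final int MAX_ATTEMPTS = 3;

    // Stand-in for a spawned task process that can be pinged (assumption).
    interface TaskProcess {
        boolean ping();
    }

    // Result of one monitoring round.
    enum Outcome { RUNNING, RESTARTED, FAILED }

    private final Supplier<TaskProcess> launcher;
    private TaskProcess current;
    private int attempts;

    TaskMonitorSketch(Supplier<TaskProcess> launcher) {
        this.launcher = launcher;
        this.current = launcher.get(); // first launch counts as attempt 1 (assumption)
        this.attempts = 1;
    }

    // Pings the current process once and applies the restart policy.
    Outcome check() {
        if (current.ping()) {
            return Outcome.RUNNING;
        }
        if (attempts >= MAX_ATTEMPTS) {
            // Attempts exhausted: a real server would update the task status
            // and notify the BSPMasters at this point.
            return Outcome.FAILED;
        }
        current = launcher.get(); // launch a new process and restart the task
        attempts++;
        return Outcome.RESTARTED;
    }

    int attempts() {
        return attempts;
    }

    public static void main(String[] args) {
        // A task whose process never answers the ping.
        TaskMonitorSketch monitor = new TaskMonitorSketch(() -> () -> false);
        for (int round = 0; round < 3; round++) {
            System.out.println(monitor.check() + " after attempt " + monitor.attempts());
        }
    }
}
```

Returning an outcome from `check()` rather than acting on it directly keeps the policy separate from the reporting, so the caller decides how a `FAILED` result is forwarded.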