Isolating YARN Applications in Docker Containers
The Docker executor for YARN involves work in YARN along with counterpart work in Docker to build the necessary API endpoints. The purpose of this page is to collect related tickets across both projects in one location.
The advantages containers and Docker offer to Hadoop YARN are well understood:
* Security. YARN is typically deployed as a multi-tenant environment in large organizations, with multiple groups sharing a common IT-managed cluster. Tasks from different tenants can be scheduled on the same host. Containers securely isolate those tasks by limiting the privilege scope of a task to the container in which it runs. Root in the container is distinct from root on the host: even though root in a container can run privileged operations, those operations affect only the container's view of host resources, not the host directly. The specific Linux capabilities a task possesses, the devices accessible to it, and so on are adjusted for each container.
When combined with software-defined networking techniques, containers isolate the network traffic of different tenant applications, so that the tasks of one tenant cannot maliciously or unintentionally snoop on the traffic of another.
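As a sketch, capability and device restrictions of this kind map onto standard docker run options; the image name, task script, and the specific capability and device below are hypothetical:

```shell
# Hypothetical image and task script; the flags are standard docker run options.
# Drop all Linux capabilities, grant back only the one the task needs,
# and expose only an explicitly listed host device to the container.
docker run --rm \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --device=/dev/fuse \
  hadoop-task-image ./run-task.sh
```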
* Performance isolation. Containers provide resource accounting and enforce resource limits on the processes running within them, preventing applications from stepping on each other. For fine-grained control, the resource limits associated with CPU, memory, and I/O bandwidth can be tuned on the fly as decided by the resource manager.
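Since limits are ultimately enforced by cgroups, one way a resource manager could retune a running container is by writing its cgroup files directly. A minimal sketch, assuming a cgroup v1 layout and an illustrative container name:

```shell
# Look up the full container ID (container name "task_42" is illustrative).
CID=$(docker inspect --format '{{.Id}}' task_42)
# Throttle the container to 50% of one CPU: 50ms of quota per 100ms period.
echo 100000 > /sys/fs/cgroup/cpu/docker/$CID/cpu.cfs_period_us
echo 50000  > /sys/fs/cgroup/cpu/docker/$CID/cpu.cfs_quota_us
```

The exact cgroup mount point and hierarchy vary by distribution and Docker configuration.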
* Higher utilization by co-scheduling CPU- and I/O-bound jobs. In a multi-tenant environment, applications have varying resource needs. While some tasks are compute-intensive, others may be I/O-bound. When the tasks of an I/O-bound job are scheduled on a node, that node's compute resources go largely unused, and vice versa. Because of the security risk of co-locating tasks of different tenants on a shared machine, those idle resources cannot be allocated to other tenants even when they could use them. Containers prevent such underutilization by securely isolating tasks from one another, so that they can be safely co-scheduled on the same host.
* Consistency. Distributed YARN applications consist of tasks that need to run on different cluster nodes deployed with an identical host environment. Any discrepancies may cause application misbehavior. Containers ensure that all the tasks of an application run in a consistent software environment defined by the container and its image, regardless of the state of the host. For example, an application could run in an Ubuntu environment making use of Ubuntu-specific software, while the host itself runs RHEL.
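For instance (image tag illustrative), a task can run in an Ubuntu userland on any host distribution:

```shell
# The task sees Ubuntu's userland regardless of whether the host runs RHEL.
docker run --rm ubuntu:14.04 cat /etc/lsb-release
```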
* Isolation of software dependencies and configuration. YARN is designed to be modular, with well-defined interfaces between applications and the YARN core. This allows applications to be built as independent binaries that often rely on third-party software. For example, an application that predicts consumer spending based on linear regression might depend on Matlab. Since the tasks of an application could potentially be scheduled on any host in the cluster, these software dependencies would have to be installed on every cluster node. A variety of applications sharing the same YARN cluster can quickly clutter the nodes with their respective software dependencies. Installing all dependencies across all hosts does not scale, and in some cases the dependencies and their versions may be mutually conflicting.
With applications encapsulated in Docker containers, software dependencies and the system configuration required for them can be specified independent of the host and other applications running on the cluster.
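As a sketch of how such dependencies can be baked into an image rather than installed on every node — the package, paths, and registry name below are illustrative (GNU Octave stands in for a numeric dependency, since Matlab is not installable from a package repository):

```shell
# Write an image definition that captures the task's dependencies.
cat > Dockerfile <<'EOF'
FROM ubuntu:14.04
# Install the application's numeric dependency once, in the image.
RUN apt-get update && apt-get install -y octave
COPY run-task.sh /opt/app/run-task.sh
CMD ["/opt/app/run-task.sh"]
EOF
docker build -t registry.example.com/yarn-apps/spending-model:1.0 .
```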
* Reproducible and programmable mechanism to define application environments. Docker supports a mechanism to programmatically build out a consistent environment required for YARN applications. The build process can be run offline with its products stored in the central repository of container images. At the time of deployment, the image bits are quickly streamed into the cluster without incurring the overhead of runtime configuration.
* Rapid provisioning. The central repository of container images decouples software state and configuration from the hardware, enabling a relatively stateless base platform to be rapidly provisioned for a YARN application, by automatically pulling the right container image on demand. When the job finishes the containers are simply removed, returning the cluster to its pristine state.
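The node-side lifecycle this describes might look like the following, with an illustrative registry, image, and container name:

```shell
# Pull the image on demand, run the task, then remove the container,
# leaving the node in its original state.
docker pull registry.example.com/yarn-apps/spending-model:1.0
docker run --name task_42 registry.example.com/yarn-apps/spending-model:1.0
docker rm task_42
```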
Realizing these benefits requires changes to both Docker and YARN. Several of the Docker features needed for the above, such as excluding the intermediate data directory from the copy-on-write file system and bind-mounting the DataNode Unix socket from the host into the container for short-circuit I/O, are already available. The following new pieces of work need to be done.
YARN Docker executor
An initial patch of the Docker executor.
- Some of the Docker features below may only be made available via the Docker REST endpoint. The Docker executor should connect to that endpoint rather than shelling out to the docker CLI to invoke those functions.
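For illustration, the daemon's REST API is reachable over its Unix socket; for example, listing running containers (requires curl 7.40+ for `--unix-socket`):

```shell
# Equivalent of `docker ps`, but via the REST API the executor would use.
curl --unix-socket /var/run/docker.sock http://localhost/containers/json
```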
Docker support for user namespaces, mapping the root user in a container to an unprivileged user on the host. Currently, root in a Docker container has root privileges on the host.
Container network configuration that allows the task and application master containers to talk to each other. The NAT'ed, non-routable IP addresses assigned by Docker do not allow a task to reach an application master running in a container on a different host. Possible approaches to addressing this, and the relevant tickets, are outlined here.
Dynamic tuning of resource limits, for granular control over resource allocation. Docker currently does not allow changing a container's resources once it has been created.
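Today the limits can only be fixed at creation time, for example (image name and task script illustrative):

```shell
# Memory and CPU shares are set once at creation and cannot be updated later.
docker run -m 2g --cpu-shares 512 hadoop-task-image ./run-task.sh
```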