Problem

As the diversity of deep learning hardware accelerators increases, an efficient abstraction layer becomes important so that developers can avoid enabling each accelerator and compute kernel separately. Intel nGraph enables that vision.


The new subgraph acceleration API enables us to integrate nGraph library with minimal changes to core MXNet components.

The nGraph Compiler provides an in-memory abstraction layer that converts the mathematical representation of a DL model into an optimized, execution-ready format understood by multiple hardware backends. This means it can handle the potential graph partitioning schemes and allows MXNet to operate on the nodes where fusion happens.

Goals/Use Cases

The primary goal of this integration is to provide a seamless development and deployment experience for data scientists and machine learning engineers who want to leverage the Intel nGraph ecosystem with MXNet.

Because the Subgraph API integrates seamlessly with the MXNet frontend API, users can enable or switch to the nGraph backend with any existing MXNet scripts, models, and deployments that use the symbolic interface.

Proposed Approach

As described in the subgraph acceleration API design document, we extend and customize the SubgraphProperty and SubgraphSelector classes to enable nGraph integration.

https://github.com/apache/incubator-mxnet/pull/12502

SgNgraphSelector inherits from SubgraphSelector to provide an interface to the subgraph selection mechanism, and SgNgraphProperty inherits from SubgraphProperty to provide a way to create nGraph subgraphs.



When MXNet is compiled with nGraph, NDArray provides a get_tensor method that enables memory sharing between nGraph and MXNet. This allows us to avoid copying or reallocating memory.

After building MXNet with nGraph support, users can enable the nGraph backend by setting the MXNET_SUBGRAPH_BACKEND="ngraph" environment variable.
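As a minimal sketch, the backend can be toggled from a shell before launching any existing symbolic-API script (this assumes MXNet was already built with nGraph support; the script name below is hypothetical):

```shell
# Enable the nGraph backend for subsequent MXNet runs.
# Assumes an nGraph-enabled MXNet build.
export MXNET_SUBGRAPH_BACKEND=ngraph

# Then run any existing symbolic-interface script unchanged, e.g.:
#   python inference.py        # hypothetical user script
# Unset the variable to fall back to the default MXNet executor:
#   unset MXNET_SUBGRAPH_BACKEND
```

No code changes are needed in the user script itself; the backend selection happens entirely through the environment.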

Addition of New APIs

No new user-facing APIs were added or changed.

Backward compatibility

There are no backward compatibility concerns, as this integration does not change any APIs or add any new data formats.

Performance Considerations


Some environment variables influence the performance of nGraph-enabled MXNet and its supporting libraries. Here is a partial list of those variables:

Variable           Description
OMP_NUM_THREADS    Suggested value: 16
KMP_AFFINITY       Suggested value: granularity=fine,compact,1,0
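These settings can be exported before launching an nGraph-enabled run. The values below are the suggested starting points from the table above, not a tuned configuration; optimal values depend on the machine:

```shell
# Suggested starting points for an nGraph-enabled MXNet run.
export OMP_NUM_THREADS=16
export KMP_AFFINITY=granularity=fine,compact,1,0
export MXNET_SUBGRAPH_BACKEND=ngraph
# Then launch the workload, e.g.:
#   python my_model.py         # hypothetical user script
```

OMP_NUM_THREADS should generally match the number of physical cores available to the process; 16 is only a reasonable default for common server parts.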


Test Plan

This integration runs all of MXNet's default "tests/python/unittest" tests using the nGraph backend (via the subgraph mechanism). All operators supported by nGraph are exercised by this suite.

The unit test suite provides a large range of test cases for the integrated Subgraph fusion API and nGraph execution. Additionally, a few gaps in testing were discovered via model tests; these are covered with additional tests under "tests/python/ngraph".

We enabled MXNet CI/Jenkins jobs for this integration with the above unit tests.
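As a sketch, the same suites can be run locally against the nGraph backend (assuming an nGraph-enabled build and the nose test runner, which MXNet's Python tests used at the time):

```shell
# Run the standard MXNet unit tests with the nGraph backend enabled.
MXNET_SUBGRAPH_BACKEND=ngraph nosetests -v tests/python/unittest

# Run the nGraph-specific tests added by this integration.
MXNET_SUBGRAPH_BACKEND=ngraph nosetests -v tests/python/ngraph
```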

Alternative Approaches

We initially considered a non-Subgraph-based integration, but it involved changes to the MXNet core graph executor. The Subgraph APIs were designed to address exactly this issue, and we believe our integration fits them nicely.

Milestones

The nGraph library supports a number of hardware and software backends, including "CPU", "INTERPRETER" (reference kernels), "GPU", "IntelGPU", etc. The current experimental integration enables the "CPU" backend by default. More backends will be supported in future releases.

References

https://github.com/apache/incubator-mxnet/pull/12502

https://github.com/NervanaSystems/ngraph-mxnet-bridge

https://github.com/NervanaSystems/ngraph
