Apache Atlas


 

Atlas provides open metadata management and governance capabilities for organizations that use data-intensive platforms such as Apache Hadoop, cloud platforms, and mobile and IoT systems, all of which need to be integrated with their traditional systems to exchange data for analytics and data-driven decisions.  Through these capabilities, an organization can build a catalog of its data assets, classify and govern these assets, and provide collaboration capabilities around them for data scientists, analysts and the data governance team.

 


Why Atlas?

Atlas targets a scalable and extensible set of core foundational metadata management and governance services, enabling enterprises to meet their compliance requirements effectively and efficiently on individual data platforms while ensuring integration with the whole data ecosystem. Apache Atlas is organized around two guiding principles:

 


Atlas today

Figure 1 below shows the initial architecture proposed for Apache Atlas as it went into the incubator.

 

Figure 1: the initial vision for Apache Atlas

 

The core capabilities defined by the incubator project included the following:

The Atlas community has delivered those requirements with the following components:

  1. Flexible knowledge store and type system
  2. Automatic cataloguing of data assets and lineage through hooks and bridges
  3. APIs and a simple UI to provide access to the metadata
  4. Integration with Apache Ranger to add real-time, tag-based access control to Ranger’s already strong role-based access control capabilities

Stay Tuned for More to Come

Atlas today focuses on the Apache Hadoop platform.  However, at its core, Atlas is designed to exchange metadata with other tools and processes within and outside of the Hadoop ecosystem, thereby enabling platform-agnostic governance controls that effectively address compliance requirements.

The projects underway today will expand both the platforms Atlas can operate on and its core capabilities for metadata discovery and governance automation. They will also create an open interchange ecosystem of message exchange and connectors, allowing different instances of Apache Atlas and other types of metadata tools to integrate into an enterprise view of an organization's data assets, their governance and their use.

Atlas is only as good as the people who contribute to it.  If metadata management and governance is an area of interest or expertise for you, please consider becoming part of the Atlas community and Getting Involved.