Abstract

OpenDAL stands for “Open Data Access Layer”. It is a Rust library that helps developers access data freely, painlessly, and efficiently across multiple storage services, including AWS S3, HDFS, POSIX-compatible file systems, and so on.

Proposal

OpenDAL provides the following features to support developers accessing data freely, painlessly, and efficiently:  

  • Freely
    • Access different storage services in the same way
    • Behavior tests for all services
    • Cross-language/project bindings (work in progress)
  • Painlessly
    • 100% documentation coverage
    • Powerful Layers
    • Automatic retry support
    • Full observability support: logging, tracing, and metrics.
    • Native chaos testing
    • Native server-side encryption support
  • Efficiently
    • Zero cost: mapping to underlying API calls directly
    • Best effort: auto-pick the best read/seek/next implementations based on services
    • Auto metadata reuse: avoid extra metadata calls

OpenDAL was originally designed to be used by the Databend project but is now being used by Mozilla's sccache, DeepETH's mars, and several database startups.

We believe that OpenDAL will bring diverse value to the community if it is brought into the Apache Incubator.

Background

OpenDAL has been developed by an open-source community since day one and is owned by DatafuseLabs. The project was launched in February 2022.

Rationale

OpenDAL provides a unified storage abstraction layer that simplifies interfacing with different storage services. In addition, OpenDAL provides further advanced storage encapsulation, enabling enhancements such as automatic retry, request optimization, and observability. OpenDAL makes it possible to develop once and run on any storage service.
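The idea of a unified abstraction with stackable enhancements can be sketched in a few lines of Rust. This is a minimal illustration only, not OpenDAL's actual API: the `Storage` trait, the `Memory` and `Flaky` backends, the `Retry` wrapper, and `round_trip` are all hypothetical names invented for this sketch. It shows application code written once against a trait, with a retry wrapper added around a backend without changing that code.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical error type for this sketch (not OpenDAL's error type).
#[derive(Debug, PartialEq)]
enum StorageError {
    NotFound,
    Transient,
}

// Hypothetical unified interface that every backend implements.
trait Storage {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), StorageError>;
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError>;
}

// A simple in-memory backend.
#[derive(Default)]
struct Memory {
    files: HashMap<String, Vec<u8>>,
}

impl Storage for Memory {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), StorageError> {
        self.files.insert(path.to_string(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        self.files.get(path).cloned().ok_or(StorageError::NotFound)
    }
}

// A backend wrapper whose first `failures_left` reads fail, simulating
// transient service errors.
struct Flaky<S> {
    inner: S,
    failures_left: Cell<u32>,
}

impl<S: Storage> Storage for Flaky<S> {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), StorageError> {
        self.inner.write(path, data)
    }

    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        let left = self.failures_left.get();
        if left > 0 {
            self.failures_left.set(left - 1);
            return Err(StorageError::Transient);
        }
        self.inner.read(path)
    }
}

// A wrapper that retries transient errors, up to `max_attempts` tries in total.
// Non-transient errors such as NotFound are returned immediately.
struct Retry<S> {
    inner: S,
    max_attempts: u32,
}

impl<S: Storage> Storage for Retry<S> {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), StorageError> {
        let mut attempt = 1;
        loop {
            match self.inner.write(path, data) {
                Err(StorageError::Transient) if attempt < self.max_attempts => attempt += 1,
                other => return other,
            }
        }
    }

    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        let mut attempt = 1;
        loop {
            match self.inner.read(path) {
                Err(StorageError::Transient) if attempt < self.max_attempts => attempt += 1,
                other => return other,
            }
        }
    }
}

// Application code written once against the trait; it works with any backend.
fn round_trip(storage: &mut dyn Storage, path: &str, data: &[u8]) -> Result<Vec<u8>, StorageError> {
    storage.write(path, data)?;
    storage.read(path)
}

fn main() {
    let mut plain = Memory::default();
    let mut layered = Retry {
        inner: Flaky { inner: Memory::default(), failures_left: Cell::new(2) },
        max_attempts: 3,
    };
    println!("{:?}", round_trip(&mut plain, "hello.txt", b"hi"));
    println!("{:?}", round_trip(&mut layered, "hello.txt", b"hi"));
}
```

In this sketch `round_trip` is unaware of whether it talks to the plain backend or to the retry-wrapped one, which is the property the "develop once, run on any storage service" goal relies on.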

Initial Goals

By transferring ownership of the project to the ASF, OpenDAL expects to ensure its neutrality and further encourage and facilitate the adoption of OpenDAL by the community.

Some of the areas we would like to focus on during the Apache incubation phase include:

  • A healthier community: more maintainers and contributors will be able to participate in OpenDAL and own different modules.
  • Wider adoption: OpenDAL can be adopted by more open source/commercial projects, which in turn drives the development of its functionality.
  • Richer integration: OpenDAL enables greater integration of storage services and offers a wider range of language bindings.

Current Status

Meritocracy

We intend to radically expand the initial developer and user community by running the project the 'Apache way'. Users and new contributors will be respected and welcomed. They will earn credit by participating in the community and providing quality patches/support to move the project forward. They will also be encouraged to provide non-code contributions (documentation, events, community management, etc.) and will be rewarded accordingly. Those with a proven track record of support and quality will be encouraged to become committers.

Community

Contributors: 37

Users:

  • Databend: A cloud data warehouse
  • GreptimeDB: A time-series database
  • Sccache: ccache with cloud storage
  • RisingWave: A Distributed SQL Database for Stream Processing

Core Developers

The core developers are all experienced open-source developers. They have been running the OpenDAL community for one year.

Alignment

Known Risks

Project Name

We have checked and believe that the name is appropriate and that the project has legal permission to continue using its current name. A Google search found no other projects with this name.

Orphan Products

Inexperience with Open Source

OpenDAL's core developers are all experienced open source contributors, and its main maintainer Xuanwo has 10 years of open source experience, having worked on a number of open source projects including Hexo, TiDB, TiKV, Databend, Sccache, and others.

Length of Incubation

We expect to enter incubation within two months and graduate in about two years.

Homogenous Developers

OpenDAL developers come from a variety of backgrounds and contribute to the OpenDAL project for different usage scenarios.

Reliance on Salaried Developers

Most developers are paid by their employers to contribute to this project. This is indeed a significant risk; we expect to attract more maintainers and contributors from outside DatafuseLabs to address it.

Relationships with Other Apache Products

  • OpenDAL can be used to operate on files such as Parquet, Avro, …
  • OpenDAL provides a shim that can be used with Arrow object_store

An Excessive Fascination with the Apache Brand

Documentation

OpenDAL's documentation is hosted at https://opendal.databend.rs/. The documentation is self-contained, and all current and historical versions can be found at https://docs.rs/opendal/latest/opendal/.

Initial Source

The project currently holds a GitHub repository and a Cargo crate:

If the proposal is accepted, the crate will retain its name, the repository will be moved to the Apache organization, and the website will be permanently moved to opendal.apache.org.

Source and Intellectual Property Submission Plan

External Dependencies

Generated by cargo deny list

Cryptography

N/A

Required Resources

Mailing Lists

Subversion Directory

N/A

Git Repositories

From https://github.com/datafuselabs/opendal

Issue Tracking

The community would like to continue using GitHub Issues.

Other Resources

The community has already chosen GitHub Actions as its continuous integration tool.

Initial Committers

Sponsors

Champion

  • tison [tison@apache.org]

Nominated Mentors

  • hexiaoqiao [hexiaoqiao@apache.org]
  • ningjiang [ningjiang@apache.org]
  • tedliu [tedliu@apache.org]
  • tison [tison@apache.org]
  • wusheng [wusheng@apache.org]

The Incubator
