Concur Proposal

Abstract

Concur is an open source Java implementation of the RAFT consensus protocol [1]. RAFT is used successfully as an alternative to Paxos for implementing a consistently replicated log. RAFT has been proven safe and is designed to be easier to understand than Paxos.

Proposal

Concur is implemented as an independent RAFT library that any application can use to manage its replicated logs or replicated state machines. The implementation closely follows the original design proposed in the RAFT paper.
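As a rough illustration of the replicated state machine idea mentioned above, the following minimal sketch shows that replicas which apply the same log entries in the same order end up in the same state. All class and method names here (LogEntry, KeyValueReplica, ReplicatedLogDemo) are hypothetical and are not taken from the Concur code base; leader election, replication and commitment are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; this is not Concur's API.
final class LogEntry {
    final long index;   // 1-based position in the replicated log
    final String key;
    final String value;

    LogEntry(long index, String key, String value) {
        this.index = index;
        this.key = key;
        this.value = value;
    }
}

// A deterministic key-value state machine driven purely by log entries.
final class KeyValueReplica {
    private final Map<String, String> state = new HashMap<>();
    private long lastApplied = 0;

    // Entries must be applied strictly in log order, with no gaps.
    void apply(LogEntry entry) {
        if (entry.index != lastApplied + 1) {
            throw new IllegalStateException(
                "expected index " + (lastApplied + 1) + " but got " + entry.index);
        }
        state.put(entry.key, entry.value);
        lastApplied = entry.index;
    }

    long lastApplied() {
        return lastApplied;
    }

    // Returns a copy so callers cannot mutate the replica's state.
    Map<String, String> snapshot() {
        return new HashMap<>(state);
    }
}

final class ReplicatedLogDemo {
    // A small, fixed log standing in for entries agreed on by consensus.
    static List<LogEntry> sampleLog() {
        List<LogEntry> log = new ArrayList<>();
        log.add(new LogEntry(1, "x", "1"));
        log.add(new LogEntry(2, "y", "2"));
        log.add(new LogEntry(3, "x", "3"));
        return log;
    }

    // Builds a fresh replica by replaying the log from the beginning.
    static KeyValueReplica replay(List<LogEntry> log) {
        KeyValueReplica replica = new KeyValueReplica();
        for (LogEntry entry : log) {
            replica.apply(entry);
        }
        return replica;
    }
}
```

Replaying sampleLog() on two separate replicas yields equal snapshots, which is the property a consensus protocol such as RAFT preserves by making every replica agree on the log contents and order.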

Background

A replicated log is a frequently used technique in distributed systems for reliability and parallelism. The consistency of replicas is an important requirement for ensuring the correctness of applications that access the data in these replicas. Many algorithms have been proposed and used to maintain consistency and correctness. RAFT was proposed in 2012 and has since gained significant popularity for managing replicated logs and replicated state in many popular projects. Data replication is a requirement for many projects in the ASF ecosystem, for example Apache Hadoop and Apache HBase. We believe the Concur project can provide a RAFT implementation that meets the needs of systems like these and strengthens the ASF ecosystem.

Rationale

There are a few RAFT implementations [2], but none is part of the ASF. Concur differs significantly in that a critical goal of the project is to be usable as a library, with pluggable RPC, RAFT log and state machine implementations. Another important goal of Apache Concur is to provide high throughput for large data ingest rates, with data pipeline support. This will simplify the problem of data replication across the board. Apache Kudu uses the RAFT protocol, but through its own C++ implementation. The Apache DistributedLog project [3] (in incubation) provides a replicated log service. Apache Concur is different in that it provides a Java library that other projects can use to implement their own replicated state machines without deploying another service.
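To make the pluggability goal more concrete, here is a minimal sketch of how a library might separate the RPC layer, the log and the state machine behind interfaces. Every type below (RpcTransport, RaftLogStore, StateMachine, RaftNodeSketch and the in-memory implementations) is an assumption made for illustration and does not reflect Concur's actual interfaces. The node also applies entries immediately, whereas a real RAFT implementation would wait for acknowledgement from a majority of peers first.

```java
import java.util.ArrayList;
import java.util.List;

// All types here are hypothetical and only illustrate a pluggable design;
// they are not taken from the Concur code base.
interface RpcTransport {
    void send(String peerId, byte[] message);
}

interface RaftLogStore {
    long append(byte[] entry);   // returns the 1-based index of the new entry
    byte[] read(long index);
}

interface StateMachine {
    void apply(long index, byte[] entry);
}

// One possible log plug-in: keeps entries in memory only.
final class InMemoryLogStore implements RaftLogStore {
    private final List<byte[]> entries = new ArrayList<>();

    public long append(byte[] entry) {
        entries.add(entry.clone());
        return entries.size();
    }

    public byte[] read(long index) {
        return entries.get((int) index - 1).clone();
    }
}

// One possible state machine plug-in: counts applied entries.
final class CountingStateMachine implements StateMachine {
    private long applied;

    public void apply(long index, byte[] entry) {
        applied++;
    }

    long appliedCount() {
        return applied;
    }
}

// One possible RPC plug-in: records destinations instead of using a network.
final class RecordingTransport implements RpcTransport {
    private final List<String> sentTo = new ArrayList<>();

    public void send(String peerId, byte[] message) {
        sentTo.add(peerId);
    }

    List<String> sentTo() {
        return sentTo;
    }
}

// The node depends only on the interfaces, so each part can be swapped.
final class RaftNodeSketch {
    private final RpcTransport rpc;
    private final RaftLogStore log;
    private final StateMachine stateMachine;
    private final List<String> peers;

    RaftNodeSketch(RpcTransport rpc, RaftLogStore log,
                   StateMachine stateMachine, List<String> peers) {
        this.rpc = rpc;
        this.log = log;
        this.stateMachine = stateMachine;
        this.peers = new ArrayList<>(peers);
    }

    // Simplification: append locally, notify peers, then apply right away.
    // Real RAFT applies an entry only after a majority has stored it.
    long submit(byte[] command) {
        long index = log.append(command);
        for (String peer : peers) {
            rpc.send(peer, command);
        }
        stateMachine.apply(index, log.read(index));
        return index;
    }
}
```

With this shape, an application could pass in a different RpcTransport or a persistent RaftLogStore without changing RaftNodeSketch, which is the kind of flexibility the paragraph above describes.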

Current Status

The source code is available under the Apache License, Version 2.0 at https://github.com/hortonworks/concur. A significant amount of code has been added, and the basic functionality of RAFT is available for testing. The code is still at a pre-alpha stage and is currently being tried out in proof-of-concept (POC) mode.

Meritocracy

We plan to invest in supporting meritocracy. We intend to invite additional developers to participate. We will encourage and monitor community participation so that privileges can be extended to those who contribute.

Community

The developers on the initial committers list are experienced in the ASF ecosystem:

Affiliations

Alignment

We believe that Concur will address the requirements of several projects and communities in the Apache ecosystem and will be widely adopted.

Known Risks

Orphaned Products

The contributors are leading users and vendors in the Apache Hadoop ecosystem, with significant open source experience, so the risk of being orphaned is relatively low. The project could be at risk if vendors decided to change their strategies in the market. In such an event, the current committers plan to continue working on the project on their own time, though progress will likely be slower. We plan to mitigate this risk by encouraging and recruiting additional committers. Since the replicated log is a very common technique that is useful for a large number of applications, we believe many developers would like to join and contribute to the project.

Inexperience with Open Source

The initial committers include veteran Apache contributors (committers, PMC members and Apache Members) and other developers with varying degrees of experience with open source projects. All have been involved with source code released under an open source license, and all have experience developing code with an open source development process.

Homogeneous Developers

The initial list of committers includes senior Apache developers from several organizations, including Hortonworks, OfferUp, Intel and Uber. We believe this project will benefit many different projects, and therefore an increasingly diverse set of developers will join over time.

Reliance on Salaried Developers

It is expected that Concur development will occur both on salaried time and on volunteer time, after hours. The majority of the initial committers are paid by their employers to contribute to this project. However, they are all passionate about the project, and we are confident that the project will continue even if no salaried developers contribute to it. We are committed to recruiting additional committers, including non-salaried developers.

Relationships with Other Apache Products

Most of the initial developers are active participants in the Apache Hadoop community, and we expect Concur to be adopted by the Hadoop community. However, Concur is a generic RAFT library, and we will strive for more general adoption. This project depends on Apache Maven, Apache Commons {collections,configuration,io} and Apache Hadoop.

Initial Source

https://github.com/hortonworks/concur

External Dependencies

Required Resources

Mailing List

Git Repository

Git is the preferred source control system: git://git.apache.org/concur

Issue Tracking

JIRA Concur (Concur)

Sponsors

Champion

Jitendra Pandey (jitendra)

Nominated Mentors

Sponsoring Entity

Incubator PMC

Reference