0.21 release update

* Continuing to close blockers, ping people for updates and suggestions
* About 20 open blockers. Many are MapReduce documentation that may be pushed. Speak up if 0.21 is missing anything substantive.
* Common/HDFS visibility and annotations are close to consensus; MapReduce annotations are committed to trunk and the 0.21 branch

HEP proposal

Slides from the meeting: http://www.slideshare.net/cloudera/hadoop-contributors-meeting-ii

(What follows is the sketch presented at the meeting. A full proposal with concrete details will be circulated on the list.)

* Based on, and very similar to, the PEP (Python Enhancement Proposal) process
* Audience is HDFS and MapReduce; not necessarily adopted by other subprojects

innovation/experimentation and stability
* Not for small enhancements, features, and bug fixes. This should not slow down typical development or impede casual contribution to Hadoop
* Primary mechanism for new features, collecting input, documenting design decisions
* JIRA is good for details, but not for deciding on wide shifts in direction
* Purpose is for author to build consensus and gather dissenting opinions.

they believe it is a sound idea

to projects. A similar mechanism could work for incoming HEPs

Drafting/baking of the HEP occurs in public on a list dedicated to that particular proposal. Once Editors certify the HEP as complete, it is sent to general@ for wider discussion.

to ensure the HEP is complete enough to present to the community.

posting to the HEP list. Completeness is, of course, subjective. If the Editor and Author disagree whether the proposal affects an aspect of the framework enough to merit special consideration, it is not entirely clear how to resolve the disagreement.

process of Hadoop is not entirely clear. It may be possible to optimize it out.

vote of the PMC (mechanics undefined). In Python, the result is committed to the repository. A similar practice would make sense in Hadoop.
* Which issues require HEPs?

al. were examples of features substantial enough to merit a HEP. Pure Java CRC is an example of an enhancement that would not. Whether an explicit process must be in place to determine whether an issue requires a HEP is not clear.

might be more accurate. Going through the HEP process should always improve the chances of a successful proposal

* Evaluation

functionality, technically unsound, insufficiently motivated, no backwards compatibility story, etc.

Feedback is less welcome once code is in hand.

issue, e.g. concerns that the proposal may not scale or may harm performance

a safe practice for the Author to encourage HEP reviewers not to block the product of the proposal.

* The testing burden and completion strategy may be ambiguous

the implementer. Completing the proposal to address all use cases may require considerably more work than the Author is willing or motivated to invest.

objections are merited and reasonable. For example, a particularly obscure/esoteric use case could be included as a condition for acceptance if the dissenter is willing to invest the resources to test/validate it. The process is flexible in this regard.

performance regression, availability, and other considerations need not be called out in every HEP.

should ideally be automated and reproducible in different organizations


* A patch and a branch are isomorphic from a policy perspective. Of course, they are functionally distinct: branches are easier to collaborate on and are, generally, longer-lived than are patches. But special policies need not be derived to account for these differences, which concern the production of the code, not its review and acceptance.
* Some developers find branches to be easier to review than very large patches and easier to merge, given a toolchain that supports this.

* Eclipse Labs

its rules for releases, etc.


* Modules (such as fuse-dfs) are not actively maintained in the main repository and would benefit from a release schedule decoupled from the rest of Hadoop
* With few exceptions, the contrib modules have smaller, often discrete groups of maintainers. It may be worth exploring whether these projects could live elsewhere