Hadoop FileSystem Validation Workstream
Hadoop has a pluggable FileSystem architecture: 3rd party FileSystems can be enabled for Hadoop by developing a plugin that mediates between the Hadoop FileSystem Interface and the interface of the 3rd party FileSystem. However, developers of a Hadoop FileSystem plugin currently have no comprehensive test library with which to validate that their plugin yields a Hadoop-compatible FileSystem implementation.
What do we mean by comprehensive? We mean that every operation in the FS Interface has a test that properly exercises its expected behavior across the full variability of its parameters. To create a comprehensive test library, we plan to do the following:
* Focus on the Hadoop 2.0 FS Interface. If possible, create a workstream that would also allow testing and validation of the Hadoop 1.0 FS Interface.
- This includes an audit of the new Hadoop FS Tests added by Steve Loughran for his Hadoop FS Plugin for SWIFT
* Create tests to fill in the gaps
- Also, create a test strategy for handling Object/Block Stores as Hadoop FileSystems
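To illustrate the kind of per-operation contract test meant above, the sketch below exercises a few operations and asserts the post-conditions their contracts specify. It is only an illustrative assumption of the pattern, not part of any planned library: a real test would run against the Hadoop FileSystem Interface itself, whereas this sketch uses the JDK's java.nio.file API as a stand-in so it is self-contained, and the class and helper names are invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of a per-operation contract check. A real test
// would target the Hadoop FileSystem Interface; java.nio.file stands in
// here so the example runs with a plain JDK. The pattern is: perform one
// operation, then assert every post-condition its contract specifies.
public class FsContractSketch {

    // Explicit check so the sketch fails loudly even when JVM
    // assertions (-ea) are disabled.
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("fs-contract");

        // mkdirs contract: creating nested directories succeeds and the
        // full path becomes visible as a directory.
        Path dir = root.resolve("a").resolve("b").resolve("c");
        Files.createDirectories(dir);
        check(Files.isDirectory(dir), "mkdirs must create the full path");

        // create/read contract: bytes written must be read back unchanged.
        Path file = dir.resolve("data.txt");
        Files.write(file, "hello".getBytes());
        check(new String(Files.readAllBytes(file)).equals("hello"),
              "read must return exactly the bytes written");

        // delete contract: a deleted file must no longer exist, and the
        // delete must not remove the parent directory.
        Files.delete(file);
        check(!Files.exists(file), "delete must remove the file");
        check(Files.isDirectory(dir), "delete must not remove the parent");

        System.out.println("contract checks passed");
    }
}
```

A comprehensive library would repeat this operation-then-postconditions pattern for every FS operation, varying parameters (empty paths, nested paths, overwrite flags, missing sources) to cover the full contract.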
Once the comprehensive test library is complete, it can then be used by the provider of a 3rd Party FileSystem to verify compatibility with Hadoop by:
- Passing Functional Validation: Successfully passing the test library that will be created (described above)
- Passing Ecosystem Validation: Successfully passing the Hadoop Integration Tests from Apache BigTop
June 25th 2013 - Face to Face meeting at Red Hat in Mountain View. The day before Hadoop Summit. Details/Sign up here - http://hadoop-fs.eventbrite.com/
Work thus far
June 10th 2013 9AM PST via Google Hangout
Attendees: Tim St. Clair, Matt Farrellee, Steve Watt, Jay Vyas, Steve Loughran, Sanjay Radia, Andrew Purtell, Joe Buck, Roman Shaposhnik, Nathan (?)
- Discussion of the goals of the work
- Steve Loughran to give an update on the Hadoop FS Tests he developed for SWIFT
- Discussion on where people would like to participate
- Validation of the current goals, plus the addition of:
- Create a workstream to identify whether Object/Blob stores have unique properties that make them a special case for test coverage as a Hadoop FS. Create a strategy for handling Object/Blob stores.
- Create a Hadoop 2.0 FileSystem Interface Specification for developers creating plugins, as well as additional background for interested users. This should be created as JavaDoc and managed in JIRA so that it supports proper governance.
The workstream definition at the top of this page has been updated to reflect the new additions to the initiative.
June 4th 2013
May 23rd 2013 - A broader call for participation was made to the hadoop-core dev list, proposing:
* broader participation in defining the expected behavior of Hadoop FileSystem operations
* creating a comprehensive test suite verifying compliance with the expected behavior of a Hadoop FileSystem
* several Google Hangouts and a workshop to discuss the topics
The following parties responded that they were interested in participating: email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org