Plan for hbase 0.2.0 version

A meeting to plan hbase development priorities over the next three months, "hbase 0.17", took place January 30th, 2008 at Powerset in San Francisco. Attendees were Chris Kline, Bryan Duxbury, Jim Firby, Michael Bieniosek, Jim Kellerman, Chad Walters, and Michael Stack.

The format was a round of the room, giving each attendee an opportunity to call out the big-ticket items they believed need addressing over the next phase of development. Thereafter, the meeting degenerated into a huddle around a screen, with Bryan, Jim K, and stack entering and jumbling issues to match the agreed-upon priorities.

At a high level, the consensus was that the focus for the next three months should be Robustness and Scalability (in that order).

The priorities identified in this meeting were presented to the HBase community for approval, additions and deletions.

Identified Priorities

Apart from the obvious hadoop issues – i.e. HADOOP-1700 – the group identified the items below as priorities to be addressed over the course of the next three months or so. All items contribute to the banner target of increased "Robustness and Scalability". All items now have corresponding issues in JIRA.

  • "Too many open file handles": Fix the hbase file handle gluttony.
  • Need to add an accounting of hbase resource usage – e.g. a Memory Manager – and a means of having the regionserver tell the master "I'm full". The master, for its part, should then not try assigning any more regions to that server.
  • Assignment needs work. Currently one server sometimes gets 100s of regions while others get only two or three; fix this. Also, fix the startup sequence. Currently, if there are many regions to assign, there is churn, with regions multiply assigned (usually it settles eventually, but fix the churn).
  • Rebalancing load.
  • Make start/stop reliable. Currently, a regionserver goes down if it never talked to the master; it should stay up. Also, hbase should be able to work in the absence of hdfs – or at least not freeze when it's pulled out from under hbase – and it also needs to start/stop nicely (as best it can) independent of the order in which applications are started: e.g. if hbase is started before hdfs, hbase should just gracefully hang till hdfs appears and then continue on its way.
  • Means of making the cluster read-only/freezing it.
  • Means of checkpointing/syncing (perhaps based on or labelled by timestamp).
  • We've seen hbase corrupt HDFS. Figure out how.
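As a rough illustration of how the "I'm full" signal and more even assignment could fit together, here is a minimal Java sketch. It is not HBase code: `ServerLoad`, `AssignmentSketch`, and `pickServer` are hypothetical names invented for this example, and it assumes the master holds a per-server region count plus a "full" flag reported by each regionserver.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;

/** Hypothetical snapshot of one regionserver's load as seen by the master. */
final class ServerLoad {
    final String name;
    final int regionCount;
    final boolean full; // assumption: the server has told the master "I'm full"

    ServerLoad(String name, int regionCount, boolean full) {
        this.name = name;
        this.regionCount = regionCount;
        this.full = full;
    }
}

/** Hypothetical assignment helper; illustrative only. */
final class AssignmentSketch {
    private AssignmentSketch() {}

    /**
     * Picks the server carrying the fewest regions among those that have
     * not reported themselves full. Returns empty if every server is full.
     */
    static Optional<ServerLoad> pickServer(Collection<ServerLoad> servers) {
        return servers.stream()
                .filter(s -> !s.full)
                .min(Comparator.comparingInt((ServerLoad s) -> s.regionCount));
    }
}
```

Choosing the least-loaded server that is not full is only one possible policy. It says nothing about rebalancing regions that are already assigned, or about the startup churn noted above.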

Target is that the next release can handle "3TB of data on about ~50 nodes".

Other items discussed

  • hbase needs a wikipedia page.
  • We should make an hbase blog. It would replace the 'news' section on the hbase wiki.
  • Remove deprecated methods.
  • Caching came up but was thought a low priority, though it was allowed that, since Tom White's work, it would take little to add caching of "hot rows".
  • We took a vote and HBase (capital B) overwhelmingly beat Hbase (little b) as the way to capitalize the name of this project.

Refactoring Patterns

There was some talk among J, B, and S about refactoring patterns to keep in mind as we hack into the hbase future:

  • Make new subpackages o.a.h.h.regionserver, o.a.h.h.master, and o.a.h.h.client
  • Move inner classes longer than 20 lines or so out of their containing classes into classes of their own. If they are master, regionserver, or client inner classes, the new classes should be package-private, etc.
  • Take on the Callable pattern introduced by Peter Dolan. There's a load of places where it can be used to get rid of duplicated retry code.
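To make the Callable idea concrete, here is a minimal sketch of how a retry loop can be written once and reused. It is not Peter Dolan's code or an HBase class: `RetryingCaller`, `callWithRetries`, and its parameters are hypothetical names chosen for illustration, and the fixed pause between attempts is an assumption.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not an HBase class: centralizes retry logic. */
final class RetryingCaller {
    private RetryingCaller() {}

    /**
     * Runs {@code work} up to {@code maxAttempts} times, sleeping
     * {@code pauseMillis} between failed attempts. If every attempt
     * fails, the exception from the last attempt is rethrown.
     */
    static <T> T callWithRetries(Callable<T> work, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (InterruptedException e) {
                // Do not retry when interrupted; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}
```

Each call site then supplies only the operation itself as a `Callable`, for example a region lookup or a row fetch, and the loop, pause, and final rethrow live in one place instead of being copied around.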