Notes on the HQL replacement. See HBASE-487 for some context.


  1. At least the admin (definitional, DDL) functionality currently in HQL: SHOW (tables), DROP, CREATE, ALTER. We don't need JAR (running an MR job jar from the HQL command line), FS (hadoop fs operations from the HQL command line), or CLEAR (clear the terminal).
  2. At least the manipulative functionality (DML) currently in HQL: SELECT, INSERT, UPDATE, DELETE.
  3. Output formatters. At least ascii (table) and xhtml. JSON would be a nice-to-have.
  4. User-friendly: 'obvious', 'natural', and lots of help. (It is hard to define 'fit' criteria for 'user-friendly', but HQL being SQL-like is an example of this requirement's intent.)
  5. Read commands from STDIN, dump on STDOUT.
  6. Dynamic language -- Python, Ruby, Groovy, etc. -- with access to the full HBase API, as a tool for debugging horked HBase clusters.
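
As a rough illustration of requirement 3, pluggable output formatters could be sketched as below. This is only a sketch: the names format_ascii, format_json, FORMATTERS and emit are made up for this page and are not part of HQL or any HBase API, and the xhtml formatter is left out for brevity.

```python
import json
import sys


def format_ascii(headers, rows):
    """Render rows as a plain ASCII table, one column per header."""
    cells = [[str(c) for c in row] for row in rows]
    # Each column is as wide as its widest cell (or its header).
    widths = [len(h) for h in headers]
    for row in cells:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        padded = (" " + v.ljust(w) + " " for v, w in zip(values, widths))
        return "|" + "|".join(padded) + "|"

    out = [rule, line(headers), rule]
    out.extend(line(row) for row in cells)
    out.append(rule)
    return "\n".join(out)


def format_json(headers, rows):
    """Render rows as a JSON array of objects keyed by header."""
    return json.dumps([dict(zip(headers, row)) for row in rows])


# Hypothetical registry: adding a formatter means adding one entry here.
FORMATTERS = {"ascii": format_ascii, "json": format_json}


def emit(headers, rows, fmt="ascii", out=sys.stdout):
    """Dump a result set on STDOUT (or any file-like object)."""
    out.write(FORMATTERS[fmt](headers, rows) + "\n")
```

Keeping the formatters behind one small registry means a command never needs to know which format the user asked for; it hands over headers and rows and the chosen formatter does the rest.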

Nice to Haves

  1. HBase-specific operators: ONLINE/OFFLINE/MERGE
  2. Our replacement should map closely to current client API
  3. Easy to maintain/extend (Hard to have 'fit' criteria for the notion 'easy')

Some Discussion

We might take on SQL's DDL/DML distinction. (This was raised when it was suggested that DELETE could operate on a cell, column, column family, row, or table depending on context.)

Create table needs to take a table name, table attributes -- e.g. table regionsize -- and column families with their definitions, which will include maximum versions, etc. Attributes on tables and column families are many and will likely evolve over time; we shouldn't have to rev the shell parser for every attribute change. Building these lengthy DDL statements can be involved and error-prone, so parse failures need to be non-cryptic. The same table and column family descriptors will be used when altering tables and column families.
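
One way to avoid revving the parser for every attribute change is to parse attributes as generic KEY=value pairs and check the names in a separate step against whatever the descriptors currently expose. The sketch below assumes a made-up KEY=value syntax and made-up attribute names (REGIONSIZE, NAME, MAX_VERSIONS); none of this is existing HQL grammar.

```python
import re

# Hypothetical attribute names for illustration only; the real sets would
# come from the table and column family descriptors.
KNOWN_TABLE_ATTRS = {"REGIONSIZE"}
KNOWN_FAMILY_ATTRS = {"NAME", "MAX_VERSIONS"}

# One KEY=value pair; the value is a bare word or a single-quoted string.
_PAIR = re.compile(r"\s*(\w+)\s*=\s*('[^']*'|\w+)\s*")


def parse_attributes(text):
    """Parse "KEY=value, KEY='value'" into a dict without knowing any key."""
    attrs = {}
    pos = 0
    while pos < len(text):
        m = _PAIR.match(text, pos)
        if not m:
            raise ValueError(
                "expected KEY=value at column %d, near %r"
                % (pos + 1, text[pos:pos + 20])
            )
        attrs[m.group(1).upper()] = m.group(2).strip("'")
        pos = m.end()
        if pos < len(text):
            if text[pos] != ",":
                raise ValueError(
                    "expected ',' at column %d, found %r" % (pos + 1, text[pos])
                )
            pos += 1
    return attrs


def check_attributes(attrs, known, what):
    """Reject unknown names with a message that lists the valid ones."""
    unknown = sorted(set(attrs) - known)
    if unknown:
        raise ValueError(
            "unknown %s attribute(s): %s; valid attributes are: %s"
            % (what, ", ".join(unknown), ", ".join(sorted(known)))
        )
    return attrs
```

With this split, adding an attribute only touches the set of known names, and the same two functions can serve both create and alter. For example, parse_attributes("NAME='info', MAX_VERSIONS=3") yields {'NAME': 'info', 'MAX_VERSIONS': '3'}; values stay strings here, and converting them is left to the descriptor.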

Typing 'help' should get you a dump of all that's possible in the hbase shell. You should also be able to get help per command and, depending on how we implement it, do a help or describe of an object to learn what the object exposes.
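
A minimal sketch of how 'help' and per-command help might hang off a single command registry follows. The decorator, the registry and the usage strings are all hypothetical; the command bodies are stubs.

```python
# Hypothetical registry: command name -> handler, one-line summary, usage.
COMMANDS = {}


def command(name, summary, usage):
    """Register a shell command together with its help text."""
    def register(func):
        COMMANDS[name] = {"run": func, "summary": summary, "usage": usage}
        return func
    return register


@command("show", "List tables.", "SHOW TABLES")
def show(args):
    return "(stub) would list tables"


@command("create", "Create a table.", "CREATE <table> <attributes>")
def create(args):
    return "(stub) would create a table"


@command("drop", "Drop a table.", "DROP <table>")
def drop(args):
    return "(stub) would drop a table"


def help_text(name=None):
    """With no argument, dump every command; otherwise describe one."""
    if name is None:
        width = max(len(n) for n in COMMANDS)
        return "\n".join(
            "%s  %s" % (n.ljust(width), COMMANDS[n]["summary"])
            for n in sorted(COMMANDS)
        )
    entry = COMMANDS.get(name.lower())
    if entry is None:
        return "Unknown command '%s'. Type 'help' for a list of commands." % name
    return "%s\nUsage: %s" % (entry["summary"], entry["usage"])
```

Because the help text is registered alongside the handler, a new command cannot be added without also supplying its summary and usage, which keeps the 'help' dump complete.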

It's OK that a user might mistakenly run 'select * from TABLE_WITH_1B_ROWS'. They won't do it a second time. A simple search should turn up pointers out of the shell to tools of our manufacture -- MR tools -- or to PIG/JAQL/Cascading.

Hbase/Shell/Replacement (last edited 2009-09-20 23:54:49 by localhost)