Thrift API

This page describes the Thrift client API for HBase. Thrift is cross-platform and, for many operations, more lightweight than REST.

The latest version of the HBase Thrift API is defined in Hbase.thrift.

Using the API

Generating a Thrift client package

Once Thrift is installed, use:

thrift --gen [lang] [hbase-root]/src/java/org/apache/hadoop/hbase/thrift/Hbase.thrift

lang should be one of java, cpp, rb, py, perl or another language listed in Hbase.thrift.

This produces a directory named after the chosen language (gen-py, gen-rb, etc.) containing the generated client code.

Starting the Thrift server

The Thrift server can be started with:

[hbase-root]/bin/hbase thrift start

Using with Python

See Yann's tutorial (July 2008).

To acquire a Thrift client instance:

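from thrift.transport.TSocket import TSocket
from thrift.transport.TTransport import TBufferedTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase  # generated package; gen-py must be on the Python path

# Address of the running Thrift server. These values are placeholders
# (port 9090 is an assumption); adjust them for your deployment.
host = "localhost"
port = 9090

# Open a buffered socket transport and speak the binary protocol over it.
transport = TBufferedTransport(TSocket(host, port))
transport.open()
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)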

Use help(client) to view the Python API.
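
The snippet above never closes the transport. A minimal sketch of one way to guarantee cleanup, assuming only that the transport exposes the open() and close() methods used above. The helper name open_transport is hypothetical and is not part of Thrift or HBase:

```python
from contextlib import contextmanager


@contextmanager
def open_transport(transport):
    """Open a Thrift-style transport and always close it afterwards.

    Hypothetical helper: it relies only on the transport having
    open() and close() methods.
    """
    transport.open()
    try:
        yield transport
    finally:
        # Runs even if the body raises, so the connection is not leaked.
        transport.close()


# Intended usage with the objects from the snippet above (not executed
# here, because it needs a running Thrift server):
#
#   with open_transport(TBufferedTransport(TSocket(host, port))) as transport:
#       client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
#       print(client.getTableNames())
```

Wrapping the transport rather than the client keeps the helper independent of the generated code, so it can be exercised with any object that has open() and close().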

Data Type Spec

This design documentation is now outdated. See Hbase.thrift for an up-to-date API.

This section contains the definitions of Thrift data types needed for communication.

columnDescriptor

Used by getColumnDescriptors. How much information should we expose here?

struct columnDescriptor {
  1:string name,
  2:i32 maxVersions,
  3:bool compression
}

regionDescriptor

Used by getTableRegions.

struct regionDescriptor {
  1:string startKey,
  2:string host
}

mutation

Used when performing batch update operations. isDelete is the switch you flip when you want to delete a cell.

struct mutation {
  1:bool isDelete = false,
  2:string columnName,
  3:string value
}
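
To make the role of isDelete concrete, here is a small in-memory sketch in Python. It does not talk to HBase: the Mutation dataclass mirrors the struct above, and apply_mutations is a hypothetical helper that treats a row as a plain dict of column name to value. Ignoring a delete for a column that does not exist is an assumption, not something the spec states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Mutation:
    """Python mirror of the Thrift mutation struct (illustrative only)."""
    columnName: str
    value: str = ""
    isDelete: bool = False


def apply_mutations(row: Dict[str, str], mutations: List[Mutation]) -> Dict[str, str]:
    """Return a copy of row with the mutations applied in order."""
    result = dict(row)
    for m in mutations:
        if m.isDelete:
            # Delete the cell; a missing column is ignored (assumption).
            result.pop(m.columnName, None)
        else:
            # Put: create the cell or overwrite its value.
            result[m.columnName] = m.value
    return result


row = {"info:name": "alice", "info:email": "a@example.com"}
batch = [
    Mutation("info:name", "bob"),            # put: overwrite a cell
    Mutation("info:email", isDelete=True),   # delete: value is unused
    Mutation("info:age", "30"),              # put: add a new cell
]
print(apply_mutations(row, batch))
# {'info:name': 'bob', 'info:age': '30'}
```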

Method Spec

This section contains the definitions of the methods we want to expose to clients. If you have a method to propose, add it to the appropriate subsection below, along with a comment explaining why the method is needed.

Meta-info methods

Get Table Names

Returns a list of table names.

list<string> getTableNames()

Get Column Descriptors

Return a list of column descriptors for a given table.

list<columnDescriptor> getColumnDescriptors(string tableName)

Get Table Regions

Return a list of (region start key, host) pairs for the regions that make up a table.

list<regionDescriptor> getTableRegions(string tableName)
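
One use for the result of getTableRegions is finding which host serves a given row. The sketch below assumes that the regions are sorted by startKey, that the first region has an empty startKey, and that each region covers the keys from its own startKey up to the next region's startKey. RegionDescriptor and host_for_row are illustrative names, not part of the API, and Python string comparison stands in for the byte-wise comparison a real table would use.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class RegionDescriptor:
    """Python mirror of the Thrift regionDescriptor struct (illustrative)."""
    startKey: str
    host: str


def host_for_row(regions: List[RegionDescriptor], row: str) -> str:
    """Return the host of the region whose key range contains row.

    Assumes regions is non-empty, sorted by startKey, and that the
    first region's startKey is the empty string.
    """
    start_keys = [r.startKey for r in regions]
    # Index of the last region whose startKey is <= row.
    index = bisect_right(start_keys, row) - 1
    return regions[index].host


regions = [
    RegionDescriptor("", "host-a"),
    RegionDescriptor("g", "host-b"),
    RegionDescriptor("p", "host-c"),
]
print(host_for_row(regions, "apple"))  # host-a
print(host_for_row(regions, "g"))      # host-b
print(host_for_row(regions, "zebra"))  # host-c
```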

Row methods

Get Row

Retrieve a map<column name, value> for a given row, with the usual options (timestamp, selected columns).

map<string, string> getRow(string tableName, string row),
map<string, string> getRow(string tableName, string row, i64 timestamp),
map<string, string> getRow(string tableName, string row, list<string> columns),
map<string, string> getRow(string tableName, string row, list<string> columns, i64 timestamp)
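
The four getRow signatures differ only in two optional filters. The following Python sketch models those filters against an in-memory table rather than a real server. It assumes that each cell keeps a list of (timestamp, value) versions and that a timestamp argument means "the newest version at or before this timestamp", which the text above does not specify. The name get_row and the table layout are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# row key -> column name -> list of (timestamp, value) versions
Table = Dict[str, Dict[str, List[Tuple[int, str]]]]


def get_row(table: Table, row: str,
            columns: Optional[List[str]] = None,
            timestamp: Optional[int] = None) -> Dict[str, str]:
    """Return column -> value for one row, honouring the optional filters."""
    result: Dict[str, str] = {}
    for column, versions in table.get(row, {}).items():
        if columns is not None and column not in columns:
            continue  # the caller asked for a subset of columns
        # Assumption: a timestamp selects versions at or before it.
        visible = [(ts, value) for ts, value in versions
                   if timestamp is None or ts <= timestamp]
        if visible:
            # Keep the newest of the visible versions.
            result[column] = max(visible, key=lambda pair: pair[0])[1]
    return result


table = {
    "row1": {
        "info:name": [(1, "alice"), (5, "bob")],
        "info:age": [(3, "30")],
    }
}
print(get_row(table, "row1"))
# {'info:name': 'bob', 'info:age': '30'}
print(get_row(table, "row1", timestamp=2))
# {'info:name': 'alice'}
print(get_row(table, "row1", columns=["info:age"]))
# {'info:age': '30'}
```

Python has no method overloading, so the sketch expresses the optional parameters as keyword arguments that default to None.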

Mutate Row (Put)

Send a series of mutation commands (put, delete) to the table.

void mutateRow(string tableName, string row, list<mutation> mutations),
void mutateRow(string tableName, string row, list<mutation> mutations, i64 timestamp)

Delete Row

Delete an entire row.

void deleteRow(string tableName, string row),
void deleteRow(string tableName, string row, i64 timestamp)

Scanner methods

Open Scanner

Create a scanner for a table with some options.

i32 openScanner(string tableName, string startRow),
i32 openScanner(string tableName, string startRow, string stopRow),
i32 openScanner(string tableName, string startRow, string stopRow, list<string> columns)

Get Scanner Results

Retrieve one or more records from the scanner at once.

map<string, string> getScannerResult(i32 scannerID)

Close Scanner

Close a scanner.

void closeScanner(i32 scannerID)
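
The three scanner calls form an open, fetch, close lifecycle keyed by an integer ID. The sketch below uses an in-memory table as a stand-in for HBase. The class name ScannerRegistry is hypothetical, and three behaviours are assumptions because the spec above leaves them open: stopRow is treated as exclusive, each fetch returns a single row, and an empty map signals that the scan is finished.

```python
from typing import Dict, Iterator, Optional, Tuple

Row = Dict[str, str]


class ScannerRegistry:
    """In-memory stand-in for the server side of the scanner methods."""

    def __init__(self, rows: Dict[str, Row]):
        self._rows = rows
        self._scanners: Dict[int, Iterator[Tuple[str, Row]]] = {}
        self._next_id = 0

    def open_scanner(self, start_row: str, stop_row: Optional[str] = None) -> int:
        # Select rows in key order; stop_row is exclusive (assumption).
        selected = [(key, self._rows[key]) for key in sorted(self._rows)
                    if key >= start_row and (stop_row is None or key < stop_row)]
        scanner_id = self._next_id
        self._next_id += 1
        self._scanners[scanner_id] = iter(selected)
        return scanner_id

    def get_scanner_result(self, scanner_id: int) -> Row:
        # An empty dict signals that the scan is finished (assumption).
        _, row = next(self._scanners[scanner_id], (None, {}))
        return row

    def close_scanner(self, scanner_id: int) -> None:
        # Forget the scanner so its ID can no longer be used.
        del self._scanners[scanner_id]


registry = ScannerRegistry({
    "a": {"c:x": "1"},
    "b": {"c:x": "2"},
    "c": {"c:x": "3"},
})
scanner_id = registry.open_scanner("a", "c")
results = []
while True:
    row = registry.get_scanner_result(scanner_id)
    if not row:
        break
    results.append(row)
registry.close_scanner(scanner_id)
print(results)
# [{'c:x': '1'}, {'c:x': '2'}]
```

Closing the scanner in the client matters in this model because the registry holds per-scanner state until close_scanner is called.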

Comments on API design

Bryan, I think you may want to use the "binary" type instead of "string" to avoid any possible encoding issues. "binary" is a raw byte[] in Java. -- Chad

Bryan, I've been working on various Thrift servers and clients (mainly C++ and Ruby) at Powerset and will be taking a look at creating a Thrift server implementation of this API. Is this something on which you are actively working? If not, I'll take a look at the REST server code as a model for hooking up the Thrift API to HBase. Other than that, without knowing a bit more about the HBase API, the Thrift API looks good. One thing that we'll need to add to it is Exception declarations. -- DavidSimpson

Using the thrift API presumes that the system is already running the thrift servlet. How does one get that started? -- JimRWilson 2008-04-01 15:56:33

Jim, to start a thrift server, do ${HBASE_HOME}/bin/hbase thrift start or ${HBASE_HOME}/bin/start-daemon.sh start thrift if you want logs captured and a pid file written (FYI, this kind of question belongs over in the hbase mailing list -- no one reads the wiki; smile).

Hbase/ThriftApi (last edited 2009-11-05 22:32:57 by LarsFrancke)