Accessing HBase trunk from JRuby 1.1.x


This page is mostly obsolete: jirb (the JRuby IRB), with HBase additions, has become the native HBase shell. See HBase Shell.


This page describes the process of connecting to HBase from JRuby. The code mostly follows the "Can someone give an example of basic API-usage going against hbase?" example listed in the HBase FAQ.

Start HBase

bin/hbase master start

Get JRuby 1.1.1

curl -O http://repo1.maven.org/maven2/org/jruby/jruby-complete/1.1.1/jruby-complete-1.1.1.jar

-or-

wget http://repo1.maven.org/maven2/org/jruby/jruby-complete/1.1.1/jruby-complete-1.1.1.jar

Set your CLASSPATH

export CLASSPATH=`java -jar jruby-complete-1.1.1.jar -e "puts Dir.glob('{.,build,lib}/*.jar').join(':')"`
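
The one-liner above asks JRuby to glob for jars in the current directory, build/ and lib/, and to join the results with ':'. As a rough sketch, the same logic can be written as a standalone Ruby helper. The name `jar_classpath` and its parameters are made up for illustration; unlike the one-liner, it sorts the paths for a stable order and defaults to the platform's path separator rather than a hard-coded ':'.

```ruby
# Hypothetical helper mirroring the CLASSPATH one-liner: collect every *.jar
# found directly inside the given directories and join the paths.
def jar_classpath(dirs = %w[. build lib], separator = File::PATH_SEPARATOR)
  pattern = "{#{dirs.join(',')}}/*.jar"   # e.g. "{.,build,lib}/*.jar"
  Dir.glob(pattern).sort.join(separator)  # sorted for a predictable order
end

# Print the classpath for the current directory (empty if no jars are found).
puts jar_classpath
```

Run from the base of the HBase checkout, the printed string should be usable as the CLASSPATH value in the same way as the output of the one-liner.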

The Code

Once the CLASSPATH is set, it's simply a matter of translating the Java on the FAQ page to JRuby.

The code below creates a table, puts some data in it, fetches that data back out and then deletes the table.

Save the example as "helloworld.rb" and run it with JRuby from the base of the HBase svn trunk:

java org.jruby.Main helloworld.rb

or open an interactive 'jirb' shell and paste it in:

java org.jruby.Main -S jirb

include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.HConstants
import org.apache.hadoop.hbase.HTableDescriptor
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.io.BatchUpdate
import org.apache.hadoop.io.Text

conf = HBaseConfiguration.new
tablename = "test"
tablename_text = Text.new(tablename)
desc = HTableDescriptor.new(tablename)
desc.addFamily(HColumnDescriptor.new("content:"))
desc.addFamily(HColumnDescriptor.new("anchor:"))
admin = HBaseAdmin.new(conf)
if admin.tableExists(tablename_text)
  admin.disableTable(tablename_text)
  admin.deleteTable(tablename_text) 
end
admin.createTable(desc)
tables = admin.listTables
table = HTable.new(conf, tablename_text)
row = Text.new("row_x")
b = BatchUpdate.new(row)
b.put(Text.new("content:"), "some content")
table.commit(b)
data = table.get(row, Text.new("content:"))
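# Decode the returned bytes as UTF-8 (java.lang.String needs an explicit .new in JRuby)
data_str = java.lang.String.new(data, "UTF8")
puts "The fetched row contains the value '#{data_str}'"
# As in the tableExists branch above, disable the table before deleting it
admin.disableTable(desc.getName)
admin.deleteTable(desc.getName)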
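
The script above only works under JRuby; started with a plain 'ruby' interpreter, the `include Java` line fails straight away. A small guard along the following lines could make that failure easier to read. This is only a sketch: the method name `running_on_jruby?` is made up, and it relies on JRuby reporting "java" as its RUBY_PLATFORM.

```ruby
# Hypothetical guard: detect JRuby before touching any Java classes.
def running_on_jruby?
  RUBY_PLATFORM == "java"
end

if running_on_jruby?
  require "java" # load JRuby's Java integration
  puts "JRuby #{JRUBY_VERSION} detected; Java classes can be imported."
else
  warn "Not running under JRuby; start the script with 'java org.jruby.Main'."
end
```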

Scanning Script

Here's a bit of script to return rows between the start of table TestTable and an end row of '0000000003':

{{{
c = HBaseConfiguration.new()
t = HTable.new(c, "TestTable")
columns = ["info:"].to_java(java.lang.String)
s = t.getScanner(columns, "", "0000000003", HConstants::LATEST_TIMESTAMP).iterator()
while s.hasNext() do
  # Assumed loop body: advance the iterator and print each row
  puts s.next()
end
}}}

Save it as /tmp/scan.rb and run it through the HBase shell:

$ ./bin/hbase shell /tmp/scan.rb

This only works against TRUNK as of 06/05/2008.
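
The row range used by the scanning script can be pictured in plain Ruby, without HBase. The sketch below assumes that an empty start row means "from the first row" and that the end row itself is not returned; both are assumptions about the scanner's behaviour rather than something this page states. The helper name `rows_in_range` and the sample keys are made up.

```ruby
# Pure-Ruby illustration of a start/end row range over sorted row keys.
# Assumptions: the start row is inclusive, the end row is exclusive, and an
# empty start row means "begin at the first row".
def rows_in_range(sorted_keys, start_row, end_row)
  sorted_keys.select do |key|
    (start_row.empty? || key >= start_row) && key < end_row
  end
end

# Zero-padded keys sort the same way as strings and as numbers.
keys = %w[0000000000 0000000001 0000000002 0000000003 0000000004]
p rows_in_range(keys, "", "0000000003")
# prints ["0000000000", "0000000001", "0000000002"]
```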

Hbase/JRuby (last edited 2009-09-20 23:54:42 by localhost)