This document describes how to configure Nutch to use HBase as a backend for GORA. It is based on revision 993857 of the Nutch trunk.

Specify the GORA backend by adding the following property to $NUTCH_HOME/conf/nutch-site.xml:

<property>
 <name>storage.data.store.class</name>
 <value>org.apache.gora.hbase.store.HBaseStore</value>
 <description>Default class for storing data</description>
</property>

Note: HBaseStore is currently NOT thread-safe, so every process must be configured to run single-threaded (for example, set the number of fetcher threads to 1). Work to make it thread-safe is in progress.
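
As a minimal sketch of the single-threaded setting mentioned in the note, the fragment below assumes that the standard Nutch property fetcher.threads.fetch (a name not given in this document) is the one that controls the number of fetcher threads. It would go in the same configuration file as the storage property. Check nutch-default.xml in your checkout to confirm the property name and its default before relying on it.

```xml
<!-- Assumption: fetcher.threads.fetch is the property that sets the
     fetcher thread count in this Nutch revision. -->
<property>
 <name>fetcher.threads.fetch</name>
 <value>1</value>
 <description>Run the fetcher with a single thread while HBaseStore is not thread-safe</description>
</property>
```

Other jobs may have their own thread or task settings, so this covers only the fetcher case named in the note.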

You should then be able to use it. To try it out, go to $NUTCH_HOME/runtime/local/bin and run:

  nutch inject /someseedDir
  nutch readdb

If something goes wrong, you should find more details in the log file at $NUTCH_HOME/runtime/local/logs/hadoop.log.

GORA_HBase (last edited 2011-10-31 12:15:48 by FerdyGalema)