Differences between revisions 8 and 9
Revision 8 as of 2008-02-17 11:12:04
Size: 4990
Comment:
Revision 9 as of 2009-09-20 23:54:24
Size: 5038
Editor: localhost
Comment: converted to 1.6 markup
Line 9: Line 9:
   * [http://www.zlib.net zlib] [http://hadoop.apache.org/core/api/org/apache/hadoop/io/compress/CompressionCodec.html compression codec] - Reworked zlib codec using nio's direct-buffers which gives us 60%-70% speedup. (more details [http://issues.apache.org/jira/browse/HADOOP-538 here])
   * [http://www.oberhumer.com/opensource/lzo/ lzo] [http://hadoop.apache.org/core/api/org/apache/hadoop/io/compress/CompressionCodec.htmlapi/org/apache/hadoop/io/compress/CompressionCodec.html compression codec] - Implemented due to lack of java bindings for lzo . (more details [http://issues.apache.org/jira/browse/HADOOP-851 here])
   * [[http://www.zlib.net|zlib]] [[http://hadoop.apache.org/core/api/org/apache/hadoop/io/compress/CompressionCodec.html|compression codec]] - Reworked zlib codec using nio's direct-buffers which gives us 60%-70% speedup. (more details [[http://issues.apache.org/jira/browse/HADOOP-538|here]])
   * [[http://www.oberhumer.com/opensource/lzo/|lzo]] [[http://hadoop.apache.org/core/api/org/apache/hadoop/io/compress/CompressionCodec.htmlapi/org/apache/hadoop/io/compress/CompressionCodec.html|compression codec]] - Implemented due to lack of java bindings for lzo . (more details [[http://issues.apache.org/jira/browse/HADOOP-851|here]])
Line 15: Line 15:
   * Take a look at the [#Platforms supported platforms]
   * Either [http://www.apache.org/dyn/closer.cgi/lucene/hadoop/ download] the prebuilt '''32-bit i386-Linux''' native-hadoop libraries (available as part of hadoop distribution in ''lib/native'') or [#BuildNativeHadoop build] them yourself.
   * Take a look at the [[#Platforms|supported platforms]]
   * Either [[http://www.apache.org/dyn/closer.cgi/lucene/hadoop/|download]] the prebuilt '''32-bit i386-Linux''' native-hadoop libraries (available as part of hadoop distribution in ''lib/native'') or [[#BuildNativeHadoop|build]] them yourself.
Line 23: Line 23:
[[BR]] <<BR>>
Line 28: Line 28:
[[BR]] <<BR>>
Line 33: Line 33:
[[Anchor(Platforms)]] <<Anchor(Platforms)>>
Line 36: Line 36:
Native-hadoop library is supported for '''*nix''' platforms only. Unfortunately it is known ''not to work'' on [http://www.cygwin.com Cygwin] and [http://www.apple.com/macosx Mac OS X] and has mainly been used on the Linux platform. [http://wiki.apache.org/hadoop/HowToContribute Patches] from anyone interested in getting them working on Cygwin/MacOSX are welcome!
Native-hadoop library is supported for '''*nix''' platforms only. Unfortunately it is known ''not to work'' on [[http://www.cygwin.com|Cygwin]] and [[http://www.apple.com/macosx|Mac OS X]] and has mainly been used on the Linux platform. [[http://wiki.apache.org/hadoop/HowToContribute|Patches]] from anyone interested in getting them working on Cygwin/MacOSX are welcome!
Line 39: Line 39:
   * [http://www.redhat.com/rhel/ RHEL4]/[http://fedora.redhat.com/ Fedora]
   * [http://www.ubuntu.com Ubuntu]
   * [http://www.gentoo.org Gentoo]
   * [[http://www.redhat.com/rhel/|RHEL4]]/[[http://fedora.redhat.com/|Fedora]]
   * [[http://www.ubuntu.com|Ubuntu]]
   * [[http://www.gentoo.org|Gentoo]]
Line 45: Line 45:
[[Anchor(BuildNativeHadoop)]] <<Anchor(BuildNativeHadoop)>>
Line 48: Line 48:
Native-hadoop library is written in [http://en.wikipedia.org/wiki/ANSI_C ANSI C] and built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straight-forward to build them on any platform with a standards compliant C compiler and the GNU autotools-chain. (See [#Platforms supported platforms])
Native-hadoop library is written in [[http://en.wikipedia.org/wiki/ANSI_C|ANSI C]] and built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straight-forward to build them on any platform with a standards compliant C compiler and the GNU autotools-chain. (See [[#Platforms|supported platforms]])
Line 51: Line 51:
   * C compiler (e.g. [http://gcc.gnu.org/ GNU C Compiler]
   * C compiler (e.g. [[http://gcc.gnu.org/|GNU C Compiler]]
Line 53: Line 53:
      * [http://www.gnu.org/software/autoconf/ autoconf]
      * [http://www.gnu.org/software/automake/ automake]
      * [http://www.gnu.org/software/libtool/ libtool]
   * [http://www.zlib.net/ zlib-development package] (stable version >= 1.2.0)
   * [http://www.oberhumer.com/opensource/lzo/ lzo-development package] (stable version >= 2.0)
      * [[http://www.gnu.org/software/autoconf/|autoconf]]
      * [[http://www.gnu.org/software/automake/|automake]]
      * [[http://www.gnu.org/software/libtool/|libtool]]
   * [[http://www.zlib.net/|zlib-development package]] (stable version >= 1.2.0)
   * [[http://www.oberhumer.com/opensource/lzo/|lzo-development package]] (stable version >= 2.0)
Line 60: Line 60:
[[BR]] <<BR>>
Line 66: Line 66:
[[BR]] <<BR>>

Overview

Hadoop has native implementations of certain components, both for performance reasons and because Java implementations are not available. These components are provided in a single dynamically linked library, the native-hadoop library; on *nix platforms it is libhadoop.so. This section describes how to use the native libraries and how to build them.

Native Hadoop Libraries

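Hadoop has the following native components:

  • zlib compression codec - Reworked zlib codec using nio's direct buffers, which gives a 60%-70% speedup. (more details: HADOOP-538)

  • lzo compression codec - Implemented because Java bindings for lzo are lacking. (more details: HADOOP-851)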

Usage

It is fairly simple to use the native-hadoop libraries:

  • Take a look at the supported platforms

  • Either download the prebuilt 32-bit i386-Linux native-hadoop libraries (available as part of hadoop distribution in lib/native) or build them yourself.

  • Ensure that zlib (1.2 or later) and/or lzo (2.0 or later) packages are installed for your platform, depending on your needs.

That's it!

The bin/hadoop script ensures that the native-hadoop library is on the library path via the system property -Djava.library.path= (setting the LD_LIBRARY_PATH variable is an alternative, but the former is recommended).
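
As a rough way to see what the JVM would find on that path, the sketch below walks the directories in java.library.path and looks for the platform-specific file name of the library (libhadoop.so on Linux). The class NativeLibraryLocator and its find method are hypothetical names used for illustration only; they are not part of Hadoop.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hadoop.
final class NativeLibraryLocator {

    // Returns the first file on libraryPath that matches the platform-specific
    // name of libName (e.g. "hadoop" -> "libhadoop.so" on Linux), or null.
    static File find(String libraryPath, String libName) {
        String fileName = System.mapLibraryName(libName);
        String separator = Pattern.quote(File.pathSeparator);
        for (String dir : libraryPath.split(separator)) {
            if (dir.isEmpty()) {
                continue; // skip empty path entries
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        File found = find(path, "hadoop");
        System.out.println(found == null
                ? "native-hadoop library not found on java.library.path"
                : "found " + found.getAbsolutePath());
    }
}
```

Finding the file only shows that it is on the path; whether it actually loads still depends on the platform and on the 32/64-bit match described later on this page.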

To check that everything went well, look in the hadoop log files for:
{{{
DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
INFO util.NativeCodeLoader - Loaded the native-hadoop library
}}}

If something goes wrong, you will instead see:
{{{
INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
}}}
If so, recheck the above steps.
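
The log check can also be scripted. The following is a minimal sketch that classifies a list of log lines by looking for the two NativeCodeLoader messages quoted above; the class name NativeLoadLogCheck and the Status enum are made up for this example, and the matching relies only on the message text shown on this page.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop.
final class NativeLoadLogCheck {

    enum Status { LOADED, FELL_BACK, UNKNOWN }

    // Scans the log lines and reports the last NativeCodeLoader outcome seen.
    static Status classify(List<String> logLines) {
        Status status = Status.UNKNOWN;
        for (String line : logLines) {
            if (line.contains("Loaded the native-hadoop library")) {
                status = Status.LOADED;
            } else if (line.contains("Unable to load native-hadoop library")) {
                status = Status.FELL_BACK;
            }
        }
        return status;
    }
}
```

A result of UNKNOWN simply means that neither message appeared in the lines that were scanned.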

Supported Platforms

The native-hadoop library is supported on *nix platforms only. Unfortunately, it is known not to work on Cygwin and Mac OS X, and it has mainly been used on the Linux platform. Patches from anyone interested in getting it working on Cygwin/Mac OS X are welcome!

It has been tested on the following Linux distributions:

  • RHEL4/Fedora

  • Ubuntu

  • Gentoo

On all of the above platforms, a 32-bit native-hadoop library works with a 32-bit JVM, and a 64-bit native-hadoop library works with a 64-bit JVM.

Building Native Hadoop Libraries

The native-hadoop library is written in ANSI C and built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straightforward to build it on any platform with a standards-compliant C compiler and the GNU autotools-chain. (See supported platforms.)

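In particular, the packages you need on the target platform are:

  • C compiler (e.g. GNU C Compiler)

  • GNU autotools-chain: autoconf, automake, libtool

  • zlib-development package (stable version >= 1.2.0)

  • lzo-development package (stable version >= 2.0)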

Once you have the prerequisites, use the standard build.xml and pass the compile.native flag (set to true) to build the native-hadoop library:
{{{
$ ant -Dcompile.native=true <target>
}}}
The native-hadoop library is not built by default since not everyone is interested in using it.

That's it! You should see the newly built native-hadoop library in:
{{{
build/native/<platform>/lib
}}}
where <platform> is a combination of the system properties {os.name}-{os.arch}-{sun.arch.data.model}; for example, Linux-i386-32.
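
To see which <platform> value applies to a given JVM, the three system properties can be read and joined directly. This is a minimal sketch under the assumption that the directory name is a plain hyphen-joined concatenation, as described above; NativePlatformName is a hypothetical class name, and sun.arch.data.model is a JVM-specific property that may be absent on some JVMs.

```java
// Hypothetical helper for illustration; not part of Hadoop.
final class NativePlatformName {

    // Joins the three values in the order {os.name}-{os.arch}-{sun.arch.data.model}.
    static String platformName(String osName, String osArch, String dataModel) {
        return osName + "-" + osArch + "-" + dataModel;
    }

    public static void main(String[] args) {
        String platform = platformName(
                System.getProperty("os.name"),
                System.getProperty("os.arch"),
                // May be null on JVMs that do not define this property.
                System.getProperty("sun.arch.data.model"));
        System.out.println("build/native/" + platform + "/lib");
    }
}
```

Running main with the same JVM that runs Hadoop prints the directory that the build output described above would be expected to use.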

Notes:

  • Both the zlib and lzo development packages must be present on the target platform to build the native-hadoop library; for deployment, however, it is sufficient to install only zlib or only lzo if you wish to use just one of them.

  • The zlib/lzo libraries must match the JVM on the target platform, both for building and for deploying the native-hadoop library: 32-bit libraries for a 32-bit JVM, 64-bit libraries for a 64-bit JVM.
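
One way to check the 32/64-bit match on Linux is to inspect the ELF header of a shared library: the fifth byte (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects. The sketch below compares that value with the JVM's sun.arch.data.model property. ElfBitness and its methods are hypothetical names for this example, and the check assumes the library is an ELF file, as on Linux.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Hadoop.
final class ElfBitness {

    // Returns 32 or 64 based on the ELF EI_CLASS byte, or -1 if the header
    // is too short, is not ELF, or has an unknown class.
    static int elfBits(byte[] header) {
        if (header.length < 5 || header[0] != 0x7f
                || header[1] != 'E' || header[2] != 'L' || header[3] != 'F') {
            return -1;
        }
        switch (header[4]) {
            case 1: return 32;
            case 2: return 64;
            default: return -1;
        }
    }

    // True when the library's bitness equals the JVM data model ("32" or "64").
    static boolean matchesJvm(byte[] header, String dataModel) {
        int bits = elfBits(header);
        return bits != -1 && String.valueOf(bits).equals(dataModel);
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[5];
        int read;
        try (InputStream in = new FileInputStream(args[0])) {
            read = in.read(header);
        }
        byte[] got = Arrays.copyOf(header, Math.max(read, 0));
        String model = System.getProperty("sun.arch.data.model");
        System.out.println(args[0] + ": " + elfBits(got) + "-bit, JVM data model "
                + model + ", match=" + matchesJvm(got, model));
    }
}
```

Pass the path of a shared library (for example the built libhadoop.so, or the installed zlib or lzo library) as the single argument.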

NativeHadoop (last edited 2009-09-20 23:54:24 by localhost)