Differences between revisions 7 and 8
Revision 7 as of 2009-05-06 20:54:26
Size: 980
Editor: stack
Comment: Bloom filters will be back in 0.21 (after chatting w/ other lads over in hbase)
Revision 8 as of 2009-09-20 23:54:36
Size: 984
Editor: localhost
Comment: converted to 1.6 markup

Current

Bloom filters did not work reliably in HBase 0.19.x and are a no-op in 0.20.x. They are expected to reappear in HBase 0.21.x.

Historically

Bloom filters can be enabled on a per-column-family basis in HBase. Passing true for the bloom filter parameter of the HColumnDescriptor constructor, or calling HColumnDescriptor.setBloomFilter(true), adds a bloom filter as defined by Bloom in 1970 (http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal) to the column family.

This can be done either at table creation time or, for an existing table, by disabling the table and modifying the column through the HBaseAdmin.modifyColumn API.
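For readers unfamiliar with the structure being enabled, the following is a minimal, self-contained sketch of a Bloom filter in Java. It is not the HBase implementation: the class name SimpleBloomFilter, the use of java.util.BitSet, and the double-hashing scheme built from Arrays.hashCode and FNV-1a are all assumptions made for illustration. It shows the defining property that a key which was added is always reported as possibly present, while a key that was never added is usually, but not always, reported as absent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative Bloom filter (hypothetical class, not HBase code). */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int vectorSize; // number of bits in the filter (m)
    private final int numHashes;  // number of hash functions (k)

    SimpleBloomFilter(int vectorSize, int numHashes) {
        if (vectorSize <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("vectorSize and numHashes must be positive");
        }
        this.vectorSize = vectorSize;
        this.numHashes = numHashes;
        this.bits = new BitSet(vectorSize);
    }

    /** 32-bit FNV-1a hash, used here as the second base hash. */
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }

    /** Bit position of the i-th hash, derived from two base hashes (double hashing). */
    private int position(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = fnv1a(key) | 1; // force an odd step so positions tend to differ
        return Math.floorMod(h1 + i * h2, vectorSize);
    }

    /** Records the key by setting one bit per hash function. */
    void add(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** Returns false only if the key was definitely never added. */
    boolean mightContain(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 4);
        filter.add("row-1".getBytes(StandardCharsets.UTF_8));
        System.out.println(filter.mightContain("row-1".getBytes(StandardCharsets.UTF_8)));
        System.out.println(filter.mightContain("row-2".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A store can consult such a filter before a lookup and skip files for which mightContain returns false, at the cost of occasional false positives.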

Bloom filters are created using the mechanism specified by Broder and Mitzenmacher (http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/BloomFilterSurvey.pdf), which computes the vector size using 4 hash functions.
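As a rough illustration of how a vector size can follow from a fixed hash count, the sketch below uses the standard analysis: with n entries, m bits and k hash functions, the false positive rate is approximately (1 - e^(-kn/m))^k, which is minimized when k = (m/n) ln 2, so m = k * n / ln 2. The class name BloomSizing, the method names and the rounding with Math.ceil are assumptions; the exact computation HBase performs is not shown on this page.

```java
/** Illustrative sizing helper (hypothetical class, not HBase code). */
final class BloomSizing {
    /** Number of hash functions, fixed at 4 as stated above. */
    static final int NUM_HASHES = 4;

    /** Bits needed so that NUM_HASHES is the optimal hash count for the expected entries. */
    static int vectorSize(int expectedEntries) {
        if (expectedEntries <= 0) {
            throw new IllegalArgumentException("expectedEntries must be positive");
        }
        return (int) Math.ceil(NUM_HASHES * (double) expectedEntries / Math.log(2.0));
    }

    /** Approximate false positive rate: (1 - e^(-k*n/m))^k. */
    static double falsePositiveRate(int vectorSize, int entries) {
        double fractionSet = 1.0 - Math.exp(-(double) NUM_HASHES * entries / vectorSize);
        return Math.pow(fractionSet, NUM_HASHES);
    }

    public static void main(String[] args) {
        int entries = 1000;
        int bits = vectorSize(entries);
        System.out.println("bits = " + bits);
        System.out.println("false positive rate = " + falsePositiveRate(bits, entries));
    }
}
```

Under these assumptions the filter needs about 5.77 bits per entry, and the false positive rate at the design load is close to (1/2)^4 = 6.25%.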

JUnit tests for bloom filters can be found in hbase.regionserver.TestBloomFilters.

Hbase/UsingBloomFilters (last edited 2009-09-20 23:54:36 by localhost)