From: "Doug Judd" <d...@zvents.com>
Date: Thu, 1 May 2008 09:19:14 -0700
Local: Thurs, May 1 2008 12:19 pm
Subject: Re: [hypertable-dev] BloomFilters

Hi Naveen,

This looks fantastic!  I think the best thing to do is to make the size of
the bloom filter variable, depending on the number of keys that it covers.
You can imagine that an access group that stores columns of 4-byte integers
is going to have many more keys than an access group that contains crawl
data, for example.  That would optimize storage and allow for a consistent
error rate.
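
A rough sketch of the sizing idea, using the standard Bloom filter formulas
rather than anything taken from the design spec. The function name and the
key counts in the usage note are made up for illustration.

```python
import math

def bloom_size(num_keys, error_rate):
    """Return (bits, num_hashes) for a standard Bloom filter.

    Hypothetical helper: sizes the filter from the number of keys it
    covers so that the false-positive rate stays roughly constant.
    """
    if num_keys <= 0:
        raise ValueError("num_keys must be positive")
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    # Optimal bit count: m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(-num_keys * math.log(error_rate) / (math.log(2) ** 2))
    # Optimal hash count: k = (m / n) * ln 2, at least one
    num_hashes = max(1, round(bits / num_keys * math.log(2)))
    return bits, num_hashes
```

Calling bloom_size(10_000_000, 0.01) for a dense integer column and
bloom_size(50_000, 0.01) for a crawl-data store gives filters of very
different sizes but about the same number of bits per key, which is what
keeps the error rate consistent.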

You might want to make the bloom filter configurable at the schema level.  I
would opt for having them on by default, but there might be a need to
disable them in certain scenarios.  An error rate or bits/key tuning
option might be useful as well.
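
To make the tuning options concrete, here is a minimal sketch of what
per-schema settings could look like. Everything in it is an assumption for
illustration: the class, its field names, and the 1% default error rate are
not part of Hypertable or the design spec. It only shows that an error rate
and a bits/key figure are two ways of expressing the same knob.

```python
import math
from dataclasses import dataclass
from typing import Optional

LN2_SQ = math.log(2) ** 2

@dataclass
class BloomFilterOptions:
    """Hypothetical schema-level settings; on by default."""
    enabled: bool = True
    error_rate: Optional[float] = None    # target false-positive rate
    bits_per_key: Optional[float] = None  # alternative way to tune

    def resolved_bits_per_key(self) -> float:
        """Bits to allocate per key; 0.0 when the filter is disabled."""
        if not self.enabled:
            return 0.0
        if self.error_rate is not None and self.bits_per_key is not None:
            raise ValueError("set error_rate or bits_per_key, not both")
        if self.bits_per_key is not None:
            return self.bits_per_key
        # Assumed default of 1% when nothing is specified.
        rate = self.error_rate if self.error_rate is not None else 0.01
        return -math.log(rate) / LN2_SQ

def error_rate_for(bits_per_key: float) -> float:
    """Approximate false-positive rate with an optimal hash count."""
    return math.exp(-bits_per_key * LN2_SQ)
```

With these formulas, 10 bits per key corresponds to a false-positive rate a
little under 1%, and each additional bit per key multiplies the rate by
roughly 0.62.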

- Doug

On Thu, May 1, 2008 at 2:14 AM, Naveen Koorakula <nave...@gmail.com> wrote:
> Hello Doug, Luke,
> I just wrote up a design spec for using BloomFilters in CellStores to
> reduce disk accesses when the query specifies the key(s) being looked up.
> Could you please take a look and send me any comments or suggestions?

> http://code.google.com/p/hypertable/wiki/BloomFilters

> One decision I left open is whether the usage of bloom filters should be
> configurable at a schema level. Any opinions ?

> Thanks,
> --Naveen

