Comparison with bitcask

David Yu

Oct 18, 2010, 6:07:05 AM
to kr...@googlegroups.com
Hi,

A post from Alex Feinberg from quora:

Bitcask is effectively "log structured linear hashing" and is used by Riak (another Dynamo-like key/value store). It uses the operating system page cache rather than a built-in cache, thus being friendly to garbage collection in Erlang (the language that Riak is implemented in):

http://bitbucket.org/basho/bitcask

Krati is a similar concept to Bitcask (again, append-only linear hashing), but is implemented on the JVM and available as a storage engine for Voldemort:

http://sna-projects.com/krati/


Bitcask requires all the keys to fit in memory.
MemorySegment requires all keys and values to fit in memory.
Does Krati also require all keys to fit in memory when using MappedSegment/ChannelSegment?

Thanks

--
When the cat is away, the mouse is alone.
- David Yu

Avatar

Oct 19, 2010, 6:40:49 AM
to Krati
Hi David,

This depends on which DataStore and Segment you are using.

For both StaticDataStore and DynamicDataStore, if MemorySegment is
used, keys and values must fit into main memory.

For IndexedDataStore, you can specify which SegmentFactory is used for
index and store separately. If MemorySegmentFactory is used for index
and WriteBufferSegmentFactory for store, then only keys are required
to fit into main memory.
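
To make the "only keys in memory" arrangement concrete, here is a minimal sketch in Python. It is not Krati's implementation or API: the class name `KeysInMemoryStore` and its methods are hypothetical, and it only illustrates the general idea of an in-memory index that points into an append-only value file on disk.

```python
import os
import tempfile


class KeysInMemoryStore:
    """Toy store (hypothetical, not Krati): keys and offsets in RAM, values on disk."""

    def __init__(self, path):
        # key -> (offset, length). This dict is the part that must fit in memory.
        self._index = {}
        # Append-only value log; "a+b" allows reads and forces writes to the end.
        self._file = open(path, "a+b")

    def put(self, key, value):
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._file.flush()
        # An overwrite simply repoints the key at the newest record.
        self._index[key] = (offset, len(value))

    def get(self, key):
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "values.log")
store = KeysInMemoryStore(path)
store.put(b"k1", b"hello")
store.put(b"k2", b"world!")
store.put(b"k1", b"HELLO")  # appends a new record; the old bytes stay in the log
```

Memory use in this sketch grows with the number of keys, not with the total size of the values, which is the property described above. A real store would also need compaction to reclaim the space held by stale records.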

Thanks.

jingwei

David Yu

Oct 19, 2010, 10:50:20 AM
to kr...@googlegroups.com
Thanks for the explanation.
You mentioned in the other thread that when a mapped segment is used, the dataset should not exceed 2x the physical memory for good performance. Is there a comparable ratio for the IndexedDataStore approach you just described?

Thanks

Avatar

Oct 19, 2010, 11:14:42 AM
to Krati
Hi David,

It is difficult to give a meaningful ratio because read access may
show locality. The 2x figure for memory-mapped segments is merely an
observation for purely random reads and writes. In general, the more
locality the reads and writes show, the larger the ratio can be.

It is always best to measure the actual ratio for your own
application.
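
As a rough illustration of why locality matters, the sketch below simulates a cache that holds half of the dataset (a dataset of 2x "memory") and compares hit rates for uniform random access against a skewed pattern. The LRU cache is only an assumed stand-in for the OS page cache, and the 90/10 hot-key split and all sizes are made-up parameters, not measurements from Krati.

```python
import random
from collections import OrderedDict


def hit_rate(keys, cache_size):
    """Fraction of accesses served from an LRU cache of the given size."""
    cache = OrderedDict()
    hits = 0
    for k in keys:
        if k in cache:
            hits += 1
            cache.move_to_end(k)           # mark as most recently used
        else:
            cache[k] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(keys)


rng = random.Random(42)          # fixed seed so runs are repeatable
dataset_size = 10_000
cache_size = dataset_size // 2   # dataset is 2x the cache ("memory")
n_accesses = 200_000

# Purely random access: every key is equally likely.
uniform = [rng.randrange(dataset_size) for _ in range(n_accesses)]

# Assumed skew: 90% of accesses go to a hot 10% of the keys.
hot = dataset_size // 10
skewed = [
    rng.randrange(hot) if rng.random() < 0.9 else rng.randrange(dataset_size)
    for _ in range(n_accesses)
]

uniform_rate = hit_rate(uniform, cache_size)
skewed_rate = hit_rate(skewed, cache_size)
print(f"uniform: {uniform_rate:.2f}, skewed: {skewed_rate:.2f}")
```

With uniform access the hit rate sits near the cache-to-dataset ratio, while the skewed pattern is served mostly from cache, so a workload with locality can tolerate a larger dataset-to-memory ratio. Replaying a trace of your own application's keys through something like this is one way to estimate the ratio before benchmarking the real store.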

Thanks.

-jingwei
