Performance: Krati vs HashMap


Otis

Oct 16, 2010, 2:46:23 PM
to Krati
Hi,

Has anyone measured and compared performance of Krati vs. HashMap?

Thanks,
Otis

Tatu Saloranta

Oct 16, 2010, 3:17:51 PM
to kr...@googlegroups.com
On Sat, Oct 16, 2010 at 11:46 AM, Otis <otis.gos...@gmail.com> wrote:
> Hi,
>
> Has anyone measured and compared performance of Krati vs. HashMap?

Would this be a useful comparison, given that HashMap is in-memory and
Krati is disk-backed?

For what it's worth, a project I am working on, tr13, can be used as a
replacement for HashMap -- it is in-memory only (although using
memory-mapped files might be easy to do). And
someone did compare its performance; HashMap is faster, since tr13
optimizes for compactness and very low GC activity (single byte array
or ByteBuffer); see this discussion:
http://groups.google.com/group/ning-tr13-users/browse_thread/thread/a48bfd80a6880748

But tr13 is also a read-only (build-once) data structure, which limits
its use cases. That's why something similar to Krati could be useful,
allowing modifications but still using chunk-based storage (from
which to deserialize as necessary, but outside of the storage engine). But
it looks like the Krati implementation does not allow an in-memory-only mode,
which I would want?
It would be nice if that were available, but it probably would not be
easy to implement.

-+ Tatu +-

Avatar

Oct 17, 2010, 12:39:41 AM
to Krati
We have not done any performance comparison of Krati vs. HashMap, as
persistence backed by disk files is one of the major design goals of
Krati.

Avatar

Oct 17, 2010, 12:46:10 AM
to Krati
Hello Tatu,

Actually, it is relatively simple to support an in-memory-only mode
in Krati. All it takes is to implement a new segment that is not
backed by disk files for persistence. The current MemorySegment class
is a good place to start for this purpose.

Thanks


Otis

Oct 17, 2010, 8:38:54 AM
to Krati
Hi,

I thought the comparison might make sense because, as I understand
it, Krati is smart about in-memory and on-disk storage,
which, I thought, is one of the reasons it's an improvement over BDB
JE. I could have things wrong! :)

Thanks,
Otis
P.S.
Tatu, we have some crazy mailing list overlap - wherever I go I see
your name. :)


Tatu Saloranta

Oct 17, 2010, 1:58:03 PM
to kr...@googlegroups.com
On Sat, Oct 16, 2010 at 9:46 PM, Avatar <jingw...@gmail.com> wrote:
> Hello Tatu,
>
> Actually, it is relatively simple to support the in-memory only mode
> in Krati. All it takes is to implement a new segment that is not
> backed by disk files for persistency. The current MemorySegment class
> is a good place to look at for achieving such a purpose.

Very cool -- if I have time, I would like to explore that possibility.
If so, will send a note.

-+ Tatu +-

Tatu Saloranta

Oct 17, 2010, 1:59:48 PM
to kr...@googlegroups.com
On Sun, Oct 17, 2010 at 5:38 AM, Otis <otis.gos...@gmail.com> wrote:
> Hi,
>
> I thought the comparison might make sense because, it is my
> understanding, Krati is smart about in-memory and on-disk storage
> which, I thought, is one of the reasons its an improvement over BDB
> JE.  I could have things wrong! :)

Good point, if all data fits in memory it would be nice to know
relative overhead.
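
One way such a comparison could be set up is sketched below. This is only a rough timing harness, not a rigorous benchmark: `LookupTimer` and `nanosPerGet` are made-up names, and the lambda passed in is the place where a call to a disk-backed store would go (no Krati call is shown, since its API is not quoted in this thread). For numbers worth trusting, a proper harness such as JMH would be the safer choice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical harness; the class and method names are illustrative only.
final class LookupTimer {

    // Runs one warm-up pass, then `rounds` timed passes of lookups over
    // keys 0..n-1, and returns the average nanoseconds per lookup.
    static double nanosPerGet(IntFunction<byte[]> getter, int n, int rounds) {
        long sink = 0;
        for (int i = 0; i < n; i++) {
            sink += getter.apply(i).length;          // warm-up pass, not timed
        }
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < n; i++) {
                sink += getter.apply(i).length;      // use the result so it is not optimized away
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            throw new IllegalStateException("unreachable; keeps sink live");
        }
        return (double) elapsed / ((long) n * rounds);
    }

    public static void main(String[] args) {
        int n = 100_000;
        Map<Integer, byte[]> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, new byte[16]);
        }
        System.out.printf("HashMap: %.1f ns/get%n", nanosPerGet(i -> map.get(i), n, 20));
        // A second call with a lambda wrapping the other store's lookup
        // would give the figure to compare against.
    }
}
```

Running both stores over the same keys and value sizes, with all data fitting in memory, would give the relative overhead per lookup.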

> Thanks,
> Otis
> P.S.
> Tatu, we have some crazy mailing list overlap - wherever I go I see
> your name. :)

Hehe, I thought something along these lines too. :)

-+ Tatu +-

John Wang

Oct 17, 2010, 5:31:19 PM
to kr...@googlegroups.com
Hi Otis:

    I might have given you misleading information on Krati's memory usage.

    Krati loads the offset array (a fixed-size array containing offsets into the actual storage file) entirely into memory. This is something Lucene does not do (and hence pays for with extra disk seeks), while BDB loads offsets into a BTree. Thus, one area where Krati can be efficient is the offset lookup, which is just an array lookup, compared to Lucene (2 seeks) or BDB BTree navigation.

   Jingwei can chime in with more detail. My apologies for misleading you into thinking Krati is purely in-memory.
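
The idea of an in-memory offset array in front of a storage file can be sketched roughly as follows. This is an illustration of the general technique only, not Krati's code: `OffsetArrayStore` and its methods are hypothetical names, and a `ByteBuffer` stands in for the storage file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; none of these names come from Krati.
final class OffsetArrayStore {
    private final long[] offsets;   // index -> start of the record in `data`, or -1 if empty
    private final ByteBuffer data;  // stand-in for the on-disk storage file

    OffsetArrayStore(int capacity, int dataBytes) {
        offsets = new long[capacity];
        Arrays.fill(offsets, -1L);
        data = ByteBuffer.allocate(dataBytes);
    }

    // Appends a length-prefixed record and remembers where it starts.
    // Overwriting an index simply appends again; the old record becomes garbage.
    void put(int index, byte[] value) {
        offsets[index] = data.position();
        data.putInt(value.length).put(value);
    }

    // One array access locates the record; one read fetches it.
    byte[] get(int index) {
        long offset = offsets[index];
        if (offset < 0) {
            return null;
        }
        int pos = (int) offset;
        byte[] value = new byte[data.getInt(pos)];
        ByteBuffer view = data.duplicate();   // independent position, shared content
        view.position(pos + Integer.BYTES);
        view.get(value);
        return value;
    }

    public static void main(String[] args) {
        OffsetArrayStore store = new OffsetArrayStore(4, 64);
        store.put(2, "krati".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get(2), StandardCharsets.UTF_8)); // prints "krati"
    }
}
```

Finding a record costs a single array access, with no tree to walk and no extra seek to find the offset first.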

-John


Otis

Oct 17, 2010, 11:20:48 PM
to Krati
Hi John,

Oh, I didn't think Krati was completely in-memory. My understanding
was that it's just smart about having some data in memory (+
persisting it to disk), while storing other data on disk.

I think this line from http://sna-projects.com/krati/ made me think
so:
"is memory-resident (or OS page cache resident) yet persistent"

Thanks,
Otis


Avatar

Oct 18, 2010, 10:36:22 AM
to Krati
Hello Otis,

In fact, Krati can be completely memory resident depending on which
type of Segment is used.

If MemorySegment is used, it is completely memory-resident but cannot
scale beyond physical memory. It is good for read and write.

If MappedSegment is used, it is OS page cache resident and good for
reads when the data size is less than 2x physical memory. Beyond that, write
performance is not good.

If ChannelSegment is used, it is OS page cache resident and slow for
both read and write.

If WriteBufferSegment is used, writes are memory-resident and reads are
from OS page cache. This type of segment combined with
IndexedDataStore provides the best performance for large data sets
beyond the size of physical memory. It is good for bootstrapping a
data store, which can be copied to any machine for serving.

Please note that batch-based persistence is always enabled for
writes no matter which Segment type is used.

At the current stage, the Krati data store does not support automatic
segment switching at runtime. This means that if you choose to use
MemorySegment when the JVM starts, you have to stick with it until the JVM
stops.
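
The rules of thumb above can be written down as a small decision helper. This is one reading of the guidance, not part of Krati: `SegmentChooser`, `Kind`, and `choose` are hypothetical names, the thresholds are taken literally from this message, and in practice MemorySegment would also be limited by the JVM heap rather than by raw physical memory. ChannelSegment is left out because it is described as slow for both reads and writes.

```java
// Hypothetical helper; nothing here is part of the Krati API.
final class SegmentChooser {

    // Labels for the segment types discussed in the thread.
    enum Kind { MEMORY, MAPPED, WRITE_BUFFER }

    // Picks a segment type from the data size and the machine's physical memory:
    //   fits in memory                -> MemorySegment (fast reads and writes)
    //   under 2x physical memory      -> MappedSegment (good for reads)
    //   larger than that              -> WriteBufferSegment (suggested with IndexedDataStore)
    static Kind choose(long dataBytes, long physicalMemoryBytes) {
        if (dataBytes < physicalMemoryBytes) {
            return Kind.MEMORY;
        }
        if (dataBytes < 2 * physicalMemoryBytes) {
            return Kind.MAPPED;
        }
        return Kind.WRITE_BUFFER;
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        System.out.println(choose(4 * gb, 16 * gb));   // MEMORY
        System.out.println(choose(24 * gb, 16 * gb));  // MAPPED
        System.out.println(choose(64 * gb, 16 * gb));  // WRITE_BUFFER
    }
}
```

Since segment switching at runtime is not supported, a choice like this would have to be made once, before the store is opened.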

Thanks.

-jingwei
