Performance Benchmarking


Richard Rowlands

Jul 20, 2015, 7:02:56 PM
to ma...@googlegroups.com
Hey there!

We're considering using your library, so I wrote up a little benchmarking program to compare it with Ehcache (2.x) and JCS (1.3). You will be pleased to know that your library smoked the competition!

This test created an in-memory cache (max size 2,000 records) backed by a file-system cache. The cache was then used to put 3,000 objects (1,000 spilling to disk), followed by 300 evenly spaced gets, causing some expected cache misses and disk reads. The results for each library are as follows:

jcs = 28.260615833s
mapdb = 0.385666594s
ecache = 0.490844965s
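
For illustration, the MapDB side of a setup like this takes only a few lines. The sketch below uses the MapDB 1.x builder API (newFileDB, cacheSize, getHashMap) and is an assumed reconstruction, not the attached benchmark code:

    import org.mapdb.DB;
    import org.mapdb.DBMaker;

    import java.io.File;
    import java.util.Map;

    public class MapDbBenchSketch {
        public static void main(String[] args) {
            // File-backed store with a bounded in-memory instance cache in
            // front, so records past the cache limit are served from disk.
            DB db = DBMaker.newFileDB(new File("bench-cache"))
                    .transactionDisable()   // pure cache: skip transaction overhead
                    .cacheSize(2000)        // roughly 2000 records held in memory
                    .closeOnJvmShutdown()
                    .make();

            Map<Integer, String> cache = db.getHashMap("cache");

            for (int i = 0; i < 3000; i++) {     // 3000 puts; ~1000 beyond the memory cache
                cache.put(i, "record-" + i);
            }
            for (int i = 0; i < 3000; i += 10) { // 300 evenly spaced gets
                cache.get(i);
            }
            db.close();
        }
    }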

When this test is performed with a max memory cache size of 20,000 records and 30,000 puts (followed by 3,000 gets), the results are:

mapdb = 3.817517903s
ecache = 4.119665308s

You will notice that JCS is no longer listed. This is because JCS (despite their wild claims) never finished the test! I stopped it after 10 minutes; it would presumably take hours to finish, and I don't have the time to wait around for it. Is it possible that I have JCS misconfigured?

So there it is! My source code for the test is attached. The test was performed with Java 8.
cachebenchmark.zip

Jan Kotek

Aug 7, 2015, 10:04:51 AM
to MapDB
Hi,

thanks for the benchmark, I will integrate it into mapdb.org.

BTW, 20,000 entries is a bit too small; MapDB would really shine with millions or billions of records :-)

Jan 

Christian MICHON

Aug 7, 2015, 11:01:59 AM
to MapDB
For billions of records, some heavy tuning will be needed due to RAM consumption... You might want to rephrase that statement...

Peter Borissow

Aug 7, 2015, 11:09:21 AM
to ma...@googlegroups.com
I've successfully processed billions of records (OSM data) with minimal RAM requirements (<6GB) on Windows using MapDB 1.x. I haven't tested 2.x yet, but I don't expect it to require more RAM than 1.x.
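
For context, a memory-lean MapDB 1.x bulk-load configuration generally looks something like the sketch below, the usual knobs being disabled transactions, mmap, async writes, and a bounded instance cache. This is an assumed illustration, not Peter's actual configuration (which isn't posted here):

    import org.mapdb.DB;
    import org.mapdb.DBMaker;
    import org.mapdb.Serializer;

    import java.io.File;
    import java.util.Map;

    public class OsmStoreSketch {
        public static void main(String[] args) {
            DB db = DBMaker.newFileDB(new File("osm.mapdb"))
                    .transactionDisable()         // bulk load: no transaction log
                    .mmapFileEnableIfSupported()  // mmap only on 64-bit JVMs
                    .asyncWriteEnable()           // batch writes on a background thread
                    .cacheSize(100000)            // bound the instance cache; heap stays small
                    .make();

            // Keys and values live on disk, not on the JVM heap, so billions
            // of entries never become billions of live Java objects.
            Map<Long, double[]> nodes = db.createTreeMap("nodes")
                    .valueSerializer(Serializer.BASIC)
                    .makeOrGet();

            nodes.put(1L, new double[]{48.8566, 2.3522}); // node id -> lat/lon
            db.close();
        }
    }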

Peter




Christian MICHON

Aug 7, 2015, 5:14:54 PM
to MapDB
Please share how you configured MapDB.

It might be insightful. I'm quite sure you must have constrained the cache.

How big is the OSM data? Did you read a compressed stream or just one big file?

Peter Borissow

Aug 7, 2015, 6:36:28 PM
to ma...@googlegroups.com
Here's the discussion from a few months ago:


OSM data is read in as a stream for both XML- and PBF-encoded data. I haven't had a chance to revisit this with 2.x, but hopefully I'll get a chance in the fall.


Hope that helps,
Peter



Andrew Byrd

Aug 11, 2015, 12:13:46 PM
to ma...@googlegroups.com
Hello,

We also use MapDB in production to store, index, and retrieve OSM data for the entire planet. The source data is a 28GB PBF file (i.e. deflated, protobuf-encoded OSM data). It is roughly double that size when stored in MapDB with some custom serializers.

You can see the source code here:
https://github.com/conveyal/osm-lib/blob/master/src/main/java/com/conveyal/osmlib/OSM.java
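
For readers who don't want to dig through the repo, a MapDB custom serializer boils down to implementing the Serializer interface. Below is a simplified, hypothetical sketch of a fixed-width lat/lon serializer in the 1.x style (serialize/deserialize/fixedSize); it is not the actual osm-lib code:

    import org.mapdb.Serializer;

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.io.Serializable;

    // Stores a node as two fixed-point ints instead of a generic object graph;
    // this kind of compact encoding is what custom serializers buy you.
    public class NodeSerializer implements Serializer<double[]>, Serializable {

        @Override
        public void serialize(DataOutput out, double[] latLon) throws IOException {
            out.writeInt((int) Math.round(latLon[0] * 1e7)); // lat in 1e-7 degrees
            out.writeInt((int) Math.round(latLon[1] * 1e7)); // lon in 1e-7 degrees
        }

        @Override
        public double[] deserialize(DataInput in, int available) throws IOException {
            return new double[]{in.readInt() / 1e7, in.readInt() / 1e7};
        }

        @Override
        public int fixedSize() {
            return 8; // always two 4-byte ints
        }
    }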

Andrew

Jan Kotek

Aug 11, 2015, 7:03:07 PM
to MapDB
> You might want to rephrase the statement...

Not really; MapDB can be much faster and more memory-efficient. 

I checked your code; it uses slow generic serialization. Specialized serializers (e.g. .keySerializer(Serializer.STRING)) will improve performance a lot.
They also reduce memory consumption. And there is stuff like key delta compression on BTrees...
Billions of entries are possible with a few gigabytes of memory. 
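
Concretely, the difference is between a map created with no serializers (every key and value goes through generic serialization) and one with specialized serializers. Here is a sketch in the 1.x API, where BTreeKeySerializer.STRING also gives the key delta compression Jan mentions; take the exact builder calls as assumptions:

    import org.mapdb.BTreeKeySerializer;
    import org.mapdb.DB;
    import org.mapdb.DBMaker;
    import org.mapdb.Serializer;

    import java.util.Map;

    public class SerializerSketch {
        public static void main(String[] args) {
            DB db = DBMaker.newMemoryDB().transactionDisable().make();

            // Slow: falls back to generic serialization for keys and values.
            Map<String, Long> generic = db.getTreeMap("generic");

            // Fast: specialized serializers; the STRING key serializer also
            // delta-packs keys, so shared prefixes are stored once per node.
            Map<String, Long> tuned = db.createTreeMap("tuned")
                    .keySerializer(BTreeKeySerializer.STRING)
                    .valueSerializer(Serializer.LONG)
                    .makeOrGet();

            generic.put("user:0001", 1L);
            tuned.put("user:0001", 1L);
            db.close();
        }
    }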

Anyway, I really should update benchmarks and documentation on this.

Regards,
Jan 
