MapDB performance experiments

Brandon Martin-Anderson

Sep 11, 2014, 10:25:58 PM
to Open Trip Planner Dev
I've been looking into MapDB as a possible bucket for graph information. Possible advantages: very short graph startup, outsourcing the persistence layer. Possible disadvantage: slower. So, if it isn't slower, it's a big win.

Establishing a baseline

The data I worked with is a street graph based on an OSM extract of every road in New York State. First, I wrote a custom minimal graph with hard-wired object references between graph vertices, and populated it with OSM data.

I wrote a pretty minimal Dijkstra implementation using Java's built-in priority queue. The algorithm doesn't do any key decreasing - it just holds a pqueue of edges, and any time a popped edge leads to somewhere already in the SPT it ignores it and pops the next one off the queue. I ran that on the NYS data to get a baseline of how long this particular dumb toy Dijkstra implementation takes to exhaust an SPT on a very fast in-memory graph. Here are the first 8 runs, in milliseconds: {3250, 1082, 1000, 987, 957, 986, 974, 977}. That is to say, it takes about 1.00 seconds. This is the performance of the algorithm without MapDB.
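
For reference, the toy algorithm looks roughly like the sketch below. Class and field names are illustrative, not the actual test code:

    import java.util.*;

    class Edge { double distance; Vertex to; }
    class Vertex { List<Edge> outgoing = new ArrayList<Edge>(); }

    class ToyDijkstra {
        // A queue item is a candidate vertex and the cost to reach it.
        static class Item implements Comparable<Item> {
            final Vertex vertex; final double cost;
            Item(Vertex vertex, double cost) { this.vertex = vertex; this.cost = cost; }
            public int compareTo(Item other) { return Double.compare(cost, other.cost); }
        }

        // Exhaust a shortest-path tree from 'origin'. No decrease-key: stale
        // queue entries are simply ignored when popped (lazy deletion).
        static Map<Vertex, Double> exhaust(Vertex origin) {
            Map<Vertex, Double> spt = new HashMap<Vertex, Double>();
            PriorityQueue<Item> queue = new PriorityQueue<Item>();
            queue.add(new Item(origin, 0.0));
            while (!queue.isEmpty()) {
                Item item = queue.poll();
                if (spt.containsKey(item.vertex)) continue;  // already settled
                spt.put(item.vertex, item.cost);
                for (Edge e : item.vertex.outgoing) {
                    if (!spt.containsKey(e.to)) {
                        queue.add(new Item(e.to, item.cost + e.distance));
                    }
                }
            }
            return spt;
        }
    }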

Second, I pushed the entire NYS dataset into MapDB and called get() on every key in arbitrary order. Note this MapDB was created using the .mmapFileEnable().cacheSize(32768*8) options. The first ten runs, in milliseconds: {6702, 435, 434, 433, 441, 439, 432, 432, 437, 462}. 0.44 seconds is a reasonable mode. This is the performance of MapDB without the pathfinding algorithm.
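
The scan itself is roughly the following - a sketch against the MapDB 1.x builder API; the file name and map name are made up:

    import java.io.File;
    import java.util.Map;
    import org.mapdb.DB;
    import org.mapdb.DBMaker;

    public class ScanTest {
        public static void main(String[] args) {
            // Open the file-backed store with the options mentioned above.
            DB db = DBMaker.newFileDB(new File("nys-graph.mapdb"))
                    .mmapFileEnable()
                    .cacheSize(32768 * 8)
                    .make();
            Map<String, Object> vertices = db.getHashMap("vertices");

            long start = System.currentTimeMillis();
            // Touch every value by key; the key set comes back in whatever
            // order the map happens to give us.
            for (String label : vertices.keySet()) {
                vertices.get(label);
            }
            System.out.println((System.currentTimeMillis() - start) + " ms");
            db.close();
        }
    }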

Testing pathfinding on MapDB

The MapDB version of the graph is simple. An object "SimpleVertex" contains an array of "SimpleEdge" objects, which each have a distance and a string member that points to the label of the destination vertex. Each vertex is stored in MapDB under its string label.
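
Concretely, something along these lines (a sketch; field names are illustrative):

    import java.io.Serializable;

    // MapDB serializes whole values, so everything reachable from a stored
    // vertex has to be Serializable.
    class SimpleVertex implements Serializable {
        String label;
        SimpleEdge[] outgoing;
    }

    class SimpleEdge implements Serializable {
        double distance;
        String toLabel;  // label of the destination vertex, i.e. its MapDB key
    }

Following an edge therefore costs a map get() on the destination label rather than a plain object dereference, which is exactly where the cache options below come into play.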

The toy Dijkstra algorithm was modified to run on the MapDB graph, and then SPTs were created on the entire New York State dataset. Using MapDB right out of the box with no performance enhancements, the running times, in milliseconds, were: {14614, 4910, 3890, 4193, 4104, 6414, 5200, 8220}. Mean time after the first: 5276 ms.

with: cacheHardRefEnable()
{15833, 4085, 2344, 2338, 3094, 2253, 2946, 2186}
Mean, after the first: 2749 ms.

with: cacheSize(32768*2)
{14283, 2193, 2024, 2144, 2032, 2070, 3053, 3474}
Mean, after the first: 2427 ms.

with:
cacheSize(32768*8)
{8710, 2167, 2261, 2068, 1951, 1997, 1996, 2052}
Mean, after the first: 2070 ms.

with:
.cacheSize(834231)
{48379, 38081, 34330, 34526, 33249}
I think a MapDB cache size larger than Java's heap size may have resulted in a lot of swapping.

with:
cacheSize(32768*8).cacheHardRefEnable()
{8832, 2225, 2175, 2093, 2068, 2112, 2081, 2081}
Mean: 2119 ms. Not better than simply cacheSize(32768*8).

with:
mmapFileEnable(). Note this requires a MapDB graph rebuild.
{7304, 3385, 3320, 3321, 4121, 3079, 3011, 3095}
Mean: 3333 ms. Not a winner.

with:
mmapFileEnable().cacheSize(32768*8)
{6962, 2066, 2068, 2139, 2089, 2035, 2107, 2128}
Mean: 2090 ms. Tied with cacheSize(32768*8).

with:
mmapFileEnable().cacheSize(32768*8).cacheHardRefEnable()
{7794, 2119, 3074, 2054, 1998, 1989, 1983, 2021}
Mean: 2177 ms. Median: 2021 ms. I mention the median here because if it weren't for that one 3074 ms time the performance would be pretty good.

The best performance simply comes from increasing the cache to some large fraction of the size of the graph, at which point we get SPT exhaustion times around 2.0 seconds. 

Conclusion

Our best running times in MapDB were about 2.0 seconds, of which 1.0 seconds was spent outside MapDB. So, while creating an exhaustive SPT over the dataset, MapDB spent about 1.0 seconds. We know that accessing every object in MapDB by key, outside of any search, takes about 0.44 seconds, indicating that roughly half of that time, 0.56 seconds, is a performance hit resulting from the particular order in which the Dijkstra algorithm accessed objects in MapDB.

These running times are decent, but if the goal is to achieve routing times across the eastern seaboard in under 1.0 second, that may not be possible unless the search is pruned to access a small fraction of the vertices in the graph.

Laurent Gregoire

Sep 12, 2014, 5:39:36 AM
to opentripplanner-dev
Hi Brandon,

On 12/09/14 04:25, Brandon Martin-Anderson wrote:
> I've been looking into MapDB as a possible bucket for graph information.
> Possible advantages: very short graph startup, outsourcing the persistence
> layer. Possible disadvantage: slower. So, if it isn't slower, it's a big
> win.

Thanks for sharing those results.

Just a question: what is the graph size / RAM size ratio during the
test? AFAIK MapDB uses a memory-mapped file to store the data, using a
custom optimized serializer to serialize the POJO to store. As long as
there is free memory, the OS may cache the mmap'ed pages read (even if
MapDB cache is zero). This ratio is probably relevant to the performance
stats we see.

Just to make sure: Is the purpose of MapDB to be able to process data
larger than memory size, or only to optimize memory use through
serialization?

I do not want to be pessimistic, but a page cache miss will incur a
read, which is time-consuming with a spinning disk. As a search is
single-threaded and I/O bound, any I/O overhead will bound the time
spent. All this depends on the access pattern of the data; but since
there is no easy way to force spatial locality on the stored data, we
may end up with lots of cache misses during a search. And multiple
concurrent searches would decrease the effective cache size for each
thread too.

Maybe we could use a Morton code for the key? I don't know if you can
instruct MapDB to use key ordering to force data locality, but if so,
that could improve the cache hit ratio.
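
For what it's worth, such a key could be computed along the lines of the
sketch below (the 16-bits-per-axis quantization is arbitrary), though it
only helps if the store actually lays values out in key order, e.g. in a
BTreeMap:

    class MortonKeys {
        // Derive a Morton (Z-order) key from a vertex coordinate so that
        // vertices near each other in space tend to get nearby keys.
        static long mortonKey(double lat, double lon) {
            int x = (int) ((lon + 180.0) / 360.0 * 0xFFFF);  // 16-bit column
            int y = (int) ((lat +  90.0) / 180.0 * 0xFFFF);  // 16-bit row
            long key = 0;
            for (int i = 0; i < 16; i++) {
                key |= (long) ((x >> i) & 1) << (2 * i);      // even bits from x
                key |= (long) ((y >> i) & 1) << (2 * i + 1);  // odd bits from y
            }
            return key;
        }
    }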

HTH,

--Laurent

andrew byrd

Sep 12, 2014, 6:49:35 AM
to opentripp...@googlegroups.com
Hello,

Thanks for posting these results, Brandon. A few questions and comments:

- Are all of your tests on file-backed DBs? I see that only some of them
had memory-mapped files enabled, but were the DBs always created with
newFileDB or newTempFileDB rather than one of the memory-backed options?

- It sounds like your test edges are pretty small compared to real OTP
edges. Considering that read times are going to be proportional to the
amount of data as well as the number of objects, could you pad them to
approximate the larger edges we'd use in practice?

- You show that when the cache is too large relative to the Java heap,
the search slows down by an order of magnitude. I suspect this slowdown
is caused by garbage collection rather than swapping. The latter would
occur when the number of memory pages in use exceeded available physical
pages, rather than available Java heap. When memory-mapped files are
enabled, we are actually trying to cause swapping -- the idea is to use
the expertly tuned OS code to handle pulling serialized edges into live
memory for us, at which point MapDB will deserialize them into its own
Java object cache one layer up. It would be good to have some
information on the number of garbage collection and paging operations
from profiling tools.
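
One cheap way to get the GC side of that from inside the test harness -
this uses the standard java.lang.management API, nothing MapDB-specific -
would be something like:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    class GcStats {
        // Print cumulative GC activity; call before and after a batch of
        // searches to see how much wall time goes to collection. Paging
        // activity would still need an OS-level tool (e.g. vmstat) alongside.
        static void print() {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                        + " collections, " + gc.getCollectionTime() + " ms");
            }
        }
    }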

Writing about this now, unless I'm misunderstanding something I see a
serious shortcoming in how MapDB works (which to be fair is imposed by
Java itself). All of the edges (or whatever objects) being actively used
must appear in memory _twice_ simultaneously: once in serialized form in
the memory-mapped file pages managed by the OS, and once in Java object
form in memory handled by the JVM. This stands in contrast to the way I
would typically use mmap in other environments: to allocate a huge array
of disk-backed structs that is operated on directly. That would be a
fantastic way to store a street network (and will eventually be used in
RRRR).
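
For comparison, that style in Java would look roughly like the sketch
below: fixed-width records read straight out of a MappedByteBuffer, with
no deserialization step. The 12-byte record layout is invented for
illustration, and a single mapping tops out at 2 GB, so a real version
would need several mappings.

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Fixed-width edge records operated on directly in the mapped file.
    // Layout per record: int toVertex (4 bytes) + double distance (8 bytes).
    class MappedEdges {
        static final int RECORD_BYTES = 12;
        final MappedByteBuffer buf;

        MappedEdges(String path, int nEdges) throws Exception {
            FileChannel channel = new RandomAccessFile(path, "r").getChannel();
            buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, (long) nEdges * RECORD_BYTES);
        }

        int toVertex(int edge)    { return buf.getInt(edge * RECORD_BYTES); }
        double distance(int edge) { return buf.getDouble(edge * RECORD_BYTES + 4); }
    }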

I also have some reactions to Laurent's comments, which are inline
below:

On Fri, Sep 12, 2014, at 11:33, Laurent Gregoire wrote:
> Just a question: what is the graph size / RAM size ratio during the
> test? AFAIK MapDB uses a memory-mapped file to store the data, using a
> custom optimized serializer to serialize the POJO to store. As long as
> there is free memory, the OS may cache the mmap'ed pages read (even if
> MapDB cache is zero). This ratio is probably relevant to the performance
> stats we see.

Laurent makes a good point here: your OS is probably caching the backing
store in memory. After the initial load, "disk" reads might not even be
touching the disk. This is not only true for memory-mapped files -- even
regular files would be cached if the OS had the memory to do so. It
would be helpful to get some results in a truly memory-constrained
environment.

> Just to make sure: Is the purpose of MapDB to be able to process data
> larger than memory size, or only to optimize memory use through
> serialization?

The possible outcomes of this optimization are to reduce live memory
consumption by letting less frequently used parts of the graph be
swapped out, and also to decrease cold startup time by
lazy-deserializing the graph.

> I do not want to be pessimistic, but a page cache miss will incur a
> read, which is time-consuming with a spinning disk. As a search is
> single-threaded and I/O bound, any I/O overhead will bound the time
> spent.

Absolutely, this is why we never even really considered storing the
street edges in a database before. But we thought it would be better to
dismiss the idea based on empirical evidence, in case we're pleasantly
surprised by the results :) The experiment has become more convenient to
carry out with a database engine like MapDB.

> All this depends on the access pattern of the data; but since there
> is no easy way to force spatial locality on the stored data, we may
> end up with lots of cache misses during a search. And multiple
> concurrent searches would decrease the effective cache size for each
> thread too.

All true -- at this point we're just checking viability and
order-of-magnitude performance. If it looks usable then we have to check
it under load with a realistic pattern of searches. It wouldn't be
surprising if real-world searches are very concentrated on certain
subgraphs.

> Maybe we could use a Morton code for the key? I don't know if you can
> instruct MapDB to use key ordering to force data locality, but if so,
> that could improve the cache hit ratio.

If anyone wants to check the effects of memory layout on performance,
there is already a MortonVertexComparator in OTP for this exact purpose.

-Andrew

Brandon Martin-Anderson

Sep 15, 2014, 7:03:48 PM
to andrew byrd, Open Trip Planner Dev
1) The tests were on file-backed DBs, yes.

2) I gave each edge a 256-byte payload of random bits. The following performance tests are from the New York State street graph with padded edges. It's about 2 GB on disk. For these tests I ran Java with the argument "-Xmx1000m". I made and used the database with the ".mmapFileEnable()" option, with the default cache size.
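
The padding is just a blob of random bytes on each edge, roughly like this sketch (names illustrative):

    import java.io.Serializable;
    import java.util.Random;

    // Each edge carries 256 bytes of random junk so its serialized size is
    // closer to that of a real OTP edge.
    class PaddedEdge implements Serializable {
        double distance;
        String toLabel;
        byte[] payload = new byte[256];
        PaddedEdge() { new Random().nextBytes(payload); }
    }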

When attempting to access every item by key, it stops at about the 760000th iteration with "Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded".

When attempting to build an exhaustive SPT, it stops at about the 1470000th iteration with the same error message.

Increasing memory using "-Xmx8000m":

Accessing every item, ten runs in milliseconds: {6238, 3277, 1362, 1349, 1302, 1253, 1297, 1243, 1305, 1306}.

Exhaustive SPTs, elapsed times in milliseconds: {14700, 5216, 4664, 3602, 3689, 3444, 3403, 3338}

--==--

I'd do some more tests but this isn't looking super hopeful.

-B





