Use transactional writes on async flush?


Sam Hendley

Jul 11, 2014, 4:04:01 PM
to ma...@googlegroups.com
I am trying to use MapDB to replace Ehcache in my system. So far I have been very impressed with MapDB's performance, but it has one limitation that is hanging me up.

What I really want is a cache that is persistent, but I don't care if I lose a small number of updates on an unclean shutdown. That sounds like the use case for transactionDisable() and asyncWriteEnable(), but when I combine them I get index out of bounds exceptions. I noticed there was a commit this week that may have been a fix for that issue; is that true? Are these features supposed to work together?

In general, are the "transactions" controlled by transactionDisable() "user facing", or do they refer to the writes to the disk? What I don't want is for an unclean shutdown to poison the entire cache so that I have to rebuild it from scratch if my process is killed aggressively with "kill -9" or a reboot. It looks like the tradeoff is to call DB.commit() on some sort of timer, but that seems like a use case that should be handled internally.

For reference, this is my configuration for an "in-memory cache with best-effort persistence". Is there anything I can do to keep this basic performance and not lose my cache to corrupted indexes?

return DBMaker
                .newFileDB(file)
                .transactionDisable()
                .asyncWriteEnable()
                .asyncWriteFlushDelay(500)
                .asyncWriteQueueSize(100000)
                .mmapFileEnableIfSupported()
                .cacheSize(20000000)
                .cacheHardRefEnable()
                .commitFileSyncDisable()
                .make();

Thanks. If you're curious, I also have some plots showing Ehcache vs MapDB performance and memory usage.

Sam

Peter Blakeley

Jul 12, 2014, 3:47:27 AM
to ma...@googlegroups.com
I would be interested in the plots, and also in how you find the reload time of your cache. Do you load it back into memory?

In the past I found it better to throw away the caches (of search indexes) to speed up reload of the application. I'd be interested to see your results.

cheers pb...

Jan Kotek

Jul 12, 2014, 10:12:01 AM
to ma...@googlegroups.com, Peter Blakeley

> cache that is persistent but I don't care if I lose a small number of updates on an unclean shutdown.

 

In this case, leave the default settings and commit periodically. You will only lose data since the last commit.
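A minimal sketch of that pattern against the MapDB 1.0 API (the file name and 5-second interval are just placeholders; tune the interval to how much data you can afford to lose):

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.mapdb.DB;
import org.mapdb.DBMaker;

public class PeriodicCommit {
    public static void main(String[] args) {
        // Default settings: the write-ahead log is on, so an unclean
        // shutdown only loses changes made since the last commit().
        final DB db = DBMaker.newFileDB(new File("cache.db")).make();

        // Commit on a background thread; the interval bounds how much
        // data a crash can lose.
        ScheduledExecutorService committer =
                Executors.newSingleThreadScheduledExecutor();
        committer.scheduleAtFixedRate(new Runnable() {
            @Override public void run() {
                db.commit();
            }
        }, 5, 5, TimeUnit.SECONDS);
    }
}
```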

 

>Sounds like the use case for transactionDisable()

 

`transactionDisable()` is different. It trades all protections for write speed, not just on modified data but on the entire store. So on an unclean shutdown you are most likely to corrupt the entire store.

 

> asyncWriteEnable() but when I combine them I get index out of bound exceptions.

 

Async write has some overhead, so it depends on scenario if it actually improves performance.

 

Please send me the stack trace. You might get an exception after an unclean shutdown while reopening; otherwise it is a bug.

There is a bug in asyncWrite being fixed in MapDB 1.0.5; perhaps it is the same one.

 

> Are these features supposed to work together?

 

Yes, anything that produces a stack trace is probably a bug. If not, there should be some sort of warning or IllegalArgumentException at config time.

 

> In general are the "transactions" provided by transactionDisable() "user facing" or are they talking about the writes to the disk?

 

There are three transactions modes:

 

1) WriteAheadLog: on by default and crash resistant. Single global transaction per store.

 

2) transactionDisable(): no WAL; changes are written directly to files, and data may be gone if the JVM crashes.

 

3) TxMaker: A concurrent transaction with MVCC serializable snapshots.

 

In short if you are not sure, use WAL.
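For reference, the three modes roughly map onto DBMaker like this (a sketch against the 1.0 API; only one mode should actually be used per store file):

```java
import java.io.File;
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.TxMaker;

File file = new File("store.db");

// 1) WriteAheadLog: the default, crash resistant,
//    one global transaction per store
DB walDb = DBMaker.newFileDB(file).make();

// 2) transactionDisable(): no WAL, fastest writes, but an unclean
//    shutdown can corrupt the whole store
DB rawDb = DBMaker.newFileDB(file).transactionDisable().make();

// 3) TxMaker: concurrent transactions with MVCC serializable snapshots
TxMaker txMaker = DBMaker.newFileDB(file).makeTxMaker();
DB tx = txMaker.makeTx();
// ... use tx, then tx.commit() or tx.rollback()
```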

 

> For reference this is my config file for a "InMemory cache with best effort persistence". Anything I can do to keep this basic performance and not lose my cache to corrupted indexes?


return DBMaker
                .newFileDB(file)
                // do not use this, or you will lose all data
                .transactionDisable()
                // disable async, unless tests show better performance
                .asyncWriteEnable()
                .asyncWriteFlushDelay(500)
                .asyncWriteQueueSize(100000)
                // good
                .mmapFileEnableIfSupported()
                // hard ref cache does not have
                .cacheSize(20000000)
                .cacheHardRefEnable()
                // bad idea unless you know what it does
                .commitFileSyncDisable()
                .make();

 

BTW MapDB's HashMap has optional time-based eviction...
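If I read that right, the eviction setup would look roughly like this (method names assumed from the 1.0 HTreeMap builder; the map name and the one-hour TTL are placeholders):

```java
import java.io.File;
import java.util.concurrent.TimeUnit;
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.HTreeMap;

DB db = DBMaker.newFileDB(new File("cache.db")).make();

// entries are evicted one hour after they were written
HTreeMap<String, String> cache = db
        .createHashMap("cache")
        .expireAfterWrite(1, TimeUnit.HOURS)
        .makeOrGet();
```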

 

 

>I would be interested in plots also in how you find the reload time of your cache, do you load it in back into memory?

No, store on disk is not loaded into memory.

 

> I found in the past better to throw away the caches(of search indexes) to speed reload of the application be interested to see your result

 

That is probably best if your crashes are not frequent and the data is not important.

Just disable transactions. In case of a crash, MapDB will throw an exception when you reopen the store.

So catch the exception, wipe the store (files) and start over.
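A sketch of that wipe-and-retry pattern (the `.p` companion-file suffix is an assumption about MapDB 1.0's on-disk layout):

```java
import java.io.File;
import org.mapdb.DB;
import org.mapdb.DBMaker;

static DB openOrWipe(File file) {
    try {
        return DBMaker.newFileDB(file).transactionDisable().make();
    } catch (RuntimeException e) {
        // store was corrupted by an unclean shutdown:
        // delete the files and start over with an empty store
        file.delete();
        new File(file.getPath() + ".p").delete(); // data file
        return DBMaker.newFileDB(file).transactionDisable().make();
    }
}
```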

 

An option to wipe a corrupted store will be added in 1.1, in about two months.

 

Jan

--
You received this message because you are subscribed to the Google Groups "MapDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mapdb+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Sam Hendley

Jul 14, 2014, 8:07:05 AM
to ma...@googlegroups.com, Peter Blakeley
Thanks for the feedback. I think you are right, I probably don't need commitFileSyncDisable() anymore.

I probably wasn't as clear as I should have been about what I am looking for. Most of the data is resyncable from another source, but this store is the primary location for a set of data. It can be "relearned" from the stream we are processing, but there would be some degradation in performance if that data is lost.

All I really need is a Map that is eventually disk-backed; a few lost entries isn't a big deal, but I am trying to avoid any disk access on the main processing thread. In short, I am looking for behavior similar to Redis: blazing fast gets/sets which will be persisted to disk sometime in the next few seconds. It seems like what I want is an "in-memory transaction log", so no writes are done to the data file until they can be done safely. Is that what async write accomplishes?

For reference, when I leave transactions on with async write, here is the exception I was getting; it looks like it is related to issue #356:
java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:538)
at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:796)
at org.mapdb.Volume$ByteBufferVol.putLong(Volume.java:304)
at org.mapdb.StoreWAL.walIndexVal(StoreWAL.java:309)
at org.mapdb.StoreWAL.preallocate(StoreWAL.java:213)
at org.mapdb.AsyncWriteEngine.preallocateNoCommitLock(AsyncWriteEngine.java:303)
at org.mapdb.AsyncWriteEngine.put(AsyncWriteEngine.java:369)
at org.mapdb.EngineWrapper.put(EngineWrapper.java:53)
at org.mapdb.Caches$LRU.put(Caches.java:61)
at org.mapdb.BTreeMap.put2(BTreeMap.java:778)
at org.mapdb.BTreeMap.put(BTreeMap.java:643)
at com.sms.channelization.caching.MapWrapper.put(MapWrapper.java:20)

I built and tried the same test with 1.1.0-SNAPSHOT and got a slightly different exception, but it still appears to have problems. This failed after loading 1.211M (out of 5M) 8-byte entries into the tree set.

java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:538)
at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:796)
at org.mapdb.Volume$ByteBufferVol.putLong(Volume.java:335)
at org.mapdb.StoreWAL.walIndexVal(StoreWAL.java:357)
at org.mapdb.StoreWAL.preallocate(StoreWAL.java:188)
at org.mapdb.EngineWrapper.preallocate(EngineWrapper.java:49)
at org.mapdb.AsyncWriteEngine.put(AsyncWriteEngine.java:312)
at org.mapdb.EngineWrapper.put(EngineWrapper.java:59)
at org.mapdb.Caches$LRU.put(Caches.java:68)
at org.mapdb.BTreeMap.put2(BTreeMap.java:779)
at org.mapdb.BTreeMap.put(BTreeMap.java:644)
at com.sms.channelization.caching.MapWrapper.put(MapWrapper.java:20)

I thought perhaps it was because I wasn't calling db.commit(), so I added a commit after every 1000 records, and it deadlocks during the write:

"main" prio=6 tid=0x0000000001e4f800 nid=0x9364 waiting on condition [0x000000000216d000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x000000075781a4f8> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
at org.mapdb.AsyncWriteEngine.waitForAction(AsyncWriteEngine.java:469)
at org.mapdb.AsyncWriteEngine.commit(AsyncWriteEngine.java:489)
at org.mapdb.EngineWrapper.commit(EngineWrapper.java:100)
at org.mapdb.DB.commit(DB.java:1596)
- locked <0x0000000756db5050> (a org.mapdb.DB)
at com.sms.channelization.caching.MapDbChannelizationCacheProvider$1.put(MapDbChannelizationCacheProvider.java:95)

"MapDB writer #1" daemon prio=6 tid=0x000000001bf9a800 nid=0x85f0 waiting on condition [0x000000001ccbe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
at org.mapdb.AsyncWriteEngine$WriterRunnable.run(AsyncWriteEngine.java:166)
at java.lang.Thread.run(Thread.java:744)

        return DBMaker
                .newFileDB(file)
                .asyncWriteEnable()
                .asyncWriteFlushDelay(500)
                .asyncWriteQueueSize(100000)
                .mmapFileEnableIfSupported()
                .cacheSize(20000000)
                .cacheHardRefEnable()
                .make();

It looks like there may be some latent threading issues between async writes and commit. I will try to repackage my test into a unit test in a MapDB fork so it's easier for you to track them down, but this bug was relatively easy to reproduce by loading small entries into a database as fast as possible (my use case is 5M entries).

Thanks again for the feedback, if you'd like more specifics let me know.

Sam Hendley



Sam Hendley

Jul 14, 2014, 8:17:23 AM
to ma...@googlegroups.com
I was doing head-to-head comparisons of our existing Ehcache implementation and MapDB.

In answer to your direct question I found that the startup time of the cache with 5M entries was around 1 second which is plenty fast for our purposes.

5M entry map       EhCache        MapDB
File Size          2500 MB        150 MB
Start Time         157 seconds    1 second
Bootstrap Time     60 seconds     10 seconds
Shutdown Time      120 seconds    1 second
Gets per second    600 K/s        750 K/s
Max Memory         4.5 GB         3 GB


What was almost more impressive was how much nicer the memory usage was. Here are some plots showing memory usage, with Ehcache on the left and MapDB on the right. The second plot shows load average on the machine vs. processing rate. Notice the rate is very stable, and the load average is consistently lower than with Ehcache.

[Inline images: memory usage and load average plots, Ehcache vs MapDB]


For reference, here are my configurations:


EhCache 2.5.4

<diskStore path="/opt/var/db"/>
  <defaultCache
            eternal="true"
            timeToIdleSeconds="0"
            timeToLiveSeconds="0"
            overflowToDisk="true"
            maxEntriesLocalHeap="20000000"
            diskPersistent="true"
            maxElementsOnDisk="0"
            diskExpiryThreadIntervalSeconds="0"
            memoryStoreEvictionPolicy="LRU"
            />


MapDB 1.0.4

DBMaker
                .newFileDB(file)
                .transactionDisable()
                .asyncWriteEnable()
                .asyncWriteFlushDelay(500)
                .asyncWriteQueueSize(100000)
                .mmapFileEnableIfSupported()
                .cacheSize(20000000)
                .cacheHardRefEnable()
                .commitFileSyncDisable()
                .make();



Jan Kotek

Jul 15, 2014, 6:08:05 AM
to ma...@googlegroups.com

The problem in 1.0.4 seems like something I fixed recently (the fix will be in 1.0.5).

 

I do not think there is a deadlock in 1.1.0-SNAPSHOT.

I wrote a simple test case to reproduce the issue, but no luck (attached).

 

A code audit shows the offending code in the writer:

 

//if conditions are right, slow down writes a bit
if(asyncFlushDelay!=0 &&
        !commitLock.isWriteLocked() &&
        size.get()<maxParkSize){
    LockSupport.parkNanos(1000L * 1000L * asyncFlushDelay);
}

 

 

This cannot cause a deadlock. It will wait for asyncFlushDelay milliseconds and then continue to dump the write queue content.

 

I believe this setting is too high: `.asyncWriteFlushDelay(500)`.

The `maxParkSize` is 1/4 of the queue size, so the writer always parks even if there are items to be written.

 

So either increase the commit interval to something comparable to the async queue size, or set the flush delay to zero or something smaller.
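Taking the second option, the builder from the original post might become something like this (illustrative values only):

```java
import org.mapdb.DB;
import org.mapdb.DBMaker;

return DBMaker
        .newFileDB(file)
        .asyncWriteEnable()
        // zero flush delay: the writer thread drains the queue
        // without parking
        .asyncWriteFlushDelay(0)
        .asyncWriteQueueSize(100000)
        .mmapFileEnableIfSupported()
        .make();
```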

 

Jan

AsyncWriteDeadlock.java

Jan Kotek

Jul 15, 2014, 6:16:04 AM
to ma...@googlegroups.com

Nice! Could you make a blog post out of this? Or could I use it in testimonials?

I could use some user stories.

 

I believe the GC spikes are caused by the hard-reference instance cache. It clears itself when free memory is low. Perhaps use the default cache to improve this.

If you run out of memory, decrease the cache size.

 

Also you might call `db.clearCache()` manually.

 

 

 

> .commitFileSyncDisable()

 

Hm, you may lose data. What is the point of the WAL if it is not synced?

MapDB could have some problems recovering in this case.

 

It is probably better to disable transactions and just discard the store in case of unclean shutdown.

 

WAL performance should be fixed in 1.1 with an append-only store and some more features.

 

Jan



