Does anybody have any recommendations on what the optimal BDB
checkpoint configuration settings might be? Right now we are using
the default settings from Project Voldemort's configuration page, i.e.,
a 20 MB byte interval and a 30000 ms time interval. What we noticed is
that lots of Voldemort threads are blocked while a checkpoint is
active, and that coincides with our application showing really bad
maximum response times (multiple seconds) and the Voldemort client
complaining about operations timing out.
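
For a rough sense of how the 20 MB byte interval compares with the
30000 ms time interval, here is a minimal Python sketch. It assumes a
checkpoint is triggered each time about 20 MB has been written to the
log; the write rates and the helper name are hypothetical, not
measurements from our cluster.

```python
# Rough estimate of checkpoint frequency under a byte-based trigger.
# Assumption: a checkpoint starts each time ~20 MB is written to the log.
# The write rates below are hypothetical, not measured values.

MB = 1024 * 1024


def seconds_between_checkpoints(bytes_interval, write_rate_bytes_per_sec):
    """Time needed to accumulate bytes_interval of log writes at a steady rate."""
    if write_rate_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return bytes_interval / write_rate_bytes_per_sec


if __name__ == "__main__":
    for rate_mb in (1, 5, 20):
        secs = seconds_between_checkpoints(20 * MB, rate_mb * MB)
        print(f"{rate_mb} MB/s -> checkpoint roughly every {secs:.1f} s")
```

Under that assumption, any sustained write rate above roughly
0.67 MB/s reaches the 20 MB threshold in less than 30 seconds.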
Our current test settings:
je.cleaner.minUtilization=25
je.cleaner.threads=10
je.cleaner.readSize=1024000
je.cleaner.lockTimeout=100000
je.checkpointer.highPriority=true
je.env.backgroundReadLimit=5
je.env.backgroundWriteLimit=5
A few notes about these settings. We had been using 5%
minUtilization and 1 cleaner thread, but we are running out of disk
space very quickly, so one of our goals is to increase
minUtilization to reclaim some disk space. The other settings
follow Oracle's recommendations to reduce disk I/O and, hopefully,
response time.
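
To illustrate why minUtilization matters so much for disk space, here
is a back-of-the-envelope Python sketch. It assumes the cleaner manages
to keep overall log utilization at or above minUtilization, so the
on-disk size is bounded by roughly live data / minUtilization. The 1 GB
live-data figure and the function name are hypothetical.

```python
# Approximate upper bound on disk usage for a given minUtilization.
# Assumption: the cleaner keeps total utilization >= minUtilization,
# so disk usage is at most about live_data / (minUtilization / 100).

def worst_case_disk_gb(live_data_gb, min_utilization_pct):
    """Approximate worst-case on-disk size in GB."""
    if not 0 < min_utilization_pct <= 100:
        raise ValueError("minUtilization must be in (0, 100]")
    return live_data_gb * 100.0 / min_utilization_pct


if __name__ == "__main__":
    live_gb = 1  # hypothetical amount of live data
    for pct in (5, 25, 50):
        bound = worst_case_disk_gb(live_gb, pct)
        print(f"minUtilization={pct}% -> up to ~{bound:.0f} GB on disk")
```

By this estimate, going from 5% to 25% lowers the worst-case footprint
from about 20x the live data to about 4x, at the cost of more cleaner
activity.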
In our test environment we have a two-node Voldemort cluster with
required reads and required writes both set to 2. Our BDB log file
size is 10 MB. We tested 1 GB, 256 MB, 50 MB, and 10 MB log file
sizes, and 10 MB gave the best response time. Our test starts with
about 10 GB of data files, and BDB file utilization stays around 25%
throughout the test.
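
For scale, the numbers above can be turned into a rough estimate of
live data and log file counts. This Python sketch treats 10 GB as
10 * 1024 MB and assumes files are filled to their configured size;
the function names are just for illustration.

```python
import math

# Rough arithmetic on the test data set described above.
# Assumptions: 10 GB = 10 * 1024 MB, and every log file is full.

TOTAL_MB = 10 * 1024       # about 10 GB of data files
UTILIZATION_PCT = 25       # observed BDB file utilization


def live_data_mb(total_mb, utilization_pct):
    """Estimated live (non-obsolete) data in MB."""
    return total_mb * utilization_pct / 100.0


def log_file_count(total_mb, log_file_size_mb):
    """Number of log files needed to hold total_mb of data."""
    return math.ceil(total_mb / log_file_size_mb)


if __name__ == "__main__":
    print(f"live data: ~{live_data_mb(TOTAL_MB, UTILIZATION_PCT):.0f} MB")
    for size_mb in (1024, 256, 50, 10):
        count = log_file_count(TOTAL_MB, size_mb)
        print(f"{size_mb} MB log files -> ~{count} files")
```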
We are going to run a test with all the above settings but with
checkpoint high priority off to see whether it will make any
difference. In the meantime I'd appreciate any advice in terms of how
we might adjust the checkpoint intervals.
I'll cross-post to the BDB mailing list to see whether I might get some
pointers over there as well.
Thanks.
-Feng