BDB one env per store

Yoav

Aug 29, 2010, 2:25:04 AM
to project-voldemort
Hi,

We are using the bdb.one.env.per.store setting to create a separate
environment for each store.
The configuration we use in server.properties is in the appendix below.
When Voldemort starts, I see the following log lines for each store we
have defined (we have about 6):
[2010-08-29 06:16:03,445] INFO Creating environment for Store1: (voldemort.store.bdb.BdbStorageConfiguration)
[2010-08-29 06:16:03,445] INFO BDB cache size = 2621440000 (voldemort.store.bdb.BdbStorageConfiguration)
[2010-08-29 06:16:03,445] INFO BDB je.cleaner.threads = 1 (voldemort.store.bdb.BdbStorageConfiguration)
[2010-08-29 06:16:03,445] INFO BDB je.cleaner.minUtilization = 25 (voldemort.store.bdb.BdbStorageConfiguration)
[2010-08-29 06:16:03,445] INFO BDB je.cleaner.minFileUtilization = 5 (voldemort.store.bdb.BdbStorageConfiguration)
[2010-08-29 06:16:03,445] INFO BDB je.log.fileMax = 62914560 (voldemort.store.bdb.BdbStorageConfiguration)

Does this imply that each store would have a 2500MB cache, its own
cleaner thread, etc.?

Thanks!

Appendix: BDB configuration
# BDB
bdb.sync.transactions=false
bdb.cache.size=2500MB
bdb.max.logfile.size=60MB
bdb.one.env.per.store=true
je.cleaner.minUtilization=25
bdb.cleaner.minUtilization=25
je.cleaner.threads=1
je.cleaner.readSize=102400
je.cleaner.lockTimeout=10000
je.checkpointer.highPriority=false
je.env.backgroundReadLimit=5
je.env.backgroundWriteLimit=5
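
The byte counts in the log lines above line up with the configured sizes if "MB" is read as 1024 * 1024 bytes. A minimal sketch of that arithmetic follows; parse_size is a hypothetical helper written for illustration, not Voldemort's own parser, and the binary multipliers are an assumption that happens to match the logged values.

```python
import re

# Binary multipliers: an assumption that matches the logged byte counts.
_UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as '2500MB' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


print(parse_size("2500MB"))  # compare with the "BDB cache size" log line
print(parse_size("60MB"))    # compare with the "je.log.fileMax" log line
```

So the "BDB cache size" line repeats the configured bdb.cache.size in bytes; on its own it does not say whether that cache is per environment or shared.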

ijuma

Aug 29, 2010, 7:29:47 AM
to project-voldemort
On Aug 29, 7:25 am, Yoav <yoavna...@gmail.com> wrote:
> Does this imply that each store would have a 2500MB cache

No, we use the shared cache option so that all environments/stores
share the same cache.

> its own cleaner thread, etc.?

That is correct: each environment/store has its own cleaner thread.

Best,
Ismael

David Gevorkyan

Feb 7, 2014, 2:33:32 PM
to project-...@googlegroups.com
Hi Guys,

Is the cache still shared between different Environments in Voldemort 1.3 as well?
Basically, if I have the following properties defined:

bdb.cache.size=25G

bdb.one.env.per.store=true

And I have 2 stores, will 25G be allocated to each environment, basically expecting me to have 50G of heap?


Thanks in advance!


Sincerely,
David Gevorkyan

Brendan Harris (a.k.a. stotch on irc.oftc.net)

Feb 7, 2014, 11:40:19 PM
to project-...@googlegroups.com
Hi David,


On Friday, February 7, 2014 11:33:32 AM UTC-8, David Gevorkyan wrote:
> Hi Guys,
>
> Is the cache still shared between different Environments in Voldemort 1.3 as well?

Yes, the bdb cache is always shared between all stores by default. This can be overridden, however, by setting "<memory-footprint>n</memory-footprint>" in each store config in the main "<store></store>" element, which will carve out a chunk of the global cache for the store you put that setting in. The size for that parameter is specified in megabytes. Be careful with that setting, though. We have found that sometimes bdb-je needs to consume extra memory to catch up on compaction under certain scenarios.

> Basically, if I have the following properties defined:
>
> bdb.cache.size=25G
>
> bdb.one.env.per.store=true
>
> And I have 2 stores, will 25G be allocated to each environment, basically expecting me to have 50G of heap?

No. They'll both share the whole 25G and, when necessary, evict each other's nodes. bdb.cache.size is a fixed global cache size.
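
To make the arithmetic concrete, here is a rough sketch of the budgeting described above. It is only a model of the behavior as described in this reply, not Voldemort's implementation; plan_cache, the store names, and the 4096 MB footprint are hypothetical.

```python
def plan_cache(global_cache_mb, footprints_mb):
    """Split a fixed global cache into per-store carve-outs and a shared rest.

    footprints_mb maps store name -> memory-footprint in MB, or None when the
    store has no memory-footprint set and simply uses the shared pool.
    """
    reserved = {name: mb for name, mb in footprints_mb.items() if mb is not None}
    total_reserved = sum(reserved.values())
    if total_reserved > global_cache_mb:
        raise ValueError("footprints exceed the global cache")
    return {
        "reserved_mb": reserved,
        "shared_mb": global_cache_mb - total_reserved,
        "total_mb": global_cache_mb,  # fixed, however many stores there are
    }


GLOBAL_MB = 25 * 1024  # bdb.cache.size=25G expressed in MB

# Two stores, no footprints: both compete for the same 25G.
print(plan_cache(GLOBAL_MB, {"store1": None, "store2": None}))

# One store fenced off with a hypothetical 4096 MB footprint.
print(plan_cache(GLOBAL_MB, {"store1": 4096, "store2": None}))
```

In both cases the total stays at 25G; adding stores changes how the cache is divided, not how much heap it needs.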

~B

Vinoth C

Feb 8, 2014, 3:13:02 PM
to project-...@googlegroups.com
+1 for Brendan's answer.

We have found memory-footprint useful for cordoning off misbehaving stores. But if you have an unexpected spike or something similar, you might outgrow your footprint and stress the cleaners.

David Gevorkyan

Feb 10, 2014, 9:59:16 PM
to project-...@googlegroups.com
Thanks a lot guys,

I was seeing very strange memory utilization graphs after moving from Voldemort 0.96 to 1.3: memory climbed for 48 hours, then dropped, and the cycle repeated.
The average CPU load climbed 5x, and the peaks climbed even higher.

After researching which default properties had changed since the previous version, I set the following properties, and it now works much better.

Here are the things that I have overridden:

bdb.cleaner.lazy.migration=true
bdb.cleaner.threads=5 (increased from 3 threads before)
bdb.cache.evictln=false

Memory utilization still shows the same behavior; however, the cycle is now about 6 hours, compared to 48 hours before.
CPU utilization is even lower than it was with 0.96, which is great!


Below is the full list of properties that I use (Note the sections for SSD and NON-SSD):

enable.bdb.engine=true
bdb.one.env.per.store=true
bdb.sync.transactions=false
bdb.cache.size=12GB
bdb.max.logfile.size=10MB
bdb.cleaner.min.file.utilization=50

##########################################
# Use if cluster backed by spinning disks!
##########################################
bdb.cleaner.lazy.migration=true
bdb.cleaner.threads=5
bdb.cache.evictln=false
### Leave bdb.evict.by.level undefined

##########################################
# Use if cluster backed by SSD!
##########################################
### bdb.cleaner.threads=1
### bdb.evict.by.level=true
### Leave bdb.cleaner.lazy.migration undefined
### Leave bdb.cache.evictln undefined


1. What do you think about this combination? Do we need to tune anything else?
2. I am also curious about tuning the BDB cache size: is it always better to have it larger? It probably depends on the cache eviction policy that is used, but I have seen with garbage collection that having a larger heap doesn't always help.
Any thoughts on that?


Thanks in advance, I really appreciate your responses!


Sincerely,
David Gevorkyan
- eHarmony

Justin Mason

Feb 11, 2014, 6:37:41 AM
to project-voldemort
Hi David --

We spent some time optimising our settings after a 0.90 -> 1.3 upgrade -- we too had a disk-based Vol cluster which didn't perform well post-upgrade without tweaks.  I think the 1.3 settings (and 1.3 in general!) are particularly optimised for SSDs.

It sounds like your cleaner threads kicked in.  A good JMX metric to watch is the cleaner backlog -- it's under the voldemort.store.bdb.stats MBean, iirc.  Here's what we used for cleaner-thread settings:

bdb.cleaner.threads=6
bdb.cleaner.min.file.utilization=5
bdb.cleaner.lazy.migration=true

i.e., lots of threads, lazy migration on (leaving it off works better for SSD), and start the cleaner at 5% utilisation instead of waiting (this was a setting changed in 1.3.0).

We eventually switched to SSDs, as the (astronomical) improvement in latencies and load capacity was worth the additional $$$.  It's worth noting that with SSDs there's sufficient IOPS to make problems with the cleaner backlog more-or-less a thing of the past  ;)

--j.


Brendan Harris (a.k.a. stotch on irc.oftc.net)

Feb 11, 2014, 10:56:52 AM
to project-...@googlegroups.com, j...@jmason.org
David,

Your settings look good for both types of configurations, but I am not sure bdb.cleaner.threads=5 will really make any difference (we have not really seen a difference between 3 and 5 ourselves on spinning disks). We set this to 3 on our spinning disk environments because it was the biggest improvement we could see before hitting a cliff. More threads than this yielded little improvement and just consumed more heap space and added more CPU time.

Justin,
 
> bdb.cleaner.min.file.utilization=5

I don't advise using this value. I advise setting it to 0. We have seen this setting cause cleaner conflict, where the cleaners are trying to gauge the overall environment utilization (bdb.cleaner.minUtilization) and weigh the per-file utilization. When we defined values for this that were greater than 0, we often found that the cleaner backlog would fall behind. One of the engineers on the Sleepycat dev team at Oracle advised us to set this to 0 and then set bdb.cleaner.minUtilization to whatever overall environment utilization we wanted. And that worked out well for us. We set bdb.cleaner.minUtilization to 50 (now the default). And we found that this setting worked well both for spinning disks and SSD.
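
As a rough illustration of how the two thresholds can pull in different directions, here is a simplified model: the environment-wide target asks for cleaning only when overall utilization drops below it, while the per-file floor flags any file below it regardless of the overall figure. This is a sketch of the documented meaning of the two settings, not BDB JE's actual cleaner algorithm; cleaner_view and the sample file sizes are hypothetical.

```python
def cleaner_view(files, min_utilization, min_file_utilization):
    """Return (overall %, env target missed?, indexes of files under the floor).

    files is a list of (live_bytes, total_bytes) pairs, one per log file.
    """
    live = sum(live_b for live_b, _ in files)
    total = sum(total_b for _, total_b in files)
    overall = 100.0 * live / total
    env_target_missed = overall < min_utilization
    under_floor = [
        index
        for index, (live_b, total_b) in enumerate(files)
        if 100.0 * live_b / total_b < min_file_utilization
    ]
    return overall, env_target_missed, under_floor


# Hypothetical log files: two mostly live, one almost entirely obsolete.
files = [(90, 100), (80, 100), (3, 100)]

# Per-file floor of 5: the third file is flagged although the overall
# utilization is already above the 50% target.
print(cleaner_view(files, 50, 5))

# Per-file floor of 0: only the environment-wide target decides.
print(cleaner_view(files, 50, 0))
```

With the floor at 0, the only question left is whether the environment as a whole meets its target, which is the 0/50 combination recommended here.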

One last note for everyone on this thread is that in bdb-je 5, some improvements were made to the bdb file format to reduce the amount of metadata, which reduced the amount of data needing to be written per file and subsequently reduced the amount of cleaner activity significantly. After upgrading to bdb-je 5, all of our straggling bdb cleaner quirks went away and suddenly the bdb environment started behaving the way we configured it to (unlike before, where it seemed like all of the bdb settings had magical properties ;-)

~B

Justin Mason

Feb 11, 2014, 11:32:50 AM
to project-voldemort
On Tue, Feb 11, 2014 at 3:56 PM, Brendan Harris (a.k.a. stotch on irc.oftc.net) <dre...@gmail.com> wrote:
> Justin,
>
> > bdb.cleaner.min.file.utilization=5
>
> I don't advise using this value. I advise setting it to 0. We have seen this setting cause cleaner conflict, where the cleaners are trying to gauge the overall environment utilization (bdb.cleaner.minUtilization) and weigh the per-file utilization. When we defined values for this that were greater than 0, we often found that the cleaner backlog would fall behind. One of the engineers on the Sleepycat dev team at Oracle advised us to set this to 0 and then set bdb.cleaner.minUtilization to whatever overall environment utilization we wanted. And that worked out well for us. We set bdb.cleaner.minUtilization to 50 (now the default). And we found that this setting worked well both for spinning disks and SSD.

Good to know.  We're now using 0/50 with SSDs and it's working well, alright.
 
> One last note for everyone on this thread is that in bdb-je 5, some improvements were made to the bdb file format to reduce the amount of metadata, which reduced the amount of data needing to be written per file and subsequently reduced the amount of cleaner activity significantly. After upgrading to bdb-je 5, all of our straggling bdb cleaner quirks went away and suddenly the bdb environment started behaving the way we configured it to (unlike before, where it seemed like all of the bdb settings had magical properties ;-)

OK, that sounds like very good news.  Upgrade time for sure ;)

--j.

David Gevorkyan

Feb 14, 2014, 4:37:35 PM
to project-...@googlegroups.com, j...@jmason.org
Brendan,

It seems that 1.3.0 uses BDB JE 4.1.17.
Is there a specific distribution of 1.3.0 that has the JE 5 dependency, or do you expect us to go directly to 1.6.0 to get it?


Sincerely,
David

Brendan Harris (a.k.a. stotch on irc.oftc.net)

Feb 14, 2014, 4:43:23 PM
to project-...@googlegroups.com, j...@jmason.org
Hi David,

On Friday, February 14, 2014 1:37:35 PM UTC-8, David Gevorkyan wrote:
> Brendan,
>
> It seems that 1.3.0 uses BDB JE 4.1.17.
> Is there a specific distribution of 1.3.0 that has the JE 5 dependency, or do you expect us to go directly to 1.6.0 to get it?

I don't remember whether it's 1.4.x or 1.5.x that has bdb-je 5, as I don't have the repo in front of me right now. But I'd suggest going to 1.6.0, as that is the official production release.

Vinoth C

Feb 15, 2014, 6:35:48 PM
to project-...@googlegroups.com, j...@jmason.org
Yes, 1.6 has BDB 5. Some additional configs were put in for BDB 5, so I recommend going to 1.6.