> Is it a good practice to turn on balancer after large data import and turn
> it off before another import?
That depends. If your cluster has the capacity to both balance and import
at the same time, that is preferable: writes are more likely to be evenly
distributed, and the data is balanced in increments rather than in one
large batch (which can take a long time, since each shard can take part in
only a single migration at a time). If the import is causing extremely
heavy load, then turning off the balancer can help by freeing up the
resources used for balancing. It's a judgement call based on your cluster,
your needs, etc.
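If you do go the "balancer off during import" route, here is a rough sketch
of the pattern. importWithBalancerPaused and runImport are names I made up
for illustration; the only real shell helper used is sh.setBalancerState().
The sh object is passed in as an argument so the function can be tried
outside the shell with a stand-in.

```javascript
// Minimal sketch: pause the balancer, run an import, then always re-enable it.
// `sh` is the mongo shell sharding helper (passed in explicitly here);
// `runImport` is a hypothetical function that performs your import.
function importWithBalancerPaused(sh, runImport) {
  sh.setBalancerState(false); // stop new balancing rounds from starting
  try {
    return runImport();
  } finally {
    // Re-enable even if the import throws, so the cluster is not left unbalanced.
    sh.setBalancerState(true);
  }
}

// In a mongo shell connected to a mongos, usage would look like:
//   importWithBalancerPaused(sh, function () { /* load your data here */ });
```

Note that disabling the balancer this way does not, as far as I know, wait
for a migration that is already in flight; if you need that, sh.stopBalancer()
is the helper to look at.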
> Documentation says that balancer window must be sufficient to complete the
> migration. What happens when balancer will not migrate all data?
When the window closes, the balancer will run until all in-flight
migrations are complete, then stop. Your data will remain in that state
until the balancer is turned on again. If you do another import in the
meantime, the data will become more unbalanced, and you will essentially
repeat this pattern forever (i.e. your data will never be balanced). Hence
the note in the docs: it won't break anything per se, but your data will
remain unevenly distributed across the shards. You can see the chunk
distribution with sh.status() from the shell.
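If you want a quick number rather than the full sh.status() output, something
along these lines can count chunks per shard. chunksPerShard is a
hypothetical helper, and it assumes the documents in config.chunks carry a
"shard" field (and an "ns" field for the filter in the usage comment), which
may not hold on other server versions.

```javascript
// Minimal sketch: count how many chunks each shard holds.
// `chunkDocs` is an array of documents shaped like those in config.chunks;
// only the `shard` field is read (assumed to be present).
function chunksPerShard(chunkDocs) {
  var counts = {};
  chunkDocs.forEach(function (doc) {
    counts[doc.shard] = (counts[doc.shard] || 0) + 1;
  });
  return counts;
}

// In a mongo shell connected to a mongos, usage would look like:
//   chunksPerShard(
//     db.getSiblingDB("config").chunks.find({ ns: "example.users" }).toArray()
//   );
```

A large gap between the counts after the balancing window has closed is the
sign that the window was too short for the amount of data imported.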
Adam
On Tuesday, September 4, 2012 8:46:29 AM UTC+1, mthenw wrote:
> Thanks for your answer and I have another two.
> Is it a good practice to turn on balancer after large data import and turn
> it off before another import?
> Documentation says that balancer window must be sufficient to complete the
> migration. What happens when balancer will not migrate all data?
> On Monday, September 3, 2012 4:41:27 PM UTC+2, Scott Hernandez wrote:
>> If you aren't using the balancer then splitting does not result in
>> anything moving shards. Splitting chunks is a completely logical
>> operation -- you need to then move the chunks to distribute them.
>> You may want to disable the balancer while importing, and then enable
>> it later to evenly distribute the chunks after the import.
>> On Mon, Sep 3, 2012 at 9:20 AM, mthenw <maciej....@gmail.com> wrote:
>> > Hi,
>> > I need some clarification about sharding, pre-splitting and chunks
>> > balancing. I have 2 shards with balancer turned off and presplit set
>> > db.runCommand( { split : "example.users" , middle : { _id : 5000 } } )
>> > I assume that every document with _id less than 5000 will go to shard 1
>> > and every document with _id greater than 5000 will go to shard 2.
>> > My question is:
>> > if I want to add another shard first I need to move chunks manually and
>> > then change split options or changing split options will cause automatic
>> > chunk migration?
>> > PS I don't want use balancer because it slows down while importing
>> > large data sets.