Speeding up Rebalance


Daniel Leaberry

Feb 10, 2012, 2:40:28 PM
to mog...@googlegroups.com
I am end-of-lifing some of my older machines. On my dev system I
upgraded everything to 2.57 and set the old devices to drain. I then
set up a rebalance run with the following policy.

mogadm rebalance policy --options="from_hosts=3,4 to_hosts=8,9"

Hosts 3 and 4 each have 12 750GB drives; the 2 new hosts have about the
same.

I'm watching rebalance and it appears to only pick one device at a time
to drain/rebalance. This will take forever when I run it on the
production system with 288 750GB drives.
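
To put a rough number on "forever", here is a back-of-the-envelope sketch. The 50 MB/s per-device drain rate and the assumption that every drive is full are invented for illustration; they are not measurements from this cluster, and the function name is hypothetical.

```python
def serial_drain_days(num_devices, device_gb, mb_per_sec, fill_fraction=1.0):
    """Days needed to drain devices one after another at a fixed rate."""
    # Decimal units: 1 GB = 1000 MB.
    total_mb = num_devices * device_gb * 1000 * fill_fraction
    seconds = total_mb / mb_per_sec
    return seconds / 86400

# 288 drives of 750GB each, drained serially at an ASSUMED 50 MB/s.
print(serial_drain_days(288, 750, 50))  # 50.0
```

Under those assumptions a strictly one-device-at-a-time drain of all 288 drives would take on the order of weeks, which is why the per-device bottleneck matters.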

Is there any way to parallelize the source devices? I tried bumping up
the replicate threads but it made no difference. I have enough temporary
trackers for this move to push data really fast, and having my DB on SSDs
also helps.

Thanks,
Daniel

dormando

Feb 10, 2012, 2:45:58 PM
to mog...@googlegroups.com

There's no way yet :( When I rewrote rebalance for 2.40+ it got far faster
than it was before, but making it parallel is tricky to do safely, so I
punted for later.

Have you gone through all of the performance tuning guides on the wiki
related to FSCK (they're the same for rebalance, just with slightly different
names for some of the server settings)? There are a number of knobs to
adjust that speed things up to the point where the source drive is saturated
issuing DELETEs. At that point it can still take a while, but it will be
going much, much faster than with the default settings.

Daniel Leaberry

Feb 10, 2012, 2:58:07 PM
to mog...@googlegroups.com
On 02/10/2012 12:45 PM, dormando wrote:

>> Is there any way to parallelize the source devices? I tried bumping up the
>> replicate threads but it made no difference. I have enough temporary trackers
>> for this move to really move data fast, my DB being on ssd's also helps.
>
> There's no way yet :( When I rewrote rebalance for 2.40+ it got far faster
> than it was before, but making it parallel is tricky to do safely, so I
> punted for later.
>
> Have you gone through all of the performance tuning guides on the wiki
> related to FSCK (they're the same with rebal, just slightly different
> names for some of the server settings)? There're a number of twiddles to
> move that speed things up to the point where the source drive is saturated
> issuing DELETE's. At that point it can take a while still, but will be
> going much much faster than with the default settings.

Bummer. I'm a master at tuning FSCK. 18 months ago we needed another
copy of our data for another data center. With the DB on SSDs and 12
trackers dedicated to replicating files, I was pushing 20k fids/sec and
500MB/sec of data to the new machines. I've tuned rebalance, but once the
single source device is maxed out, that's it for speed improvements.
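
Those two figures imply a small average file size, and they give a feel for how quickly a 12-drive host could be copied if that aggregate rate were reachable again. A quick sketch, assuming decimal units and completely full drives (both assumptions, neither is stated in the thread):

```python
FIDS_PER_SEC = 20_000   # from the FSCK-driven copy described above
MB_PER_SEC = 500        # aggregate throughput from the same run

# Average file size implied by the two rates, in KB (1 MB = 1000 KB).
avg_file_kb = MB_PER_SEC * 1000 / FIDS_PER_SEC

def host_copy_hours(drives, drive_gb, mb_per_sec):
    """Hours to copy every byte on a host at a given aggregate rate."""
    return drives * drive_gb * 1000 / mb_per_sec / 3600

print(avg_file_kb)                            # 25.0
print(host_copy_hours(12, 750, MB_PER_SEC))   # 5.0
```

So a parallel approach running at the old 500MB/sec would clear a full 12 x 750GB host in hours, whereas a single saturated source device cannot get near that rate.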

So, what are my options for speeding things up? Can I just mark the
devices dead and then run an FSCK? I could probably do batches of four to
six 12-drive machines at a time without losing copies (we keep 3).
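
Whether a batch is safe depends on where the three copies of each file actually live. The sketch below shows the check in miniature; the `fid_hosts` mapping and the `fids_at_risk` helper are hypothetical sample data and naming, not output from or part of any MogileFS tool.

```python
def fids_at_risk(fid_hosts, retiring_hosts):
    """Return fids whose every copy sits on a host in the retiring batch."""
    retiring = set(retiring_hosts)
    return sorted(
        fid for fid, hosts in fid_hosts.items()
        if set(hosts) <= retiring  # all copies are inside the batch
    )

# HYPOTHETICAL placement: fid -> hosts holding a copy (3 copies each).
fid_hosts = {
    101: [3, 4, 8],
    102: [3, 4, 5],
    103: [5, 8, 9],
}

print(fids_at_risk(fid_hosts, [3, 4]))     # []
print(fids_at_risk(fid_hosts, [3, 4, 5]))  # [102]
```

If the three copies always land on distinct hosts, a batch of one or two hosts can never contain every copy of a file, while a batch of three or more can.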

I suppose I could also leave my production system on 2.37 and just use
the old drain method. I'm pretty sure that's the same as marking devices
dead and using FSCK though.

Daniel

dormando

Feb 10, 2012, 3:13:32 PM
to mog...@googlegroups.com
> Bummer. I'm a master at tuning FSCK. 18 months ago we needed another copy of
> our data for another data center. With the DB on ssd's and 12 trackers
> dedicated to replicating files I was pushing 20k fids/sec and 500MB/sec in
> data to the new machines. I've tuned rebalance but once the single dev is
> maxed out that's it for speed improvements.
>
> So, what are my options for speeding things up? Can I just mark the devices
> dead and then run an FSCK? I could probably do batches of 4-6 12 drive
> machines at a time without losing copies (we keep 3).

Marking a drive as dead quickly re-replicates all of its stuff. If you
intend to fully retire those machines though, it's fine to just mark them
as dead.

I would run an FSCK first. If FSCK thinks all your files are replicated
properly, mark all the devices on one host as dead, wait for mogstats to
show that the host has no rows left and that replication has caught up, then
mark the other host dead. I wouldn't do both at once though.
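
The sequence above can be sketched as a small planner that lists the steps one host at a time. It only builds strings and runs nothing. The host names and device IDs are placeholders, `retirement_plan` is a hypothetical helper, and the `mogadm` subcommand spellings used here (`fsck start`, `device mark <host> <devid> dead`) are assumptions that should be checked against `mogadm --help` for your version.

```python
def retirement_plan(hosts):
    """Build ordered steps; `hosts` maps host name -> list of device ids."""
    steps = [
        "mogadm fsck start",
        "# wait until FSCK finishes and thinks all files are replicated properly",
    ]
    for host, devids in hosts.items():
        # Mark every device on this host dead before touching the next host.
        for devid in devids:
            steps.append(f"mogadm device mark {host} {devid} dead")
        steps.append(
            f"# wait: mogstats shows no rows left for {host} "
            "and replication has caught up"
        )
    return steps

# PLACEHOLDER host names and device ids, for illustration only.
hosts = {"host3": [31, 32], "host4": [41, 42]}
for step in retirement_plan(hosts):
    print(step)
```

Keeping the wait step between hosts is the point of the design: the second host is not touched until the first one has been fully re-replicated.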

The reaper is pretty aggressive, so a few devices at once should be
difficult enough to keep up with :) Rebalance is only good if you want to
keep the drives anyway.

> I suppose I could also leave my production system on 2.37 and just use the old
> drain method. I'm pretty sure that's the same as marking devices dead and
> using FSCK though.

The old method may end up nuking files :/ 2.57 should be quicker at
replicating since it hits the DB less often anyway.

Daniel Leaberry

Feb 10, 2012, 3:17:57 PM
to mog...@googlegroups.com
On 02/10/2012 01:13 PM, dormando wrote:
>
> I would run a FSCK first; if FSCK thinks all your files are replicated
> properly, mark all the devices on one host as dead, wait for mogstats to
> show that the host has no rows left, and replication has caught up, then
> mark the other host dead. I wouldn't do both at once though.
>
> The reaper is pretty aggressive, so a few devices at once should be
> difficult enough to keep up with :) Rebalance is only good if you want to
> keep the drives anyway.
>

Sounds like a plan. I appreciate your help on this matter. MogileFS has
been the best filesystem we've ever used for web-based, 100% uptime storage.

Thanks,
Daniel
