DDFS rebalance and over-replication?

slowe...@gmail.com

Feb 26, 2015, 12:50:29 PM
to disc...@googlegroups.com
Hi,

I'm seeing what looks like too many replicas in DDFS.  Here are the relevant Disco settings I've configured:

DDFS_TAG_MIN_REPLICAS = 4
DDFS_TAG_REPLICAS     = 4
DDFS_BLOB_REPLICAS    = 4
DDFS_SPACE_AWARE      = "true"
DDFS_ABSOLUTE_SPACE   = "true"

I didn't change NUM_EXTRA_REPLICAS, so it is still at its default of 1.  Based on this, I'd expect most of my blobs to have 4 or 5 replicas.  However, after many days and several GC runs, most blobs have 7 or 8 replicas.
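To make the arithmetic behind that expectation explicit, here is a minimal sketch. It assumes the upper bound is simply DDFS_BLOB_REPLICAS plus NUM_EXTRA_REPLICAS, which is my reading of the settings and not confirmed DDFS behavior. The helper name expected_replica_range is made up for illustration.

```python
# Values copied from the settings above; NUM_EXTRA_REPLICAS is the default.
DDFS_BLOB_REPLICAS = 4
NUM_EXTRA_REPLICAS = 1


def expected_replica_range(blob_replicas, extra_replicas):
    """Return (low, high) replica counts under the ASSUMED rule
    high = blob_replicas + extra_replicas (not verified against DDFS)."""
    return blob_replicas, blob_replicas + extra_replicas


low, high = expected_replica_range(DDFS_BLOB_REPLICAS, NUM_EXTRA_REPLICAS)

# Replica counts observed on most blobs after several GC runs.
observed = [7, 8]
over = [n for n in observed if n > high]

print(f"expected {low}-{high} replicas, observed counts above that: {over}")
```

Under that assumption, both observed counts sit above the expected ceiling.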

I do see that when a node has less space than the others, a lot of blobs get deleted from it.  That's fine, since I have DDFS_SPACE_AWARE on.

But when GC runs, I see a lot of log messages like:
@ddfs_gc_main:rereplicate_blob:1274 GC: rr for <<myblobname>> (with 7/0 replicas recorded/recovered) initiated
I don't understand why it wants to rereplicate a blob that already has enough replicas.  Making extra replicas seems like a waste of time and disk space.
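To get a feel for how common this is, a rough sketch like the following could tally those messages by their recorded/recovered counts. The regular expression is based only on the single log line quoted above, so the format is an assumption. The function name tally_rereplications is hypothetical and not part of Disco.

```python
import re
from collections import Counter

# Pattern derived from the one sample line in this post (assumed format).
RR_LINE = re.compile(
    r"GC: rr for (?P<blob>.+?) "
    r"\(with (?P<recorded>\d+)/(?P<recovered>\d+) "
    r"replicas recorded/recovered\) initiated"
)


def tally_rereplications(lines):
    """Count rereplication messages per (recorded, recovered) pair."""
    counts = Counter()
    for line in lines:
        match = RR_LINE.search(line)
        if match:
            key = (int(match["recorded"]), int(match["recovered"]))
            counts[key] += 1
    return counts


sample = [
    "@ddfs_gc_main:rereplicate_blob:1274 GC: rr for <<myblobname>> "
    "(with 7/0 replicas recorded/recovered) initiated",
    "some unrelated log line",
]

for (recorded, recovered), n in sorted(tally_rereplications(sample).items()):
    print(f"recorded={recorded} recovered={recovered}: {n} message(s)")
```

Grouping by both numbers might show whether rereplication lines up with a recovered count of 0, as in the sample line. What "recovered" means inside the GC is not something I can tell from the log alone.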

Is there some other setting or option I should be looking at in ddfs, or can you point me in a direction so I can figure out what I might be doing wrong?

Also, some nodes have less total disk space than others.  Is this a problem for rebalancing?

Thanks