Hello,
I have a Disco cluster running in AWS, made up of 8 slave nodes and 1 master. I need to replace all 8 slave nodes with 8 new ones. My plan was to create the new nodes and add them to the cluster, then blacklist the old nodes one at a time, waiting for each to turn green in the web interface before blacklisting the next.
I got as far as blacklisting my first node (for both disco and ddfs), but replication seems to be taking forever to complete. I'm manually triggering garbage collection runs, each of which seems to take less than a minute, but I've done this at least a dozen times and I feel like I'm doing something wrong. I looked in the master logs to determine the replication status and found that each garbage collection run only seems to replicate 114 blobs. Each of the nodes has as many as 7k blobs, and I don't want to have to trigger 70 gc runs just to finish migrating a single node.
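For reference, here is the back-of-the-envelope arithmetic behind that estimate, as a small Python sketch. It assumes every gc run keeps replicating about 114 blobs, which is only what I observed in the logs and not a documented limit; `gc_runs_needed` is just a helper name made up for illustration.

```python
def gc_runs_needed(blobs_on_node: int, blobs_per_run: int) -> int:
    """Number of gc runs needed if each run replicates a fixed number of blobs."""
    if blobs_per_run <= 0:
        raise ValueError("blobs_per_run must be positive")
    # Integer ceiling division: a partially filled final run still counts.
    return -(-blobs_on_node // blobs_per_run)


# Figures from my cluster: up to ~7k blobs per node, 114 blobs per run (observed).
runs = gc_runs_needed(7000, 114)
print(runs)
```

With those numbers this comes out to 62 runs for one node, the same ballpark as my rough figure of 70.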
Is there any way to force replication to continue actively until I'm back at 3 replicas per blob?
Also, if I blacklist all 8 old nodes at once (in both ddfs and disco, or just in ddfs), will the master replicate data from the blacklisted nodes to the new nodes, or will replication fail because there are no non-blacklisted sources for the blobs?
-Daniel Thornton