On a 10-node cluster of 9T nodes, the smartfail took me about a week per node, but I ran it at extremely low priority and paused it a few times during the day.
My users were extremely sensitive to performance impacts, so they did not want the smartfail run at high priority and wanted it paused during certain intervals where there was concern.
I think the fastest I could do was 12-24 hours a node at the time.
What we did was add 14 new nodes to the cluster and then let autobalance run. This took quite a while because we ran it at low priority and didn't care how long it took.
That saved a lot of data movement during smartfail, because the large number of new nodes brought the total amount of data on each node much lower.
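Rough math on why the extra nodes help, as a sketch only. It assumes the data ends up spread evenly across all nodes after autobalance, which is a simplification (nodes of different sizes will not hold identical amounts, and the share creeps back up as old nodes are removed). The 60 TB total is a made-up figure for illustration, not a number from this cluster.

```python
def per_node_share(total_data_tb: float, node_count: int) -> float:
    """Data held by each node, assuming a perfectly even balance."""
    return total_data_tb / node_count


# Hypothetical total amount of data; not a figure from the real cluster.
total_tb = 60.0

before = per_node_share(total_tb, 10)       # original 10 nodes
after = per_node_share(total_tb, 10 + 14)   # after adding 14 new nodes

# Fraction of data the first smartfail no longer has to move off a node.
reduction = 1 - after / before

print(f"before: {before:.1f} TB/node, after: {after:.1f} TB/node")
print(f"first smartfail moves about {reduction:.0%} less data")
```

The reduction does not depend on the total you plug in: going from 10 to 24 nodes cuts the per-node share to 10/24 of what it was.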
At the time we also modified a few things in the XML config files to prevent the new nodes from "bullying" the old nodes. I'm not sure if this is still the recommended practice, but the SEs usually have pretty good procedures around this.