iter_size is a way to effectively increase the batch size without requiring extra GPU memory. If you have an iter_size of 10, then the gradients are accumulated over 10 training iterations and the weights are updated only once, at the end. If you have an iter_size of 1, then the weights are updated with the gradient every training iteration.
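In case it helps, here is a minimal sketch of the accumulation loop in plain NumPy. The names (`sgd_with_iter_size`, `grad_fn`) are just placeholders I made up for illustration, not actual Caffe API; Caffe normalizes the accumulated gradient by iter_size, which the sketch mimics so that one update matches a single large-batch step:

```python
import numpy as np

def sgd_with_iter_size(w, data_batches, grad_fn, lr=0.01, iter_size=10):
    """Accumulate gradients over iter_size mini-batches, then update once.

    grad_fn(w, batch) is a hypothetical callable returning the gradient
    of the loss for one mini-batch at the current weights w.
    """
    accum = np.zeros_like(w)
    for i, batch in enumerate(data_batches, start=1):
        accum += grad_fn(w, batch)   # no weight update yet
        if i % iter_size == 0:
            # Divide by iter_size so the step size matches what one
            # big batch of (batch_size * iter_size) samples would give.
            w = w - lr * accum / iter_size
            accum = np.zeros_like(w)
    return w
```

Note that all iter_size mini-batches are evaluated at the same (stale) weights, which is exactly why the result approximates one large-batch step rather than iter_size small steps.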
The relationship between iter_size and the learning rate is not that straightforward. Increasing the effective batch size decreases the variance of the gradient estimate. If you decrease the learning rate instead of increasing the iter_size, you will be taking more, smaller steps, sometimes in the direction of the true gradient and sometimes away from it.
Say someone trains a model on a much more expensive GPU than yours, using a batch size of 1024 and an iter_size of 1. If your GPU only supports a batch size of 128, you can set iter_size to 8 (128 × 8 = 1024) and train their model with the same learning parameters (although this may affect batch normalization, if it is used, since the normalization statistics are computed per mini-batch of 128 rather than 1024).
Cheers,
Jonathan