Hi,
I'm not entirely sure I understood what you would like to do, but here are some thoughts:
In my opinion, the aggregators work quite well with the current model of the iterations. The barrier is already there at the end of the superstep, so I don't see a big overhead there.
If at some point we support asynchronous iterations, then we might want to reconsider.
I see how the "best effort" approach could be helpful in some cases. For example, when a termination criterion depends on a threshold, you might not want to wait for all the updates. However, deciding on an incomplete aggregate might also cost you one (or more) extra iteration(s).
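To make the trade-off concrete, here is a rough sketch of the extra-iteration cost. It is a toy simulation and not Flink code: I model "best effort" simply as the termination check seeing an aggregate that lags some supersteps behind, which is an assumption on my side. The function name, the `staleness` parameter and the delta values are all made up for illustration.

```python
def supersteps_until_stop(deltas, threshold, staleness=0):
    """Count the supersteps executed before the termination check fires.

    deltas[i] is the complete aggregated value after superstep i.
    staleness is how many supersteps the visible aggregate lags behind:
    0 models a blocking read at the barrier, >0 models a best-effort read
    (an assumed simplification, not how Flink behaves).
    """
    for step in range(len(deltas)):
        visible = step - staleness
        # The check can only use an aggregate that has already arrived.
        if visible >= 0 and deltas[visible] < threshold:
            return step + 1
    # Threshold never observed: all supersteps were executed.
    return len(deltas)


# Hypothetical per-superstep aggregates (e.g. total change in vertex values).
DELTAS = [8.0, 4.0, 2.0, 1.0, 0.5]

blocking_steps = supersteps_until_stop(DELTAS, threshold=1.5, staleness=0)
best_effort_steps = supersteps_until_stop(DELTAS, threshold=1.5, staleness=1)
```

With these made-up numbers, the blocking variant stops after 4 supersteps and the lagging variant after 5, so whether best effort pays off depends on how the time saved per superstep compares to the cost of the additional one.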
I believe that if you go for this approach, it would be nice to offer both choices, i.e., a "blocking" and a "non-blocking / best-effort" implementation, and let the user choose depending on the application requirements.
The same could apply to non-iterative programs and accumulators. If you want to retrieve the partial result of an accumulator before job completion, you could either have a "best-effort" call that immediately returns the current aggregated value, or provide a "blocking" call that introduces a barrier at that point.
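Something along these lines is what I have in mind for the two calls. This is only a minimal sketch with plain threads, not Flink's accumulator API: the class `SketchAccumulator`, its methods `get_best_effort` / `get_blocking`, and the demo function are hypothetical names I picked for illustration.

```python
import threading


class SketchAccumulator:
    """Hypothetical sum accumulator shared by several workers."""

    def __init__(self, num_workers):
        self._lock = threading.Lock()
        self._value = 0
        self._pending = num_workers
        self._all_done = threading.Event()
        if num_workers == 0:
            self._all_done.set()

    def add(self, x):
        with self._lock:
            self._value += x

    def worker_finished(self):
        # Each worker calls this once; the last one releases the barrier.
        with self._lock:
            self._pending -= 1
            if self._pending == 0:
                self._all_done.set()

    def get_best_effort(self):
        # Returns immediately with whatever has been aggregated so far.
        with self._lock:
            return self._value

    def get_blocking(self, timeout=None):
        # Acts as a barrier: waits until every worker has reported completion.
        if not self._all_done.wait(timeout):
            raise TimeoutError("workers did not finish in time")
        with self._lock:
            return self._value


def run_demo(num_workers=4, per_worker=1000):
    acc = SketchAccumulator(num_workers)
    start = threading.Event()

    def work():
        start.wait()  # hold the workers back until the early read is done
        for _ in range(per_worker):
            acc.add(1)
        acc.worker_finished()

    threads = [threading.Thread(target=work) for _ in range(num_workers)]
    for t in threads:
        t.start()
    early = acc.get_best_effort()  # partial value, no waiting
    start.set()
    final = acc.get_blocking(timeout=10)  # complete value, after the barrier
    for t in threads:
        t.join()
    return early, final
```

In the demo the workers are held back until the early read has happened, so the best-effort call sees nothing yet while the blocking call sees the full sum. In a real job the best-effort value could be anything in between, which is exactly the part the user would have to accept when choosing that variant.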
What do you think?
Cheers,
V.