I have a run in which a lot of refinement happens in a small region
of space, and it is running individual nodes out of memory even though
the problem should fit comfortably when spread over the whole
allocation. (At least, I believe this is the cause...)
Currently, InterpolateFieldValues and
CommunicationCollectParticles are both called _before_ the load
balancing step. I'm not as familiar with the particle routine, but
InterpolateFieldValues is where the baryon fields are allocated, and
in this run those are the only memory cost. So all the new child grids
get allocated on the parent's processor before any of them are moved.
The machinery seems to be there to allow for off-processor children.
Has anyone played with switching the order of the load balancing and
the actual interpolation? This would be a speed hit, but it would
lower my overall cost, since I'm presently only using 1/8 of the tasks
per node to get enough memory.
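
To make the memory argument concrete, here is a toy model in Python.
It is only a sketch and not code from the simulation itself: the
function names, the greedy balancer, and the grid sizes are all made
up for illustration, and it assumes every new child grid belongs to a
single parent processor.

```python
import heapq


def greedy_balance(grid_sizes, n_ranks):
    """Hypothetical balancer: give each grid to the least-loaded rank.

    Returns the total load per rank after all grids are assigned.
    """
    heap = [(0, rank) for rank in range(n_ranks)]
    loads = [0] * n_ranks
    # Placing the largest grids first keeps the final loads more even.
    for size in sorted(grid_sizes, reverse=True):
        load, rank = heapq.heappop(heap)
        loads[rank] = load + size
        heapq.heappush(heap, (loads[rank], rank))
    return loads


def peak_rank_memory(grid_sizes, n_ranks, balance_first):
    """Peak field memory on any one rank while the new grids are created.

    balance_first=False: fields are allocated on the parent's rank and
    moved afterwards, so that rank briefly holds every child grid.
    balance_first=True: grids are assigned to ranks first, and fields
    are allocated only at their destination.
    """
    if balance_first:
        return max(greedy_balance(grid_sizes, n_ranks))
    return sum(grid_sizes)


# Made-up example: 16 equal child grids (arbitrary units) on 8 ranks.
sizes = [64] * 16
print(peak_rank_memory(sizes, 8, balance_first=False))  # 1024
print(peak_rank_memory(sizes, 8, balance_first=True))   # 128
```

In this model the allocate-first ordering peaks at the sum of all the
new grids on one rank, while the balance-first ordering peaks at
roughly the average. It ignores the extra communication needed to
interpolate from an off-processor parent, which is where the speed hit
would come from.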
d.
--
Sent from my computer.