import emcee
import numpy as np

pool = emcee.utils.MPIPool(debug=True)

def f(x):
    return np.empty(2500)

X = np.empty([100,50000])
Y = np.array(pool.map(f, [x for x in X]))
pool.close()

While communication of pickled streams does have a 2GB size limit, I do
not think you are reaching that point. I've tried to run your code with 16
processes on my workstation (8 cores, 24GB RAM) and it runs just fine
(using development mpi4py from bitbucket).
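
As a rough check of the claim above, here is a minimal sketch that measures how large the pickled messages for this example would be. It assumes each task is one row of X and each result is one return value of f, and it takes the "2GB" limit to mean 2**31 - 1 bytes; the variable names are only illustrative.

```python
import pickle
import numpy as np

# One task is a row of X; one result is what f returns (shapes from the example).
row = np.empty(50000)
result = np.empty(2500)

# Size of each object once pickled, which is how generic Python objects are sent.
task_bytes = len(pickle.dumps(row, protocol=pickle.HIGHEST_PROTOCOL))
result_bytes = len(pickle.dumps(result, protocol=pickle.HIGHEST_PROTOCOL))

n_tasks = 100                # number of rows in X
limit = 2**31 - 1            # assumption: "2GB" read as the signed 32-bit maximum

total_task_bytes = n_tasks * task_bytes
total_result_bytes = n_tasks * result_bytes

print("raw bytes per row:", row.nbytes)
print("pickled bytes per task:", task_bytes)
print("pickled bytes per result:", result_bytes)
print("all tasks together:", total_task_bytes, "of", limit)
print("all results together:", total_result_bytes, "of", limit)
```

Each row holds 50000 float64 values, so 400000 bytes of raw data, and all 100 tasks together come to roughly 40MB. That is far below the limit even if every task were in flight at once.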
Could you try with a smaller X array, let's say X = np.empty([100,500])?
Do you still get the error?
How much RAM do you have on the compute nodes? What mpi4py
version and backend MPI implementation are you using?
Lisandro, thank you so much for your effort! I really appreciate it. I'm far from my computer right now, but as soon as I have time I'll check the pool bug you mentioned.
I'll reply with more news.
Best,
Júlio.
Hi Lisandro,
Are you sure this is the source of the issue? Could you suggest a patch for it?
Anyways, incredible catch. I forgot MPI buffers the messages in such situations.
Júlio.
Uff, proposing a patch is probably much harder for me than implementing
this master/worker approach from scratch. Please find attached my own
version. I have not tested it, but I think it is right.
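
For readers without the attachment, here is a minimal sketch of what a master/worker scheme over mpi4py could look like. It is not the attached version. The names master, worker and main are hypothetical, and the sketch assumes rank 0 acts as the master and every other rank as a worker. The master hands each worker one task at a time and sends the next only after a result comes back, so at most one task per worker is in flight.

```python
import numpy as np

def f(x):
    # Same placeholder function as in the example at the top of the thread.
    return np.empty(2500)

def master(comm, tasks):
    """Rank 0: hand out tasks one at a time and collect the results in order."""
    nworkers = comm.Get_size() - 1
    results = [None] * len(tasks)
    next_task = 0
    active = 0
    # Seed every worker with one task, or stop it if there is nothing to do.
    for w in range(1, nworkers + 1):
        if next_task < len(tasks):
            comm.send((next_task, tasks[next_task]), dest=w)
            next_task += 1
            active += 1
        else:
            comm.send(None, dest=w)  # None is the stop sentinel
    # Each reply carries the worker rank, so no MPI.Status object is needed.
    while active > 0:
        w, idx, value = comm.recv()  # defaults: any source, any tag
        results[idx] = value
        if next_task < len(tasks):
            comm.send((next_task, tasks[next_task]), dest=w)
            next_task += 1
        else:
            comm.send(None, dest=w)
            active -= 1
    return results

def worker(comm, func):
    """Ranks > 0: process tasks from rank 0 until the stop sentinel arrives."""
    rank = comm.Get_rank()
    while True:
        msg = comm.recv(source=0)
        if msg is None:
            break
        idx, x = msg
        comm.send((rank, idx, func(x)), dest=0)

def main():
    from mpi4py import MPI  # imported here so the helpers above stay MPI-free
    comm = MPI.COMM_WORLD
    if comm.Get_size() < 2:
        raise RuntimeError("need at least 2 processes: one master, one worker")
    if comm.Get_rank() == 0:
        X = np.empty([100, 50000])
        Y = np.array(master(comm, list(X)))
        print(Y.shape)
    else:
        worker(comm, f)
```

To try it, call main() at the bottom of a script and launch it with something like mpiexec -n 4 python script.py. The lowercase send/recv methods pickle their arguments, so this sketch favours simplicity over speed for large arrays.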